http://git-wip-us.apache.org/repos/asf/incubator-joshua-site/blob/53cc3005/6.0/quick-start.html
----------------------------------------------------------------------
diff --git a/6.0/quick-start.html b/6.0/quick-start.html
new file mode 100644
index 0000000..d1b9d51
--- /dev/null
+++ b/6.0/quick-start.html
@@ -0,0 +1,251 @@
+<!DOCTYPE html>
+<html lang="en">
+  <head>
+    <meta charset="utf-8">
+    <meta http-equiv="X-UA-Compatible" content="IE=edge">
+    <meta name="viewport" content="width=device-width, initial-scale=1">
+    <meta name="description" content="">
+    <meta name="author" content="">
+    <link rel="icon" href="../../favicon.ico">
+
+    <title>Joshua Documentation | Quick Start</title>
+
+    <!-- Bootstrap core CSS -->
+    <link href="/dist/css/bootstrap.min.css" rel="stylesheet">
+
+    <!-- Custom styles for this template -->
+    <link href="/joshua6.css" rel="stylesheet">
+  </head>
+
+  <body>
+
+    <div class="blog-masthead">
+      <div class="container">
+        <nav class="blog-nav">
+          <!-- <a class="blog-nav-item active" href="#">Joshua</a> -->
+          <a class="blog-nav-item" href="/">Joshua</a>
+          <!-- <a class="blog-nav-item" href="/6.0/whats-new.html">New 
features</a> -->
+          <a class="blog-nav-item" href="/language-packs/">Language packs</a>
+          <a class="blog-nav-item" href="/data/">Datasets</a>
+          <a class="blog-nav-item" href="/support/">Support</a>
+          <a class="blog-nav-item" href="/contributors.html">Contributors</a>
+        </nav>
+      </div>
+    </div>
+
+    <div class="container">
+
+      <div class="row">
+
+        <div class="col-sm-2">
+          <div class="sidebar-module">
+            <!-- <h4>About</h4> -->
+            <center>
+            <img src="/images/joshua-logo-small.png" />
+            <p>Joshua machine translation toolkit</p>
+            </center>
+          </div>
+          <hr>
+          <center>
+            <a href="/releases/current/" target="_blank"><button 
class="button">Download Joshua 6.0.5</button></a>
+            <br />
+            <a href="/releases/runtime/" target="_blank"><button 
class="button">Runtime only version</button></a>
+            <p>Released November 5, 2015</p>
+          </center>
+          <hr>
+          <!-- <div class="sidebar-module"> -->
+          <!--   <span id="download"> -->
+          <!--     <a href="http://joshua-decoder.org/downloads/joshua-6.0.tgz">Download</a> -->
+          <!--   </span> -->
+          <!-- </div> -->
+          <div class="sidebar-module">
+            <h4>Using Joshua</h4>
+            <ol class="list-unstyled">
+              <li><a href="/6.0/install.html">Installation</a></li>
+              <li><a href="/6.0/quick-start.html">Quick Start</a></li>
+            </ol>
+          </div>
+          <hr>
+          <div class="sidebar-module">
+            <h4>Building new models</h4>
+            <ol class="list-unstyled">
+              <li><a href="/6.0/pipeline.html">Pipeline</a></li>
+              <li><a href="/6.0/tutorial.html">Tutorial</a></li>
+              <li><a href="/6.0/faq.html">FAQ</a></li>
+            </ol>
+          </div>
+<!--
+          <div class="sidebar-module">
+            <h4>Phrase-based</h4>
+            <ol class="list-unstyled">
+              <li><a href="/6.0/phrase.html">Training</a></li>
+            </ol>
+          </div>
+-->
+          <hr>
+          <div class="sidebar-module">
+            <h4>Advanced</h4>
+            <ol class="list-unstyled">
+              <li><a href="/6.0/bundle.html">Building language packs</a></li>
+              <li><a href="/6.0/decoder.html">Decoder options</a></li>
+              <li><a href="/6.0/file-formats.html">File formats</a></li>
+              <li><a href="/6.0/packing.html">Packing TMs</a></li>
+              <li><a href="/6.0/large-lms.html">Building large LMs</a></li>
+            </ol>
+          </div>
+
+          <hr> 
+          <div class="sidebar-module">
+            <h4>Developer</h4>
+            <ol class="list-unstyled">              
+               <li><a href="https://github.com/joshua-decoder/joshua">Github</a></li>
+               <li><a href="http://cs.jhu.edu/~post/joshua-docs">Javadoc</a></li>
+               <li><a href="https://groups.google.com/forum/?fromgroups#!forum/joshua_developers">Mailing list</a></li>
+            </ol>
+          </div>
+
+        </div><!-- /.blog-sidebar -->
+
+        
+        <div class="col-sm-8 blog-main">
+        
+
+          <div class="blog-title">
+            <h2>Quick Start</h2>
+          </div>
+          
+          <div class="blog-post">
+
+            <p>If you just want to use Joshua to translate data, the quickest 
way is
+to download a <a href="/language-packs/">pre-built model</a>. </p>
+
+<p>If no language pack is available, or if you have your own parallel
+data that you want to train the translation engine on, then you have
+to build your own model. This takes a bit more knowledge and effort,
+but is made easier with Joshua’s <a href="pipeline.html">pipeline script</a>,
+which runs all the steps of preparing data, aligning it, and
+extracting and tuning component models. </p>
+
+<p>Detailed information about running the pipeline can be found in
+<a href="/6.0/pipeline.html">the pipeline documentation</a>, but as a quick
+start, you can build a simple Bengali–English model by following
+these instructions.</p>
+
+<p><em>NOTE: We suggest you build models outside the <code 
class="highlighter-rouge">$JOSHUA</code> directory</em>.</p>
+
+<p>First, download the dataset:</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>mkdir -p ~/models/bn-en/
+cd ~/models/bn-en
+wget -q -O indian-parallel-corpora-1.0.tar.gz https://github.com/joshua-decoder/indian-parallel-corpora/archive/1.0.tar.gz
+tar xzf indian-parallel-corpora-1.0.tar.gz
+ln -s indian-parallel-corpora-1.0 input
+</code></pre>
+</div>
+
+<p>Then, train and test a model:</p>
+
+<div class="highlighter-rouge"><pre 
class="highlight"><code>$JOSHUA/bin/pipeline.pl --source bn --target en \
+    --type hiero \
+    --no-prepare --aligner berkeley \
+    --corpus input/bn-en/tok/training.bn-en \
+    --tune input/bn-en/tok/dev.bn-en \
+    --test input/bn-en/tok/devtest.bn-en
+</code></pre>
+</div>
+
+<p>This will align the data with the Berkeley aligner, build a Hiero
+model, tune with MERT, decode the test set, and report results that
+should correspond with what you find on
+<a href="/indian-parallel-corpora/">the Indian Parallel Corpora page</a>. For
+more details, including information on the many options available with
+the pipeline script, please see <a href="pipeline.html">its documentation 
page</a>.</p>
+
+<p>Finally, you can export the full model as a language pack:</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>./run-bundler.py \
+  tune/joshua.config.final \
+  language-pack-bn-en \
+  --pack-tm grammar.gz
+</code></pre>
+</div>
+
+<p>(or possibly <code 
class="highlighter-rouge">tune/1/joshua.config.final</code> if you’re using 
an older version of
+the pipeline).</p>
+
+<p>This will create a <a href="bundle.html">runnable model</a> in
+<code class="highlighter-rouge">language-pack-bn-en</code>. See the <code 
class="highlighter-rouge">README</code> file in that directory for
+information on how to run the decoder.</p>
+
+
+
+        </div>
+
+      </div><!-- /.row -->
+
+      
+        
+    </div><!-- /.container -->
+
+    <!-- Bootstrap core JavaScript
+    ================================================== -->
+    <!-- Placed at the end of the document so the pages load faster -->
+    <script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js"></script>
+    <script src="../../dist/js/bootstrap.min.js"></script>
+    <!-- <script src="../../assets/js/docs.min.js"></script> -->
+    <!-- IE10 viewport hack for Surface/desktop Windows 8 bug -->
+    <!-- <script 
src="../../assets/js/ie10-viewport-bug-workaround.js"></script>
+    -->
+
+    <!-- Start of StatCounter Code for Default Guide -->
+    <script type="text/javascript">
+      var sc_project=8264132; 
+      var sc_invisible=1; 
+      var sc_security="4b97fe2d"; 
+    </script>
+    <script type="text/javascript" src="http://www.statcounter.com/counter/counter.js"></script>
+    <noscript>
+      <div class="statcounter">
+        <a title="hit counter joomla" 
+           href="http://statcounter.com/joomla/"
+           target="_blank">
+          <img class="statcounter"
+               src="http://c.statcounter.com/8264132/0/4b97fe2d/1/"
+               alt="hit counter joomla" />
+        </a>
+      </div>
+    </noscript>
+    <!-- End of StatCounter Code for Default Guide -->
+  </body>
+</html>
+

http://git-wip-us.apache.org/repos/asf/incubator-joshua-site/blob/53cc3005/6.0/quick-start.md
----------------------------------------------------------------------
diff --git a/6.0/quick-start.md b/6.0/quick-start.md
deleted file mode 100644
index 53814ae..0000000
--- a/6.0/quick-start.md
+++ /dev/null
@@ -1,59 +0,0 @@
----
-layout: default6
-title: Quick Start
----
-
-If you just want to use Joshua to translate data, the quickest way is
-to download a [pre-built model](/language-packs/). 
-
-If no language pack is available, or if you have your own parallel
-data that you want to train the translation engine on, then you have
-to build your own model. This takes a bit more knowledge and effort,
-but is made easier with Joshua's [pipeline script](pipeline.html),
-which runs all the steps of preparing data, aligning it, and
-extracting and tuning component models. 
-
-Detailed information about running the pipeline can be found in
-[the pipeline documentation](/6.0/pipeline.html), but as a quick
-start, you can build a simple Bengali--English model by following
-these instructions.
-
-*NOTE: We suggest you build models outside the `$JOSHUA` directory*.
-
-First, download the dataset:
-   
-    mkdir -p ~/models/bn-en/
-    cd ~/models/bn-en
-    wget -q -O indian-parallel-corpora-1.0.tar.gz https://github.com/joshua-decoder/indian-parallel-corpora/archive/1.0.tar.gz
-    tar xzf indian-parallel-corpora-1.0.tar.gz
-    ln -s indian-parallel-corpora-1.0 input
-
-Then, train and test a model:
-
-    $JOSHUA/bin/pipeline.pl --source bn --target en \
-        --type hiero \
-        --no-prepare --aligner berkeley \
-        --corpus input/bn-en/tok/training.bn-en \
-        --tune input/bn-en/tok/dev.bn-en \
-        --test input/bn-en/tok/devtest.bn-en
-
-This will align the data with the Berkeley aligner, build a Hiero
-model, tune with MERT, decode the test set, and report results that
-should correspond with what you find on
-[the Indian Parallel Corpora page](/indian-parallel-corpora/). For
-more details, including information on the many options available with
-the pipeline script, please see [its documentation page](pipeline.html).
-
-Finally, you can export the full model as a language pack:
-
-    ./run-bundler.py \
-      tune/joshua.config.final \
-      language-pack-bn-en \
-      --pack-tm grammar.gz
-      
-(or possibly `tune/1/joshua.config.final` if you're using an older version of
-the pipeline).
-
-This will create a [runnable model](bundle.html) in
-`language-pack-bn-en`. See the `README` file in that directory for
-information on how to run the decoder.

http://git-wip-us.apache.org/repos/asf/incubator-joshua-site/blob/53cc3005/6.0/server.html
----------------------------------------------------------------------
diff --git a/6.0/server.html b/6.0/server.html
new file mode 100644
index 0000000..07df127
--- /dev/null
+++ b/6.0/server.html
@@ -0,0 +1,218 @@
+<!DOCTYPE html>
+<html lang="en">
+  <head>
+    <meta charset="utf-8">
+    <meta http-equiv="X-UA-Compatible" content="IE=edge">
+    <meta name="viewport" content="width=device-width, initial-scale=1">
+    <meta name="description" content="">
+    <meta name="author" content="">
+    <link rel="icon" href="../../favicon.ico">
+
+    <title>Joshua Documentation | Server mode</title>
+
+    <!-- Bootstrap core CSS -->
+    <link href="/dist/css/bootstrap.min.css" rel="stylesheet">
+
+    <!-- Custom styles for this template -->
+    <link href="/joshua6.css" rel="stylesheet">
+  </head>
+
+  <body>
+
+    <div class="blog-masthead">
+      <div class="container">
+        <nav class="blog-nav">
+          <!-- <a class="blog-nav-item active" href="#">Joshua</a> -->
+          <a class="blog-nav-item" href="/">Joshua</a>
+          <!-- <a class="blog-nav-item" href="/6.0/whats-new.html">New 
features</a> -->
+          <a class="blog-nav-item" href="/language-packs/">Language packs</a>
+          <a class="blog-nav-item" href="/data/">Datasets</a>
+          <a class="blog-nav-item" href="/support/">Support</a>
+          <a class="blog-nav-item" href="/contributors.html">Contributors</a>
+        </nav>
+      </div>
+    </div>
+
+    <div class="container">
+
+      <div class="row">
+
+        <div class="col-sm-2">
+          <div class="sidebar-module">
+            <!-- <h4>About</h4> -->
+            <center>
+            <img src="/images/joshua-logo-small.png" />
+            <p>Joshua machine translation toolkit</p>
+            </center>
+          </div>
+          <hr>
+          <center>
+            <a href="/releases/current/" target="_blank"><button 
class="button">Download Joshua 6.0.5</button></a>
+            <br />
+            <a href="/releases/runtime/" target="_blank"><button 
class="button">Runtime only version</button></a>
+            <p>Released November 5, 2015</p>
+          </center>
+          <hr>
+          <!-- <div class="sidebar-module"> -->
+          <!--   <span id="download"> -->
+          <!--     <a href="http://joshua-decoder.org/downloads/joshua-6.0.tgz">Download</a> -->
+          <!--   </span> -->
+          <!-- </div> -->
+          <div class="sidebar-module">
+            <h4>Using Joshua</h4>
+            <ol class="list-unstyled">
+              <li><a href="/6.0/install.html">Installation</a></li>
+              <li><a href="/6.0/quick-start.html">Quick Start</a></li>
+            </ol>
+          </div>
+          <hr>
+          <div class="sidebar-module">
+            <h4>Building new models</h4>
+            <ol class="list-unstyled">
+              <li><a href="/6.0/pipeline.html">Pipeline</a></li>
+              <li><a href="/6.0/tutorial.html">Tutorial</a></li>
+              <li><a href="/6.0/faq.html">FAQ</a></li>
+            </ol>
+          </div>
+<!--
+          <div class="sidebar-module">
+            <h4>Phrase-based</h4>
+            <ol class="list-unstyled">
+              <li><a href="/6.0/phrase.html">Training</a></li>
+            </ol>
+          </div>
+-->
+          <hr>
+          <div class="sidebar-module">
+            <h4>Advanced</h4>
+            <ol class="list-unstyled">
+              <li><a href="/6.0/bundle.html">Building language packs</a></li>
+              <li><a href="/6.0/decoder.html">Decoder options</a></li>
+              <li><a href="/6.0/file-formats.html">File formats</a></li>
+              <li><a href="/6.0/packing.html">Packing TMs</a></li>
+              <li><a href="/6.0/large-lms.html">Building large LMs</a></li>
+            </ol>
+          </div>
+
+          <hr> 
+          <div class="sidebar-module">
+            <h4>Developer</h4>
+            <ol class="list-unstyled">              
+               <li><a href="https://github.com/joshua-decoder/joshua">Github</a></li>
+               <li><a href="http://cs.jhu.edu/~post/joshua-docs">Javadoc</a></li>
+               <li><a href="https://groups.google.com/forum/?fromgroups#!forum/joshua_developers">Mailing list</a></li>
+            </ol>
+          </div>
+
+        </div><!-- /.blog-sidebar -->
+
+        
+        <div class="col-sm-8 blog-main">
+        
+
+          <div class="blog-title">
+            <h2>Server mode</h2>
+          </div>
+          
+          <div class="blog-post">
+
+            <p>The Joshua decoder can be run as a TCP/IP server instead of a 
POSIX-style command-line tool. Clients can concurrently connect to a socket and 
receive a set of newline-separated outputs for a set of newline-separated 
inputs.</p>
+
+<p>Threading takes place both within and across requests.  Threads from the decoder pool are assigned in a round-robin manner across requests, preventing starvation.</p>
+
+<h1 id="invoking-the-server">Invoking the server</h1>
+
+<p>The server is configured at invocation time. To start in server mode, run <code class="highlighter-rouge">joshua-decoder</code> with the option <code class="highlighter-rouge">-server-port [PORT]</code>. All other decoder options can be set in the same way as when running the command-line tool.</p>
+
+<p>E.g.,</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>$JOSHUA/bin/joshua-decoder -server-port 10101 -mark-oovs false -output-format "%s" -threads 10
+</code></pre>
+</div>
+
+<h2 id="using-the-server">Using the server</h2>
+
+<p>The server, as configured in the example above, listens for requests on port 10101. To check that it is working, send it a few lines of input from the command line, for example with the <code class="highlighter-rouge">nc</code> utility:</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>wget -qO - http://cs.jhu.edu/~post/files/pg1023.txt | head -132 | tail -11 | nc localhost 10101
+</code></pre>
+</div>
+
+<p>Because no model was loaded, the server simply returns the text exactly as it was sent.</p>
+
+<p>The <code class="highlighter-rouge">-server-port</code> option can also be 
used when creating a <a href="bundle.html">bundled configuration</a> that will 
be run in server mode.</p>
+
+
+
+        </div>
+
+      </div><!-- /.row -->
+
+      
+        
+    </div><!-- /.container -->
+
+    <!-- Bootstrap core JavaScript
+    ================================================== -->
+    <!-- Placed at the end of the document so the pages load faster -->
+    <script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js"></script>
+    <script src="../../dist/js/bootstrap.min.js"></script>
+    <!-- <script src="../../assets/js/docs.min.js"></script> -->
+    <!-- IE10 viewport hack for Surface/desktop Windows 8 bug -->
+    <!-- <script 
src="../../assets/js/ie10-viewport-bug-workaround.js"></script>
+    -->
+
+    <!-- Start of StatCounter Code for Default Guide -->
+    <script type="text/javascript">
+      var sc_project=8264132; 
+      var sc_invisible=1; 
+      var sc_security="4b97fe2d"; 
+    </script>
+    <script type="text/javascript" src="http://www.statcounter.com/counter/counter.js"></script>
+    <noscript>
+      <div class="statcounter">
+        <a title="hit counter joomla" 
+           href="http://statcounter.com/joomla/"
+           target="_blank">
+          <img class="statcounter"
+               src="http://c.statcounter.com/8264132/0/4b97fe2d/1/"
+               alt="hit counter joomla" />
+        </a>
+      </div>
+    </noscript>
+    <!-- End of StatCounter Code for Default Guide -->
+  </body>
+</html>
+
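
A note on the server mode page above: any program that can open a TCP socket can act as a client, not only `nc`. The following is a minimal Python sketch, not part of Joshua, resting on two assumptions drawn from the page's description: the server reads newline-separated input and writes newline-separated output, and it closes the connection once the client signals end of input. The function name `translate_lines` and its parameters are hypothetical; port 10101 is taken from the example invocation.

```python
import socket


def translate_lines(lines, host="localhost", port=10101, timeout=30.0):
    """Send newline-separated inputs to a decoder socket; return the output lines.

    Hypothetical helper: assumes the server answers with newline-separated
    text and closes the connection after the client ends its input.
    """
    # Normalise every input to exactly one trailing newline.
    payload = "".join(line.rstrip("\n") + "\n" for line in lines).encode("utf-8")
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)
        # Half-close the connection so the server sees end-of-input
        # while we can still read its reply.
        sock.shutdown(socket.SHUT_WR)
        while True:
            data = sock.recv(4096)
            if not data:  # server closed its side: reply is complete
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8").splitlines()
```

With a decoder started as in the example invocation, `translate_lines(["hello world"])` would return the server's output lines; with no model loaded, the page says the input text comes back unchanged. If the server keeps the connection open instead of closing it, the call ends with a socket timeout rather than a result.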

http://git-wip-us.apache.org/repos/asf/incubator-joshua-site/blob/53cc3005/6.0/server.md
----------------------------------------------------------------------
diff --git a/6.0/server.md b/6.0/server.md
deleted file mode 100644
index f3d8da5..0000000
--- a/6.0/server.md
+++ /dev/null
@@ -1,30 +0,0 @@
----
-layout: default6
-category: links
-title: Server mode
----
-
-The Joshua decoder can be run as a TCP/IP server instead of a POSIX-style 
command-line tool. Clients can concurrently connect to a socket and receive a 
set of newline-separated outputs for a set of newline-separated inputs.
-
-Threading takes place both within and across requests.  Threads from the 
decoder pool are assigned in round-robin manner across requests, preventing 
starvation.
-
-
-# Invoking the server
-
-The server is configured at invocation time. To start in server mode, run `joshua-decoder` with the option `-server-port [PORT]`. All other decoder options can be set in the same way as when running the command-line tool.
-
-E.g.,
-
-    $JOSHUA/bin/joshua-decoder -server-port 10101 -mark-oovs false -output-format "%s" -threads 10
-
-## Using the server
-
-To test that the server is working, a set of inputs can be sent to the server 
from the command line. 
-
-The server, as configured in the example above, will then respond to requests 
on port 10101.  You can test it out with the `nc` utility:
-
-    wget -qO - http://cs.jhu.edu/~post/files/pg1023.txt | head -132 | tail -11 | nc localhost 10101
-
-Since no model was loaded, this will just return the text to you as sent to 
the server.
-
-The `-server-port` option can also be used when creating a [bundled 
configuration](bundle.html) that will be run in server mode.

http://git-wip-us.apache.org/repos/asf/incubator-joshua-site/blob/53cc3005/6.0/thrax.html
----------------------------------------------------------------------
diff --git a/6.0/thrax.html b/6.0/thrax.html
new file mode 100644
index 0000000..dd5e841
--- /dev/null
+++ b/6.0/thrax.html
@@ -0,0 +1,199 @@
+<!DOCTYPE html>
+<html lang="en">
+  <head>
+    <meta charset="utf-8">
+    <meta http-equiv="X-UA-Compatible" content="IE=edge">
+    <meta name="viewport" content="width=device-width, initial-scale=1">
+    <meta name="description" content="">
+    <meta name="author" content="">
+    <link rel="icon" href="../../favicon.ico">
+
+    <title>Joshua Documentation | Grammar extraction with Thrax</title>
+
+    <!-- Bootstrap core CSS -->
+    <link href="/dist/css/bootstrap.min.css" rel="stylesheet">
+
+    <!-- Custom styles for this template -->
+    <link href="/joshua6.css" rel="stylesheet">
+  </head>
+
+  <body>
+
+    <div class="blog-masthead">
+      <div class="container">
+        <nav class="blog-nav">
+          <!-- <a class="blog-nav-item active" href="#">Joshua</a> -->
+          <a class="blog-nav-item" href="/">Joshua</a>
+          <!-- <a class="blog-nav-item" href="/6.0/whats-new.html">New 
features</a> -->
+          <a class="blog-nav-item" href="/language-packs/">Language packs</a>
+          <a class="blog-nav-item" href="/data/">Datasets</a>
+          <a class="blog-nav-item" href="/support/">Support</a>
+          <a class="blog-nav-item" href="/contributors.html">Contributors</a>
+        </nav>
+      </div>
+    </div>
+
+    <div class="container">
+
+      <div class="row">
+
+        <div class="col-sm-2">
+          <div class="sidebar-module">
+            <!-- <h4>About</h4> -->
+            <center>
+            <img src="/images/joshua-logo-small.png" />
+            <p>Joshua machine translation toolkit</p>
+            </center>
+          </div>
+          <hr>
+          <center>
+            <a href="/releases/current/" target="_blank"><button 
class="button">Download Joshua 6.0.5</button></a>
+            <br />
+            <a href="/releases/runtime/" target="_blank"><button 
class="button">Runtime only version</button></a>
+            <p>Released November 5, 2015</p>
+          </center>
+          <hr>
+          <!-- <div class="sidebar-module"> -->
+          <!--   <span id="download"> -->
+          <!--     <a href="http://joshua-decoder.org/downloads/joshua-6.0.tgz">Download</a> -->
+          <!--   </span> -->
+          <!-- </div> -->
+          <div class="sidebar-module">
+            <h4>Using Joshua</h4>
+            <ol class="list-unstyled">
+              <li><a href="/6.0/install.html">Installation</a></li>
+              <li><a href="/6.0/quick-start.html">Quick Start</a></li>
+            </ol>
+          </div>
+          <hr>
+          <div class="sidebar-module">
+            <h4>Building new models</h4>
+            <ol class="list-unstyled">
+              <li><a href="/6.0/pipeline.html">Pipeline</a></li>
+              <li><a href="/6.0/tutorial.html">Tutorial</a></li>
+              <li><a href="/6.0/faq.html">FAQ</a></li>
+            </ol>
+          </div>
+<!--
+          <div class="sidebar-module">
+            <h4>Phrase-based</h4>
+            <ol class="list-unstyled">
+              <li><a href="/6.0/phrase.html">Training</a></li>
+            </ol>
+          </div>
+-->
+          <hr>
+          <div class="sidebar-module">
+            <h4>Advanced</h4>
+            <ol class="list-unstyled">
+              <li><a href="/6.0/bundle.html">Building language packs</a></li>
+              <li><a href="/6.0/decoder.html">Decoder options</a></li>
+              <li><a href="/6.0/file-formats.html">File formats</a></li>
+              <li><a href="/6.0/packing.html">Packing TMs</a></li>
+              <li><a href="/6.0/large-lms.html">Building large LMs</a></li>
+            </ol>
+          </div>
+
+          <hr> 
+          <div class="sidebar-module">
+            <h4>Developer</h4>
+            <ol class="list-unstyled">              
+               <li><a href="https://github.com/joshua-decoder/joshua">Github</a></li>
+               <li><a href="http://cs.jhu.edu/~post/joshua-docs">Javadoc</a></li>
+               <li><a href="https://groups.google.com/forum/?fromgroups#!forum/joshua_developers">Mailing list</a></li>
+            </ol>
+          </div>
+
+        </div><!-- /.blog-sidebar -->
+
+        
+        <div class="col-sm-8 blog-main">
+        
+
+          <div class="blog-title">
+            <h2>Grammar extraction with Thrax</h2>
+          </div>
+          
+          <div class="blog-post">
+
+            <p>One day, this will hold Thrax documentation, including how to 
use Thrax, how to do grammar
+filtering, and details on the configuration file options.  It will also 
include details about our
+experience setting up and maintaining Hadoop cluster installations, knowledge 
wrought of hard-fought
+sweat and tears.</p>
+
+<p>In the meantime, please bother <a href="http://cs.jhu.edu/~jonny/">Jonny Weese</a> if there is something you
+need to do that you don’t understand.  You might also be able to dig up some information <a href="http://cs.jhu.edu/~jonny/thrax/">on the old
+Thrax page</a>.</p>
+
+
+
+        </div>
+
+      </div><!-- /.row -->
+
+      
+        
+    </div><!-- /.container -->
+
+    <!-- Bootstrap core JavaScript
+    ================================================== -->
+    <!-- Placed at the end of the document so the pages load faster -->
+    <script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js"></script>
+    <script src="../../dist/js/bootstrap.min.js"></script>
+    <!-- <script src="../../assets/js/docs.min.js"></script> -->
+    <!-- IE10 viewport hack for Surface/desktop Windows 8 bug -->
+    <!-- <script 
src="../../assets/js/ie10-viewport-bug-workaround.js"></script>
+    -->
+
+    <!-- Start of StatCounter Code for Default Guide -->
+    <script type="text/javascript">
+      var sc_project=8264132; 
+      var sc_invisible=1; 
+      var sc_security="4b97fe2d"; 
+    </script>
+    <script type="text/javascript" src="http://www.statcounter.com/counter/counter.js"></script>
+    <noscript>
+      <div class="statcounter">
+        <a title="hit counter joomla" 
+           href="http://statcounter.com/joomla/"
+           target="_blank">
+          <img class="statcounter"
+               src="http://c.statcounter.com/8264132/0/4b97fe2d/1/"
+               alt="hit counter joomla" />
+        </a>
+      </div>
+    </noscript>
+    <!-- End of StatCounter Code for Default Guide -->
+  </body>
+</html>
+

http://git-wip-us.apache.org/repos/asf/incubator-joshua-site/blob/53cc3005/6.0/thrax.md
----------------------------------------------------------------------
diff --git a/6.0/thrax.md b/6.0/thrax.md
deleted file mode 100644
index dbcc71c..0000000
--- a/6.0/thrax.md
+++ /dev/null
@@ -1,14 +0,0 @@
----
-layout: default6
-category: advanced
-title: Grammar extraction with Thrax
----
-
-One day, this will hold Thrax documentation, including how to use Thrax, how 
to do grammar
-filtering, and details on the configuration file options.  It will also 
include details about our
-experience setting up and maintaining Hadoop cluster installations, knowledge 
wrought of hard-fought
-sweat and tears.
-
-In the meantime, please bother [Jonny Weese](http://cs.jhu.edu/~jonny/) if 
there is something you
-need to do that you don't understand.  You might also be able to dig up some 
information [on the old
-Thrax page](http://cs.jhu.edu/~jonny/thrax/).

http://git-wip-us.apache.org/repos/asf/incubator-joshua-site/blob/53cc3005/6.0/tms.html
----------------------------------------------------------------------
diff --git a/6.0/tms.html b/6.0/tms.html
new file mode 100644
index 0000000..f77fb26
--- /dev/null
+++ b/6.0/tms.html
@@ -0,0 +1,312 @@
+<!DOCTYPE html>
+<html lang="en">
+  <head>
+    <meta charset="utf-8">
+    <meta http-equiv="X-UA-Compatible" content="IE=edge">
+    <meta name="viewport" content="width=device-width, initial-scale=1">
+    <meta name="description" content="">
+    <meta name="author" content="">
+    <link rel="icon" href="../../favicon.ico">
+
+    <title>Joshua Documentation | Building Translation Models</title>
+
+    <!-- Bootstrap core CSS -->
+    <link href="/dist/css/bootstrap.min.css" rel="stylesheet">
+
+    <!-- Custom styles for this template -->
+    <link href="/joshua6.css" rel="stylesheet">
+  </head>
+
+  <body>
+
+    <div class="blog-masthead">
+      <div class="container">
+        <nav class="blog-nav">
+          <!-- <a class="blog-nav-item active" href="#">Joshua</a> -->
+          <a class="blog-nav-item" href="/">Joshua</a>
+          <!-- <a class="blog-nav-item" href="/6.0/whats-new.html">New 
features</a> -->
+          <a class="blog-nav-item" href="/language-packs/">Language packs</a>
+          <a class="blog-nav-item" href="/data/">Datasets</a>
+          <a class="blog-nav-item" href="/support/">Support</a>
+          <a class="blog-nav-item" href="/contributors.html">Contributors</a>
+        </nav>
+      </div>
+    </div>
+
+    <div class="container">
+
+      <div class="row">
+
+        <div class="col-sm-2">
+          <div class="sidebar-module">
+            <!-- <h4>About</h4> -->
+            <center>
+            <img src="/images/joshua-logo-small.png" />
+            <p>Joshua machine translation toolkit</p>
+            </center>
+          </div>
+          <hr>
+          <center>
+            <a href="/releases/current/" target="_blank"><button 
class="button">Download Joshua 6.0.5</button></a>
+            <br />
+            <a href="/releases/runtime/" target="_blank"><button 
class="button">Runtime only version</button></a>
+            <p>Released November 5, 2015</p>
+          </center>
+          <hr>
+          <!-- <div class="sidebar-module"> -->
+          <!--   <span id="download"> -->
+          <!--     <a href="http://joshua-decoder.org/downloads/joshua-6.0.tgz">Download</a> -->
+          <!--   </span> -->
+          <!-- </div> -->
+          <div class="sidebar-module">
+            <h4>Using Joshua</h4>
+            <ol class="list-unstyled">
+              <li><a href="/6.0/install.html">Installation</a></li>
+              <li><a href="/6.0/quick-start.html">Quick Start</a></li>
+            </ol>
+          </div>
+          <hr>
+          <div class="sidebar-module">
+            <h4>Building new models</h4>
+            <ol class="list-unstyled">
+              <li><a href="/6.0/pipeline.html">Pipeline</a></li>
+              <li><a href="/6.0/tutorial.html">Tutorial</a></li>
+              <li><a href="/6.0/faq.html">FAQ</a></li>
+            </ol>
+          </div>
+<!--
+          <div class="sidebar-module">
+            <h4>Phrase-based</h4>
+            <ol class="list-unstyled">
+              <li><a href="/6.0/phrase.html">Training</a></li>
+            </ol>
+          </div>
+-->
+          <hr>
+          <div class="sidebar-module">
+            <h4>Advanced</h4>
+            <ol class="list-unstyled">
+              <li><a href="/6.0/bundle.html">Building language packs</a></li>
+              <li><a href="/6.0/decoder.html">Decoder options</a></li>
+              <li><a href="/6.0/file-formats.html">File formats</a></li>
+              <li><a href="/6.0/packing.html">Packing TMs</a></li>
+              <li><a href="/6.0/large-lms.html">Building large LMs</a></li>
+            </ol>
+          </div>
+
+          <hr> 
+          <div class="sidebar-module">
+            <h4>Developer</h4>
+            <ol class="list-unstyled">              
+               <li><a href="https://github.com/joshua-decoder/joshua">Github</a></li>
+               <li><a href="http://cs.jhu.edu/~post/joshua-docs">Javadoc</a></li>
+               <li><a href="https://groups.google.com/forum/?fromgroups#!forum/joshua_developers">Mailing list</a></li>
+            </ol>
+          </div>
+
+        </div><!-- /.blog-sidebar -->
+
+        
+        <div class="col-sm-8 blog-main">
+        
+
+          <div class="blog-title">
+            <h2>Building Translation Models</h2>
+          </div>
+          
+          <div class="blog-post">
+
+            <h1 id="build-a-translation-model">Build a translation model</h1>
+
+<p>Extracting a grammar from a large amount of data is a multi-step process. 
The first requirement is parallel data. The Europarl, Call Home, and Fisher 
corpora all contain parallel translations of Spanish and English sentences.</p>
+
+<p>We will copy (or symlink) the parallel source text files into a subdirectory called <code class="highlighter-rouge">input/</code>.</p>
+
+<p>Then, we concatenate all the training files on each side. The pipeline script normally does tokenization and normalization, but in this instance we have a custom tokenizer we need to apply to the source side, so we have to do it manually and then skip that step using the <code class="highlighter-rouge">pipeline.pl</code> option <code class="highlighter-rouge">--first-step align</code>.</p>
+
+<ul>
+  <li>
+    <p>To tokenize the English data, run:</p>
+
+    <div class="highlighter-rouge"><pre class="highlight"><code>cat callhome.en europarl.en fisher.en | tee all.en \
+  | $JOSHUA/scripts/training/normalize-punctuation.pl en \
+  | $JOSHUA/scripts/training/penn-treebank-tokenizer.perl \
+  | $JOSHUA/scripts/lowercase.perl &gt; all.norm.tok.lc.en
+</code></pre>
+    </div>
+  </li>
+</ul>
+
+<p>The same can be done for the Spanish side of the input data:</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>cat callhome.es europarl.es fisher.es | tee all.es \
+  | $JOSHUA/scripts/training/normalize-punctuation.pl es \
+  | $JOSHUA/scripts/training/penn-treebank-tokenizer.perl \
+  | $JOSHUA/scripts/lowercase.perl &gt; all.norm.tok.lc.es
+</code></pre>
+</div>
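+
+<p>The English and Spanish commands differ only in the language code, so they can be folded into one function. The following is a minimal sketch, not part of Joshua: the name <code class="highlighter-rouge">prep_side</code> is hypothetical, and it assumes the three scripts are executable at the paths shown above.</p>

```shell
# Hypothetical helper (not shipped with Joshua): concatenate, normalize,
# tokenize, and lowercase one side of the parallel data.
# Usage: prep_side LANG FILE...
# Writes all.LANG (raw concatenation) and all.norm.tok.lc.LANG.
prep_side() {
  local lang=$1
  shift
  cat "$@" \
    | tee "all.$lang" \
    | "$JOSHUA/scripts/training/normalize-punctuation.pl" "$lang" \
    | "$JOSHUA/scripts/training/penn-treebank-tokenizer.perl" \
    | "$JOSHUA/scripts/lowercase.perl" \
    > "all.norm.tok.lc.$lang"
}
```

+<p>Under those assumptions, <code class="highlighter-rouge">prep_side en callhome.en europarl.en fisher.en</code> followed by <code class="highlighter-rouge">prep_side es callhome.es europarl.es fisher.es</code> would produce the same four files as the two commands above.</p>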
+
+<p>By the way, an alternative tokenizer is a Twitter tokenizer found in the <a href="http://github.com/vandurme/jerboa">Jerboa</a> project.</p>
+
+<p>The final step in the training data preparation is to remove all examples 
in which either of the language sides is a blank line.</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>paste 
all.norm.tok.lc.es all.norm.tok.lc.en | grep -Pv "^\t|\t$" \
+  | ./splittabs.pl all.norm.tok.lc.noblanks.es all.norm.tok.lc.noblanks.en
+</code></pre>
+</div>
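+
+<p>The <code class="highlighter-rouge">-P</code> flag used above is a GNU grep extension and may not be available everywhere. As a rough alternative, here is a sketch of the same filter using only <code class="highlighter-rouge">paste</code> and <code class="highlighter-rouge">awk</code>. The function name <code class="highlighter-rouge">drop_blank_pairs</code> is hypothetical, the sketch assumes that no sentence contains a tab, and unlike the grep version it also drops pairs where one side consists only of spaces.</p>

```shell
# Hypothetical helper (not part of Joshua): keep only the sentence pairs in
# which both sides contain at least one non-space character.
# Usage: drop_blank_pairs SRC_IN TGT_IN SRC_OUT TGT_OUT
drop_blank_pairs() {
  local src_in=$1 tgt_in=$2 src_out=$3 tgt_out=$4
  # Create the outputs up front so they exist even if no pair survives.
  : > "$src_out"
  : > "$tgt_out"
  paste "$src_in" "$tgt_in" \
    | awk -F'\t' -v s="$src_out" -v t="$tgt_out" '
        # Field 1 is the source sentence, field 2 the target sentence.
        $1 ~ /[^ ]/ && $2 ~ /[^ ]/ { print $1 > s; print $2 > t }'
}
```

+<p>For the files used here, the call would be <code class="highlighter-rouge">drop_blank_pairs all.norm.tok.lc.es all.norm.tok.lc.en all.norm.tok.lc.noblanks.es all.norm.tok.lc.noblanks.en</code>.</p>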
+
+<p>Contents of <code class="highlighter-rouge">splittabs.pl</code> by Matt Post:</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code><span 
class="c1">#!/usr/bin/perl</span>
+
+<span class="c1"># splits on tab, printing respective chunks to the list of 
files given</span>
+<span class="c1"># as script arguments</span>
+
+<span class="k">use</span> <span class="nv">FileHandle</span><span 
class="p">;</span>
+
+<span class="k">my</span> <span class="nv">@fh</span><span class="p">;</span>
+<span class="vg">$|</span> <span class="o">=</span> <span 
class="mi">1</span><span class="p">;</span>   <span class="c1"># don't buffer 
output</span>
+
+<span class="k">if</span> <span class="p">(</span><span class="nv">@ARGV</span> <span class="o">&lt;</span> <span class="mi">1</span><span class="p">)</span> <span class="p">{</span>
+  <span class="k">print</span> <span class="s">"Usage: splittabs.pl file1 [file2 ...] &lt; tabbed-file\n"</span><span class="p">;</span>
+  <span class="nb">exit</span><span class="p">;</span>
+<span class="p">}</span>
+
+<span class="k">my</span> <span class="nv">@fh</span> <span class="o">=</span> 
<span class="nb">map</span> <span class="p">{</span> <span 
class="nv">get_filehandle</span><span class="p">(</span><span 
class="nv">$_</span><span class="p">)</span> <span class="p">}</span> <span 
class="nv">@ARGV</span><span class="p">;</span>
+<span class="nv">@ARGV</span> <span class="o">=</span> <span 
class="p">();</span>
+
+<span class="k">while</span> <span class="p">(</span><span class="k">my</span> 
<span class="nv">$line</span> <span class="o">=</span> <span 
class="o">&lt;&gt;</span><span class="p">)</span> <span class="p">{</span>
+  <span class="nb">chomp</span><span class="p">(</span><span 
class="nv">$line</span><span class="p">);</span>
+  <span class="k">my</span> <span class="p">(</span><span 
class="nv">@fields</span><span class="p">)</span> <span class="o">=</span> 
<span class="nb">split</span><span class="p">(</span><span 
class="sr">/\t/</span><span class="p">,</span><span 
class="nv">$line</span><span class="p">,</span><span class="nb">scalar</span> 
<span class="nv">@fh</span><span class="p">);</span>
+
+  <span class="nb">map</span> <span class="p">{</span> <span 
class="k">print</span> <span class="p">{</span><span class="nv">$fh</span><span 
class="p">[</span><span class="nv">$_</span><span class="p">]}</span> <span 
class="s">"$fields[$_]\n"</span> <span class="p">}</span> <span 
class="p">(</span><span class="mi">0</span><span class="o">..</span><span 
class="nv">$#fields</span><span class="p">);</span>
+<span class="p">}</span>
+
+<span class="k">sub </span><span class="nf">get_filehandle</span> <span 
class="p">{</span>
+    <span class="k">my</span> <span class="nv">$file</span> <span 
class="o">=</span> <span class="nb">shift</span><span class="p">;</span>
+
+    <span class="k">if</span> <span class="p">(</span><span 
class="nv">$file</span> <span class="ow">eq</span> <span 
class="s">"-"</span><span class="p">)</span> <span class="p">{</span>
+        <span class="k">return</span> <span class="o">*</span><span 
class="bp">STDOUT</span><span class="p">;</span>
+    <span class="p">}</span> <span class="k">else</span> <span 
class="p">{</span>
+        <span class="nb">local</span> <span class="o">*</span><span 
class="nv">FH</span><span class="p">;</span>
+        <span class="nb">open</span> <span class="nv">FH</span><span 
class="p">,</span> <span class="s">"&gt;$file"</span> <span 
class="ow">or</span> <span class="nb">die</span> <span class="s">"can't open 
'$file' for writing"</span><span class="p">;</span>
+        <span class="k">return</span> <span class="o">*</span><span 
class="nv">FH</span><span class="p">;</span>
+    <span class="p">}</span>
+<span class="p">}</span>
+</code></pre>
+</div>
+
+<p>Now we can run the pipeline to extract the grammar. Run the following 
script:</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code><span 
class="c">#!/bin/bash</span>
+
+<span class="c"># this creates a grammar</span>
+
+<span class="c"># NEED:</span>
+<span class="c"># pair</span>
+<span class="c"># type</span>
+
+<span class="nb">set</span> -u
+
+<span class="nv">pair</span><span class="o">=</span>es-en
+<span class="nb">type</span><span class="o">=</span>hiero
+
+<span class="c">#. ~/.bashrc</span>
+
+<span class="c">#basedir=$(pwd)</span>
+
+<span class="nv">dir</span><span class="o">=</span>grammar-<span 
class="nv">$pair</span>-<span class="nv">$type</span>
+
+<span class="o">[[</span> ! -d <span class="nv">$dir</span> <span 
class="o">]]</span> <span class="o">&amp;&amp;</span> mkdir -p <span 
class="nv">$dir</span>
+<span class="nb">cd</span> <span class="nv">$dir</span>
+
+<span class="nb">source</span><span class="o">=</span><span 
class="k">$(</span><span class="nb">echo</span> <span class="nv">$pair</span> | 
cut -d- -f 1<span class="k">)</span>
+<span class="nv">target</span><span class="o">=</span><span 
class="k">$(</span><span class="nb">echo</span> <span class="nv">$pair</span> | 
cut -d- -f 2<span class="k">)</span>
+
+<span class="nv">$JOSHUA</span>/scripts/training/pipeline.pl <span 
class="se">\</span>
+  --source <span class="nv">$source</span> <span class="se">\</span>
+  --target <span class="nv">$target</span> <span class="se">\</span>
+  --corpus 
/home/hltcoe/lorland/expts/scale12/model1/input/all.norm.tok.lc.noblanks <span 
class="se">\</span>
+  --type <span class="nv">$type</span> <span class="se">\</span>
+  --joshua-mem 100g <span class="se">\</span>
+  --no-prepare <span class="se">\</span>
+  --first-step align <span class="se">\</span>
+  --last-step thrax <span class="se">\</span>
+  --hadoop <span class="nv">$HADOOP</span> <span class="se">\</span>
+  --threads 8
+</code></pre>
+</div>
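+
+<p>Since the aligner works on sentence pairs, both sides of the corpus must have the same number of lines. Before launching a long pipeline run it can be worth checking this. The following sketch is not part of Joshua and the function name <code class="highlighter-rouge">check_parallel</code> is hypothetical. It takes the same prefix that is passed to <code class="highlighter-rouge">--corpus</code> plus the two language extensions.</p>

```shell
# Hypothetical pre-flight check (not part of Joshua): verify that
# PREFIX.SRC and PREFIX.TGT exist and have the same number of lines.
# Usage: check_parallel PREFIX SRC TGT
check_parallel() {
  local prefix=$1 src=$2 tgt=$3
  local f n_src n_tgt
  for f in "$prefix.$src" "$prefix.$tgt"; do
    if [ ! -r "$f" ]; then
      echo "cannot read $f" >&2
      return 2
    fi
  done
  # tr strips the padding that some wc implementations add.
  n_src=$(wc -l < "$prefix.$src" | tr -d ' ')
  n_tgt=$(wc -l < "$prefix.$tgt" | tr -d ' ')
  if [ "$n_src" -ne "$n_tgt" ]; then
    echo "line count mismatch: $prefix.$src=$n_src $prefix.$tgt=$n_tgt" >&2
    return 1
  fi
  echo "$n_src sentence pairs"
}
```

+<p>With the file names from the previous step, this could be run as <code class="highlighter-rouge">check_parallel input/all.norm.tok.lc.noblanks es en</code>.</p>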
+
+
+
+        </div>
+
+      </div><!-- /.row -->
+
+      
+        
+    </div><!-- /.container -->
+
+    <!-- Bootstrap core JavaScript
+    ================================================== -->
+    <!-- Placed at the end of the document so the pages load faster -->
+    <script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js"></script>
+    <script src="../../dist/js/bootstrap.min.js"></script>
+    <!-- <script src="../../assets/js/docs.min.js"></script> -->
+    <!-- IE10 viewport hack for Surface/desktop Windows 8 bug -->
+    <!-- <script 
src="../../assets/js/ie10-viewport-bug-workaround.js"></script>
+    -->
+
+    <!-- Start of StatCounter Code for Default Guide -->
+    <script type="text/javascript">
+      var sc_project=8264132; 
+      var sc_invisible=1; 
+      var sc_security="4b97fe2d"; 
+    </script>
+    <script type="text/javascript" src="http://www.statcounter.com/counter/counter.js"></script>
+    <noscript>
+      <div class="statcounter">
+        <a title="hit counter joomla" 
+           href="http://statcounter.com/joomla/"
+           target="_blank">
+          <img class="statcounter"
+               src="http://c.statcounter.com/8264132/0/4b97fe2d/1/"
+               alt="hit counter joomla" />
+        </a>
+      </div>
+    </noscript>
+    <!-- End of StatCounter Code for Default Guide -->
+  </body>
+</html>
+

http://git-wip-us.apache.org/repos/asf/incubator-joshua-site/blob/53cc3005/6.0/tms.md
----------------------------------------------------------------------
diff --git a/6.0/tms.md b/6.0/tms.md
deleted file mode 100644
index 7ce5e9d..0000000
--- a/6.0/tms.md
+++ /dev/null
@@ -1,106 +0,0 @@
----
-layout: default6
-category: advanced
-title: Building Translation Models
----
-
-# Build a translation model
-
-Extracting a grammar from a large amount of data is a multi-step process. The 
first requirement is parallel data. The Europarl, Call Home, and Fisher corpora 
all contain parallel translations of Spanish and English sentences.
-
-We will copy (or symlink) the parallel source text files into a subdirectory called `input/`.
-
-Then, we concatenate all the training files on each side. The pipeline script normally does tokenization and normalization, but in this instance we have a custom tokenizer we need to apply to the source side, so we have to do it manually and then skip that step using the `pipeline.pl` option `--first-step align`.
-
-* To tokenize the English data, run
-
-    cat callhome.en europarl.en fisher.en | tee all.en | $JOSHUA/scripts/training/normalize-punctuation.pl en | $JOSHUA/scripts/training/penn-treebank-tokenizer.perl | $JOSHUA/scripts/lowercase.perl > all.norm.tok.lc.en
-
-The same can be done for the Spanish side of the input data:
-
-    cat callhome.es europarl.es fisher.es | tee all.es | $JOSHUA/scripts/training/normalize-punctuation.pl es | $JOSHUA/scripts/training/penn-treebank-tokenizer.perl | $JOSHUA/scripts/lowercase.perl > all.norm.tok.lc.es
-
-By the way, an alternative tokenizer is a Twitter tokenizer found in the 
[Jerboa](http://github.com/vandurme/jerboa) project.
-
-The final step in the training data preparation is to remove all examples in 
which either of the language sides is a blank line.
-
-    paste all.norm.tok.lc.es all.norm.tok.lc.en | grep -Pv "^\t|\t$" \
-      | ./splittabs.pl all.norm.tok.lc.noblanks.es all.norm.tok.lc.noblanks.en
-
-Contents of `splittabs.pl` by Matt Post:
-
-    #!/usr/bin/perl
-
-    # splits on tab, printing respective chunks to the list of files given
-    # as script arguments
-
-    use FileHandle;
-
-    my @fh;
-    $| = 1;   # don't buffer output
-
-    if (@ARGV < 1) {
-      print "Usage: splittabs.pl file1 [file2 ...] < tabbed-file\n";
-      exit;
-    }
-
-    my @fh = map { get_filehandle($_) } @ARGV;
-    @ARGV = ();
-
-    while (my $line = <>) {
-      chomp($line);
-      my (@fields) = split(/\t/,$line,scalar @fh);
-
-      map { print {$fh[$_]} "$fields[$_]\n" } (0..$#fields);
-    }
-
-    sub get_filehandle {
-        my $file = shift;
-
-        if ($file eq "-") {
-            return *STDOUT;
-        } else {
-            local *FH;
-            open FH, ">$file" or die "can't open '$file' for writing";
-            return *FH;
-        }
-    }
-
-Now we can run the pipeline to extract the grammar. Run the following script:
-
-    #!/bin/bash
-
-    # this creates a grammar
-
-    # NEED:
-    # pair
-    # type
-
-    set -u
-
-    pair=es-en
-    type=hiero
-
-    #. ~/.bashrc
-
-    #basedir=$(pwd)
-
-    dir=grammar-$pair-$type
-
-    [[ ! -d $dir ]] && mkdir -p $dir
-    cd $dir
-
-    source=$(echo $pair | cut -d- -f 1)
-    target=$(echo $pair | cut -d- -f 2)
-
-    $JOSHUA/scripts/training/pipeline.pl \
-      --source $source \
-      --target $target \
-      --corpus 
/home/hltcoe/lorland/expts/scale12/model1/input/all.norm.tok.lc.noblanks \
-      --type $type \
-      --joshua-mem 100g \
-      --no-prepare \
-      --first-step align \
-      --last-step thrax \
-      --hadoop $HADOOP \
-      --threads 8

http://git-wip-us.apache.org/repos/asf/incubator-joshua-site/blob/53cc3005/6.0/tutorial.html
----------------------------------------------------------------------
diff --git a/6.0/tutorial.html b/6.0/tutorial.html
new file mode 100644
index 0000000..6302461
--- /dev/null
+++ b/6.0/tutorial.html
@@ -0,0 +1,407 @@
+<!DOCTYPE html>
+<html lang="en">
+  <head>
+    <meta charset="utf-8">
+    <meta http-equiv="X-UA-Compatible" content="IE=edge">
+    <meta name="viewport" content="width=device-width, initial-scale=1">
+    <meta name="description" content="">
+    <meta name="author" content="">
+    <link rel="icon" href="../../favicon.ico">
+
+    <title>Joshua Documentation | Pipeline tutorial</title>
+
+    <!-- Bootstrap core CSS -->
+    <link href="/dist/css/bootstrap.min.css" rel="stylesheet">
+
+    <!-- Custom styles for this template -->
+    <link href="/joshua6.css" rel="stylesheet">
+  </head>
+
+  <body>
+
+    <div class="blog-masthead">
+      <div class="container">
+        <nav class="blog-nav">
+          <!-- <a class="blog-nav-item active" href="#">Joshua</a> -->
+          <a class="blog-nav-item" href="/">Joshua</a>
+          <!-- <a class="blog-nav-item" href="/6.0/whats-new.html">New 
features</a> -->
+          <a class="blog-nav-item" href="/language-packs/">Language packs</a>
+          <a class="blog-nav-item" href="/data/">Datasets</a>
+          <a class="blog-nav-item" href="/support/">Support</a>
+          <a class="blog-nav-item" href="/contributors.html">Contributors</a>
+        </nav>
+      </div>
+    </div>
+
+    <div class="container">
+
+      <div class="row">
+
+        <div class="col-sm-2">
+          <div class="sidebar-module">
+            <!-- <h4>About</h4> -->
+            <center>
+            <img src="/images/joshua-logo-small.png" />
+            <p>Joshua machine translation toolkit</p>
+            </center>
+          </div>
+          <hr>
+          <center>
+            <a href="/releases/current/" target="_blank"><button 
class="button">Download Joshua 6.0.5</button></a>
+            <br />
+            <a href="/releases/runtime/" target="_blank"><button 
class="button">Runtime only version</button></a>
+            <p>Released November 5, 2015</p>
+          </center>
+          <hr>
+          <!-- <div class="sidebar-module"> -->
+          <!--   <span id="download"> -->
+          <!--     <a href="http://joshua-decoder.org/downloads/joshua-6.0.tgz">Download</a> -->
+          <!--   </span> -->
+          <!-- </div> -->
+          <div class="sidebar-module">
+            <h4>Using Joshua</h4>
+            <ol class="list-unstyled">
+              <li><a href="/6.0/install.html">Installation</a></li>
+              <li><a href="/6.0/quick-start.html">Quick Start</a></li>
+            </ol>
+          </div>
+          <hr>
+          <div class="sidebar-module">
+            <h4>Building new models</h4>
+            <ol class="list-unstyled">
+              <li><a href="/6.0/pipeline.html">Pipeline</a></li>
+              <li><a href="/6.0/tutorial.html">Tutorial</a></li>
+              <li><a href="/6.0/faq.html">FAQ</a></li>
+            </ol>
+          </div>
+<!--
+          <div class="sidebar-module">
+            <h4>Phrase-based</h4>
+            <ol class="list-unstyled">
+              <li><a href="/6.0/phrase.html">Training</a></li>
+            </ol>
+          </div>
+-->
+          <hr>
+          <div class="sidebar-module">
+            <h4>Advanced</h4>
+            <ol class="list-unstyled">
+              <li><a href="/6.0/bundle.html">Building language packs</a></li>
+              <li><a href="/6.0/decoder.html">Decoder options</a></li>
+              <li><a href="/6.0/file-formats.html">File formats</a></li>
+              <li><a href="/6.0/packing.html">Packing TMs</a></li>
+              <li><a href="/6.0/large-lms.html">Building large LMs</a></li>
+            </ol>
+          </div>
+
+          <hr> 
+          <div class="sidebar-module">
+            <h4>Developer</h4>
+            <ol class="list-unstyled">              
+               <li><a href="https://github.com/joshua-decoder/joshua">Github</a></li>
+               <li><a href="http://cs.jhu.edu/~post/joshua-docs">Javadoc</a></li>
+               <li><a href="https://groups.google.com/forum/?fromgroups#!forum/joshua_developers">Mailing list</a></li>
+            </ol>
+          </div>
+
+        </div><!-- /.blog-sidebar -->
+
+        
+        <div class="col-sm-8 blog-main">
+        
+
+          <div class="blog-title">
+            <h2>Pipeline tutorial</h2>
+          </div>
+          
+          <div class="blog-post">
+
+            <p>This document will walk you through using the pipeline in a 
variety of scenarios. Once you’ve gained a
+sense for how the pipeline works, you can consult the <a 
href="pipeline.html">pipeline page</a> for a number of
+other options available in the pipeline.</p>
+
+<h2 id="download-and-setup">Download and Setup</h2>
+
+<p>Download and install Joshua as described on the <a href="index.html">quick start page</a>, installing it under
+<code class="highlighter-rouge">~/code/</code>. Once you’ve done that, make sure you have the following environment variables set:</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>export 
JOSHUA=$HOME/code/joshua-v6.0.5
+export JAVA_HOME=/usr/java/default
+</code></pre>
+</div>
+
+<p>If you have a Hadoop installation, make sure you’ve set <code 
class="highlighter-rouge">$HADOOP</code> to point to it. For example, if the 
<code class="highlighter-rouge">hadoop</code> command is in <code 
class="highlighter-rouge">/usr/bin</code>,
+you should type</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>export HADOOP=/usr
+</code></pre>
+</div>
+
+<p>Joshua will find the binary and use it to submit jobs to your Hadoop cluster. If you don’t have one, just
+make sure that <code class="highlighter-rouge">$HADOOP</code> is unset, and Joshua will roll one out for you and run it in
+<a href="https://hadoop.apache.org/docs/r1.2.1/single_node_setup.html">standalone mode</a>.</p>
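+
+<p>Before going further, it can help to confirm that these variables point where you expect. The function below is a minimal sketch, not part of Joshua: the name <code class="highlighter-rouge">check_joshua_env</code> is hypothetical, and it assumes only what this section states, namely that the pipeline script lives at <code class="highlighter-rouge">$JOSHUA/bin/pipeline.pl</code> and that the <code class="highlighter-rouge">hadoop</code> binary, if you have one, is at <code class="highlighter-rouge">$HADOOP/bin/hadoop</code>.</p>
+

```shell
# Hypothetical helper, not shipped with Joshua: sanity-check the environment
# variables described above. Runs in a subshell so it leaves no variables behind.
check_joshua_env() (
  status=0

  # JOSHUA must name an existing directory that contains bin/pipeline.pl.
  if [ ! -d "${JOSHUA:-}" ]; then
    echo "JOSHUA is unset or not a directory: '${JOSHUA:-}'" >&2
    status=1
  elif [ ! -f "$JOSHUA/bin/pipeline.pl" ]; then
    echo "cannot find $JOSHUA/bin/pipeline.pl" >&2
    status=1
  fi

  # JAVA_HOME must name an existing directory.
  if [ ! -d "${JAVA_HOME:-}" ]; then
    echo "JAVA_HOME is unset or not a directory: '${JAVA_HOME:-}'" >&2
    status=1
  fi

  # HADOOP is optional. If set, the hadoop binary should be at $HADOOP/bin/hadoop.
  if [ -n "${HADOOP:-}" ] && [ ! -x "$HADOOP/bin/hadoop" ]; then
    echo "HADOOP is set, but $HADOOP/bin/hadoop is not executable" >&2
    status=1
  fi

  [ "$status" -eq 0 ]
)
```

+
+<p>Calling <code class="highlighter-rouge">check_joshua_env</code> with no arguments prints one line per problem to standard error and returns a non-zero status if anything is missing.</p>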
+
+<h2 id="a-basic-pipeline-run">A basic pipeline run</h2>
+
+<p>For today’s experiments, we’ll be building a Spanish–English system 
using data included in the
+<a href="/data/fisher-callhome-corpus/">Fisher and CALLHOME translation 
corpus</a>. This
+data was collected by translating transcribed speech from previous LDC 
releases.</p>
+
+<p>Download the data and install it somewhere:</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>cd ~/data
+wget --no-check-certificate -O fisher-callhome-corpus.zip https://github.com/joshua-decoder/fisher-callhome-corpus/archive/master.zip
+unzip fisher-callhome-corpus.zip
+</code></pre>
+</div>
+
+<p>Then define the environment variable <code 
class="highlighter-rouge">$FISHER</code> to point to it:</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>cd 
~/data/fisher-callhome-corpus-master
+export FISHER=$(pwd)
+</code></pre>
+</div>
+
+<h3 id="preparing-the-data">Preparing the data</h3>
+
+<p>Inside the archive is the Fisher and CALLHOME Spanish–English data, which includes Kaldi-provided
+ASR output and English translations of the Fisher and CALLHOME dataset transcriptions. Because of
+licensing restrictions, we cannot distribute the Spanish transcripts, but if you have an LDC site
+license, a script is provided to build them. You can type:</p>
+
+<div class="highlighter-rouge"><pre 
class="highlight"><code>./bin/build_fisher.sh 
/export/common/data/corpora/LDC/LDC2010T04
+</code></pre>
+</div>
+
+<p>Here, the first argument is the path to your LDC data release. This will create the files in <code class="highlighter-rouge">corpus/ldc</code>.</p>
+
+<p>In <code class="highlighter-rouge">$FISHER/corpus</code>, there is a set of parallel directories for LDC transcripts (<code class="highlighter-rouge">ldc</code>), ASR output
+(<code class="highlighter-rouge">asr</code>), oracle ASR output (<code class="highlighter-rouge">oracle</code>), and ASR lattice output (<code class="highlighter-rouge">plf</code>). The files look like this:</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>$ ls corpus/ldc
+callhome_devtest.en  fisher_dev2.en.2  fisher_dev.en.2   fisher_test.en.2
+callhome_evltest.en  fisher_dev2.en.3  fisher_dev.en.3   fisher_test.en.3
+callhome_train.en    fisher_dev2.es    fisher_dev.es     fisher_test.es
+fisher_dev2.en.0     fisher_dev.en.0   fisher_test.en.0  fisher_train.en
+fisher_dev2.en.1     fisher_dev.en.1   fisher_test.en.1  fisher_train.es
+</code></pre>
+</div>
+
+<p>If you don’t have the LDC transcripts, you can use the data in <code 
class="highlighter-rouge">corpus/asr</code> instead. We will now use
+this data to build our own Spanish–English model using Joshua’s 
pipeline.</p>
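+
+<p>The pipeline options used below take file prefixes such as <code class="highlighter-rouge">corpus/ldc/fisher_train</code> rather than full file names, so a quick check that both sides of each prefix exist can save a failed run. The following is a minimal sketch, assuming the layout shown in the listing above (a <code class="highlighter-rouge">.es</code> source file, and either a single <code class="highlighter-rouge">.en</code> file or numbered references starting at <code class="highlighter-rouge">.en.0</code>). The name <code class="highlighter-rouge">check_corpus_prefix</code> is hypothetical; it is not a Joshua script.</p>
+

```shell
# Hypothetical helper, not a Joshua script: check that a corpus prefix has a
# source file and either one target file or numbered references (.0, .1, ...).
# Usage: check_corpus_prefix PREFIX SOURCE_LANG TARGET_LANG
check_corpus_prefix() (
  prefix=$1
  src=$2
  tgt=$3

  if [ ! -f "$prefix.$src" ]; then
    echo "missing source side: $prefix.$src" >&2
    exit 1
  fi
  if [ ! -f "$prefix.$tgt" ] && [ ! -f "$prefix.$tgt.0" ]; then
    echo "missing target side: $prefix.$tgt or $prefix.$tgt.0" >&2
    exit 1
  fi
)

# Example (once $FISHER is set):
#   for name in fisher_train fisher_dev fisher_dev2; do
#     check_corpus_prefix "$FISHER/corpus/ldc/$name" es en
#   done
```
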
+
+<h3 id="run-the-pipeline">Run the pipeline</h3>
+
+<p>Create an experiments directory to contain your first experiment. <em>Note: it’s important that
+this <strong>not</strong> be inside your <code class="highlighter-rouge">$JOSHUA</code> directory</em>.</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>mkdir -p ~/expts/joshua
+cd ~/expts/joshua
+</code></pre>
+</div>
+
+<p>We will now create the baseline run, using a particular directory structure for experiments that
+will allow us to take advantage of scripts provided with Joshua for displaying the results of many
+related experiments. Because a full run can take quite some time, we will shrink the model
+considerably by restricting the training data: Joshua will use only sentence pairs with ten or fewer
+words on each side (Spanish and English):</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>cd ~/expts/joshua
+$JOSHUA/bin/pipeline.pl           \
+  --rundir 1                      \
+  --readme "Baseline Hiero run"   \
+  --source es                     \
+  --target en                     \
+  --type hiero                    \
+  --corpus $FISHER/corpus/ldc/fisher_train \
+  --tune $FISHER/corpus/ldc/fisher_dev \
+  --test $FISHER/corpus/ldc/fisher_dev2 \
+  --maxlen 10 \
+  --lm-order 3
+</code></pre>
+</div>
+
+<p>This will start the pipeline building a Spanish–English translation system constructed from the
+training data, tuned against <code class="highlighter-rouge">fisher_dev</code>, and tested against <code class="highlighter-rouge">fisher_dev2</code>. It will use the
+default values for most of the pipeline: <a href="https://code.google.com/p/giza-pp/">GIZA++</a> for alignment,
+KenLM’s <code class="highlighter-rouge">lmplz</code> for building the language model, Z-MERT for tuning, KenLM with left-state
+minimization for representing LM state in the decoder, and so on. We change the order of the n-gram
+model to 3 (from its default of 5) because there is not enough data to build a 5-gram LM.</p>
+
+<p>A few notes:</p>
+
+<ul>
+  <li>
+    <p>This will likely take many hours to run, especially if you don’t have 
a Hadoop cluster.</p>
+  </li>
+  <li>
+    <p>If you are running on Mac OS X, KenLM’s <code 
class="highlighter-rouge">lmplz</code> will not build due to the absence of 
static
+libraries. In that case, you should add the flag <code 
class="highlighter-rouge">--lm-gen srilm</code> (recommended, if SRILM is
+installed) or <code class="highlighter-rouge">--lm-gen berkeleylm</code>.</p>
+  </li>
+</ul>
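+
+<p>If you launch the pipeline from a wrapper script, the Mac OS X note can be folded into it. The sketch below is hypothetical (<code class="highlighter-rouge">lm_gen_flag</code> is not part of Joshua) and assumes that <code class="highlighter-rouge">uname -s</code> reports <code class="highlighter-rouge">Darwin</code> on Mac OS X. It picks <code class="highlighter-rouge">--lm-gen berkeleylm</code>; swap in <code class="highlighter-rouge">--lm-gen srilm</code> if SRILM is installed.</p>
+

```shell
# Hypothetical helper: print an --lm-gen flag on Mac OS X, nothing elsewhere.
# An optional argument overrides the detected OS name (handy for testing).
lm_gen_flag() {
  case "${1:-$(uname -s)}" in
    Darwin) echo "--lm-gen berkeleylm" ;;  # or "--lm-gen srilm" if SRILM is installed
    *)      ;;                             # keep the default (KenLM's lmplz)
  esac
}

# Example: $JOSHUA/bin/pipeline.pl ... $(lm_gen_flag)
```
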
+
+<h3 id="variations">Variations</h3>
+
+<p>Once that is finished, you will have a baseline model. From there, you 
might wish to try variations
+of the baseline model. Here are some examples of what you could vary:</p>
+
+<ul>
+  <li>
+    <p>Build an SAMT model (<code class="highlighter-rouge">--type samt</code>), GHKM model (<code class="highlighter-rouge">--type ghkm</code>), or phrasal ITG model (<code class="highlighter-rouge">--type phrasal</code>)</p>
+  </li>
+  <li>
+    <p>Use the Berkeley aligner instead of GIZA++ (<code 
class="highlighter-rouge">--aligner berkeley</code>)</p>
+  </li>
+  <li>
+    <p>Build the language model with BerkeleyLM (<code class="highlighter-rouge">--lm-gen berkeleylm</code>) instead of KenLM (the default)</p>
+  </li>
+  <li>
+    <p>Change the order of the LM from the default of 5 (<code 
class="highlighter-rouge">--lm-order 4</code>)</p>
+  </li>
+  <li>
+    <p>Tune with MIRA instead of MERT (<code class="highlighter-rouge">--tuner 
mira</code>). This requires that Moses is installed.</p>
+  </li>
+  <li>
+    <p>Decode with a wider beam (<code class="highlighter-rouge">--joshua-args 
'-pop-limit 200'</code>) (the default is 100)</p>
+  </li>
+  <li>
+    <p>Add the provided BN-EN dictionary to the training data (add another 
<code class="highlighter-rouge">--corpus</code> line, e.g., <code 
class="highlighter-rouge">--corpus $FISHER/bn-en/dict.bn-en</code>)</p>
+  </li>
+</ul>
+
+<p>To do this, we will create new runs that partially reuse the results of previous runs. This is
+possible by doing three things: (1) incrementing the run directory and providing an updated README
+note; (2) telling the pipeline which of its many steps to begin at; and (3)
+providing the needed dependencies.</p>
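+
+<p>Step (1) is easy to script. The following is a minimal sketch, assuming that run directories are plain integers (1, 2, 3, and so on) inside the current experiments directory, as in this tutorial. The name <code class="highlighter-rouge">next_rundir</code> is hypothetical; it is not a Joshua command.</p>
+

```shell
# Hypothetical helper: print the lowest positive integer that is not yet
# used as a file or directory name in the current directory.
next_rundir() (
  n=1
  while [ -e "$n" ]; do
    n=$((n + 1))
  done
  echo "$n"
)

# Example: $JOSHUA/bin/pipeline.pl --rundir "$(next_rundir)" --readme "..." ...
```
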
+
+<h2 id="a-second-run">A second run</h2>
+
+<p>Let’s begin by changing the tuner, to see what effect that has. To do so, 
we change the run
+directory, tell the pipeline to start at the tuning step, and provide the 
needed dependencies:</p>
+
+<div class="highlighter-rouge"><pre 
class="highlight"><code>$JOSHUA/bin/pipeline.pl           \
+  --rundir 2                      \
+  --readme "Tuning with MIRA"     \
+  --source es                     \
+  --target en                     \
+  --corpus $FISHER/corpus/ldc/fisher_train \
+  --tune $FISHER/corpus/ldc/fisher_dev \
+  --test $FISHER/corpus/ldc/fisher_dev2 \
+  --first-step tune \
+  --tuner mira \
+  --grammar 1/grammar.gz \
+  --no-corpus-lm \
+  --lmfile 1/lm.gz
+</code></pre>
+</div>
+
+<p>Here, we have essentially the same invocation, but we have told the pipeline to use a different
+ tuner (MIRA), to start with tuning, and have provided it with the language model file and grammar it needs
+ to execute the tuning step.</p>
+
+<p>Note that we have also told it not to build a language model. This is 
necessary because the
+ pipeline always builds an LM on the target side of the training data, if 
provided, but we are
+ supplying the language model that was already built. We could equivalently 
have removed the
+ <code class="highlighter-rouge">--corpus</code> line.</p>
+
+<h2 id="changing-the-model-type">Changing the model type</h2>
+
+<p>Let’s compare the Hiero model we’ve already built to an SAMT model. We 
have to reextract the
+grammar, but can reuse the alignments and the language model:</p>
+
+<div class="highlighter-rouge"><pre 
class="highlight"><code>$JOSHUA/bin/pipeline.pl           \
+  --rundir 3                      \
+  --readme "Baseline SAMT model"  \
+  --source es                     \
+  --target en                     \
+  --type samt                     \
+  --corpus $FISHER/corpus/ldc/fisher_train \
+  --tune $FISHER/corpus/ldc/fisher_dev \
+  --test $FISHER/corpus/ldc/fisher_dev2 \
+  --alignment 1/alignments/training.align   \
+  --first-step parse \
+  --no-corpus-lm \
+  --lmfile 1/lm.gz
+</code></pre>
+</div>
+
+<p>See <a href="pipeline.html#steps">the pipeline script page</a> for a list 
of all the steps.</p>
+
+<h2 id="analyzing-the-results">Analyzing the results</h2>
+
+<p>We now have three runs, in subdirectories 1, 2, and 3. We can display 
summary results from them
+using the <code 
class="highlighter-rouge">$JOSHUA/scripts/training/summarize.pl</code> 
script.</p>
+
+
+
+        </div>
+
+      </div><!-- /.row -->
+
+      
+        
+    </div><!-- /.container -->
+
+    <!-- Bootstrap core JavaScript
+    ================================================== -->
+    <!-- Placed at the end of the document so the pages load faster -->
+    <script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js"></script>
+    <script src="../../dist/js/bootstrap.min.js"></script>
+    <!-- <script src="../../assets/js/docs.min.js"></script> -->
+    <!-- IE10 viewport hack for Surface/desktop Windows 8 bug -->
+    <!-- <script 
src="../../assets/js/ie10-viewport-bug-workaround.js"></script>
+    -->
+
+    <!-- Start of StatCounter Code for Default Guide -->
+    <script type="text/javascript">
+      var sc_project=8264132; 
+      var sc_invisible=1; 
+      var sc_security="4b97fe2d"; 
+    </script>
+    <script type="text/javascript" src="http://www.statcounter.com/counter/counter.js"></script>
+    <noscript>
+      <div class="statcounter">
+        <a title="hit counter joomla" 
+           href="http://statcounter.com/joomla/"
+           target="_blank">
+          <img class="statcounter"
+               src="http://c.statcounter.com/8264132/0/4b97fe2d/1/"
+               alt="hit counter joomla" />
+        </a>
+      </div>
+    </noscript>
+    <!-- End of StatCounter Code for Default Guide -->
+  </body>
+</html>
+

http://git-wip-us.apache.org/repos/asf/incubator-joshua-site/blob/53cc3005/6.0/tutorial.md
----------------------------------------------------------------------
diff --git a/6.0/tutorial.md b/6.0/tutorial.md
deleted file mode 100644
index 482162f..0000000
--- a/6.0/tutorial.md
+++ /dev/null
@@ -1,187 +0,0 @@
----
-layout: default6
-category: links
-title: Pipeline tutorial
----
-
-This document will walk you through using the pipeline in a variety of 
scenarios. Once you've gained a
-sense for how the pipeline works, you can consult the [pipeline 
page](pipeline.html) for a number of
-other options available in the pipeline.
-
-## Download and Setup
-
-Download and install Joshua as described on the [quick start page](index.html), installing it under
-`~/code/`. Once you've done that, make sure you have the following environment variables set:
-
-    export JOSHUA=$HOME/code/joshua-v{{ site.data.joshua.release_version }}
-    export JAVA_HOME=/usr/java/default
-
-If you have a Hadoop installation, make sure you've set `$HADOOP` to point to 
it. For example, if the `hadoop` command is in `/usr/bin`,
-you should type
-
-    export HADOOP=/usr
-
-Joshua will find the binary and use it to submit jobs to your Hadoop cluster. If you don't have one, just
-make sure that `$HADOOP` is unset, and Joshua will roll one out for you and run it in
-[standalone mode](https://hadoop.apache.org/docs/r1.2.1/single_node_setup.html).
-
-## A basic pipeline run
-
-For today's experiments, we'll be building a Spanish--English system using 
data included in the
-[Fisher and CALLHOME translation corpus](/data/fisher-callhome-corpus/). This
-data was collected by translating transcribed speech from previous LDC 
releases.
-
-Download the data and install it somewhere:
-
-    cd ~/data
-    wget --no-check-certificate -O fisher-callhome-corpus.zip https://github.com/joshua-decoder/fisher-callhome-corpus/archive/master.zip
-    unzip fisher-callhome-corpus.zip
-
-Then define the environment variable `$FISHER` to point to it:
-
-    cd ~/data/fisher-callhome-corpus-master
-    export FISHER=$(pwd)
-    
-### Preparing the data
-
-Inside the archive is the Fisher and CALLHOME Spanish--English data, which includes Kaldi-provided
-ASR output and English translations of the Fisher and CALLHOME dataset transcriptions. Because of
-licensing restrictions, we cannot distribute the Spanish transcripts, but if you have an LDC site
-license, a script is provided to build them. You can type:
-
-    ./bin/build_fisher.sh /export/common/data/corpora/LDC/LDC2010T04
-
-Here, the first argument is the path to your LDC data release. This will create the files in `corpus/ldc`.
-
-In `$FISHER/corpus`, there is a set of parallel directories for LDC transcripts (`ldc`), ASR output
-(`asr`), oracle ASR output (`oracle`), and ASR lattice output (`plf`). The files look like this:
-
-    $ ls corpus/ldc
-    callhome_devtest.en  fisher_dev2.en.2  fisher_dev.en.2   fisher_test.en.2
-    callhome_evltest.en  fisher_dev2.en.3  fisher_dev.en.3   fisher_test.en.3
-    callhome_train.en    fisher_dev2.es    fisher_dev.es     fisher_test.es
-    fisher_dev2.en.0     fisher_dev.en.0   fisher_test.en.0  fisher_train.en
-    fisher_dev2.en.1     fisher_dev.en.1   fisher_test.en.1  fisher_train.es
-
-If you don't have the LDC transcripts, you can use the data in `corpus/asr` 
instead. We will now use
-this data to build our own Spanish--English model using Joshua's pipeline.
-    
-### Run the pipeline
-
-Create an experiments directory to contain your first experiment. *Note: it's important that
-this **not** be inside your `$JOSHUA` directory*.
-
-    mkdir -p ~/expts/joshua
-    cd ~/expts/joshua
-    
-We will now create the baseline run, using a particular directory structure for experiments that
-will allow us to take advantage of scripts provided with Joshua for displaying the results of many
-related experiments. Because a full run can take quite some time, we will shrink the model
-considerably by restricting the training data: Joshua will use only sentence pairs with ten or fewer
-words on each side (Spanish and English):
-
-    cd ~/expts/joshua
-    $JOSHUA/bin/pipeline.pl           \
-      --rundir 1                      \
-      --readme "Baseline Hiero run"   \
-      --source es                     \
-      --target en                     \
-      --type hiero                    \
-      --corpus $FISHER/corpus/ldc/fisher_train \
-      --tune $FISHER/corpus/ldc/fisher_dev \
-      --test $FISHER/corpus/ldc/fisher_dev2 \
-      --maxlen 10 \
-      --lm-order 3
-      
-This will start the pipeline building a Spanish--English translation system constructed from the
-training data, tuned against `fisher_dev`, and tested against `fisher_dev2`. It will use the
-default values for most of the pipeline: [GIZA++](https://code.google.com/p/giza-pp/) for alignment,
-KenLM's `lmplz` for building the language model, Z-MERT for tuning, KenLM with left-state
-minimization for representing LM state in the decoder, and so on. We change the order of the n-gram
-model to 3 (from its default of 5) because there is not enough data to build a 5-gram LM.
-
-A few notes:
-
-- This will likely take many hours to run, especially if you don't have a 
Hadoop cluster.
-
-- If you are running on Mac OS X, KenLM's `lmplz` will not build due to the 
absence of static
-  libraries. In that case, you should add the flag `--lm-gen srilm` 
(recommended, if SRILM is
-  installed) or `--lm-gen berkeleylm`.
-
-### Variations
-
-Once that is finished, you will have a baseline model. From there, you might 
wish to try variations
-of the baseline model. Here are some examples of what you could vary:
-
-- Build an SAMT model (`--type samt`), GHKM model (`--type ghkm`), or phrasal ITG model (`--type phrasal`)
-   
-- Use the Berkeley aligner instead of GIZA++ (`--aligner berkeley`)
-   
-- Build the language model with BerkeleyLM (`--lm-gen berkeleylm`) instead of KenLM (the default)
-
-- Change the order of the LM from the default of 5 (`--lm-order 4`)
-
-- Tune with MIRA instead of MERT (`--tuner mira`). This requires that Moses is 
installed.
-   
-- Decode with a wider beam (`--joshua-args '-pop-limit 200'`) (the default is 
100)
-
-- Add the provided BN-EN dictionary to the training data (add another 
`--corpus` line, e.g., `--corpus $FISHER/bn-en/dict.bn-en`)
-
-To do this, we will create new runs that partially reuse the results of previous runs. This is
-possible by doing three things: (1) incrementing the run directory and providing an updated README
-note; (2) telling the pipeline which of its many steps to begin at; and (3)
-providing the needed dependencies.
-
-## A second run
-
-Let's begin by changing the tuner, to see what effect that has. To do so, we 
change the run
-directory, tell the pipeline to start at the tuning step, and provide the 
needed dependencies:
-
-    $JOSHUA/bin/pipeline.pl           \
-      --rundir 2                      \
-      --readme "Tuning with MIRA"     \
-      --source es                     \
-      --target en                     \
-      --corpus $FISHER/corpus/ldc/fisher_train \
-      --tune $FISHER/corpus/ldc/fisher_dev \
-      --test $FISHER/corpus/ldc/fisher_dev2 \
-      --first-step tune \
-      --tuner mira \
-      --grammar 1/grammar.gz \
-      --no-corpus-lm \
-      --lmfile 1/lm.gz
-      
- Here, we have essentially the same invocation, but we have told the pipeline to use a different
- tuner (MIRA), to start with tuning, and have provided it with the language model file and grammar it needs
- to execute the tuning step.
- 
- Note that we have also told it not to build a language model. This is 
necessary because the
- pipeline always builds an LM on the target side of the training data, if 
provided, but we are
- supplying the language model that was already built. We could equivalently 
have removed the
- `--corpus` line.
- 
-## Changing the model type
-
-Let's compare the Hiero model we've already built to an SAMT model. We have to 
reextract the
-grammar, but can reuse the alignments and the language model:
-
-    $JOSHUA/bin/pipeline.pl           \
-      --rundir 3                      \
-      --readme "Baseline SAMT model"  \
-      --source es                     \
-      --target en                     \
-      --type samt                     \
-      --corpus $FISHER/corpus/ldc/fisher_train \
-      --tune $FISHER/corpus/ldc/fisher_dev \
-      --test $FISHER/corpus/ldc/fisher_dev2 \
-      --alignment 1/alignments/training.align   \
-      --first-step parse \
-      --no-corpus-lm \
-      --lmfile 1/lm.gz
-
-See [the pipeline script page](pipeline.html#steps) for a list of all the 
steps.
-
-## Analyzing the results
-
-We now have three runs, in subdirectories 1, 2, and 3. We can display summary 
results from them
-using the `$JOSHUA/scripts/training/summarize.pl` script.
