http://git-wip-us.apache.org/repos/asf/incubator-joshua-site/blob/53cc3005/6.0/advanced.html
----------------------------------------------------------------------
diff --git a/6.0/advanced.html b/6.0/advanced.html
new file mode 100644
index 0000000..f4a3335
--- /dev/null
+++ b/6.0/advanced.html
@@ -0,0 +1,192 @@
+<!DOCTYPE html>
+<html lang="en">
+  <head>
+    <meta charset="utf-8">
+    <meta http-equiv="X-UA-Compatible" content="IE=edge">
+    <meta name="viewport" content="width=device-width, initial-scale=1">
+    <meta name="description" content="">
+    <meta name="author" content="">
+    <link rel="icon" href="../../favicon.ico">
+
+    <title>Joshua Documentation | Advanced features</title>
+
+    <!-- Bootstrap core CSS -->
+    <link href="/dist/css/bootstrap.min.css" rel="stylesheet">
+
+    <!-- Custom styles for this template -->
+    <link href="/joshua6.css" rel="stylesheet">
+  </head>
+
+  <body>
+
+    <div class="blog-masthead">
+      <div class="container">
+        <nav class="blog-nav">
+          <!-- <a class="blog-nav-item active" href="#">Joshua</a> -->
+          <a class="blog-nav-item" href="/">Joshua</a>
+          <!-- <a class="blog-nav-item" href="/6.0/whats-new.html">New 
features</a> -->
+          <a class="blog-nav-item" href="/language-packs/">Language packs</a>
+          <a class="blog-nav-item" href="/data/">Datasets</a>
+          <a class="blog-nav-item" href="/support/">Support</a>
+          <a class="blog-nav-item" href="/contributors.html">Contributors</a>
+        </nav>
+      </div>
+    </div>
+
+    <div class="container">
+
+      <div class="row">
+
+        <div class="col-sm-2">
+          <div class="sidebar-module">
+            <!-- <h4>About</h4> -->
+            <center>
+            <img src="/images/joshua-logo-small.png" />
+            <p>Joshua machine translation toolkit</p>
+            </center>
+          </div>
+          <hr>
+          <center>
+            <a href="/releases/current/" target="_blank"><button 
class="button">Download Joshua 6.0.5</button></a>
+            <br />
+            <a href="/releases/runtime/" target="_blank"><button 
class="button">Runtime only version</button></a>
+            <p>Released November 5, 2015</p>
+          </center>
+          <hr>
+          <!-- <div class="sidebar-module"> -->
+          <!--   <span id="download"> -->
+          <!--     <a href="http://joshua-decoder.org/downloads/joshua-6.0.tgz">Download</a> -->
+          <!--   </span> -->
+          <!-- </div> -->
+          <div class="sidebar-module">
+            <h4>Using Joshua</h4>
+            <ol class="list-unstyled">
+              <li><a href="/6.0/install.html">Installation</a></li>
+              <li><a href="/6.0/quick-start.html">Quick Start</a></li>
+            </ol>
+          </div>
+          <hr>
+          <div class="sidebar-module">
+            <h4>Building new models</h4>
+            <ol class="list-unstyled">
+              <li><a href="/6.0/pipeline.html">Pipeline</a></li>
+              <li><a href="/6.0/tutorial.html">Tutorial</a></li>
+              <li><a href="/6.0/faq.html">FAQ</a></li>
+            </ol>
+          </div>
+<!--
+          <div class="sidebar-module">
+            <h4>Phrase-based</h4>
+            <ol class="list-unstyled">
+              <li><a href="/6.0/phrase.html">Training</a></li>
+            </ol>
+          </div>
+-->
+          <hr>
+          <div class="sidebar-module">
+            <h4>Advanced</h4>
+            <ol class="list-unstyled">
+              <li><a href="/6.0/bundle.html">Building language packs</a></li>
+              <li><a href="/6.0/decoder.html">Decoder options</a></li>
+              <li><a href="/6.0/file-formats.html">File formats</a></li>
+              <li><a href="/6.0/packing.html">Packing TMs</a></li>
+              <li><a href="/6.0/large-lms.html">Building large LMs</a></li>
+            </ol>
+          </div>
+
+          <hr> 
+          <div class="sidebar-module">
+            <h4>Developer</h4>
+            <ol class="list-unstyled">              
+               <li><a href="https://github.com/joshua-decoder/joshua">Github</a></li>
+               <li><a href="http://cs.jhu.edu/~post/joshua-docs">Javadoc</a></li>
+               <li><a href="https://groups.google.com/forum/?fromgroups#!forum/joshua_developers">Mailing list</a></li>
+            </ol>
+          </div>
+
+        </div><!-- /.blog-sidebar -->
+
+        
+        <div class="col-sm-8 blog-main">
+        
+
+          <div class="blog-title">
+            <h2>Advanced features</h2>
+          </div>
+          
+          <div class="blog-post">
+
+            
+
+
+          <!--   <h4 class="blog-post-title">Welcome to Joshua!</h4> -->
+
+          <!--   <p>This blog post shows a few different types of content 
that's supported and styled with Bootstrap. Basic typography, images, and code 
are all supported.</p> -->
+          <!--   <hr> -->
+          <!--   <p>Cum sociis natoque penatibus et magnis <a href="#">dis 
parturient montes</a>, nascetur ridiculus mus. Aenean eu leo quam. Pellentesque 
ornare sem lacinia quam venenatis vestibulum. Sed posuere consectetur est at 
lobortis. Cras mattis consectetur purus sit amet fermentum.</p> -->
+          <!--   <blockquote> -->
+          <!--     <p>Curabitur blandit tempus porttitor. <strong>Nullam quis 
risus eget urna mollis</strong> ornare vel eu leo. Nullam id dolor id nibh 
ultricies vehicula ut id elit.</p> -->
+          <!--   </blockquote> -->
+          <!--   <p>Etiam porta <em>sem malesuada magna</em> mollis euismod. 
Cras mattis consectetur purus sit amet fermentum. Aenean lacinia bibendum nulla 
sed consectetur.</p> -->
+          <!--   <h2>Heading</h2> -->
+          <!--   <p>Vivamus sagittis lacus vel augue laoreet rutrum faucibus 
dolor auctor. Duis mollis, est non commodo luctus, nisi erat porttitor ligula, 
eget lacinia odio sem nec elit. Morbi leo risus, porta ac consectetur ac, 
vestibulum at eros.</p> -->
+          <!--   <h3>Sub-heading</h3> -->
+          <!--   <p>Cum sociis natoque penatibus et magnis dis parturient 
montes, nascetur ridiculus mus.</p> -->
+          <!--   <pre><code>Example code block</code></pre> -->
+          <!--   <p>Aenean lacinia bibendum nulla sed consectetur. Etiam porta 
sem malesuada magna mollis euismod. Fusce dapibus, tellus ac cursus commodo, 
tortor mauris condimentum nibh, ut fermentum massa.</p> -->
+          <!--   <h3>Sub-heading</h3> -->
+          <!--   <p>Cum sociis natoque penatibus et magnis dis parturient 
montes, nascetur ridiculus mus. Aenean lacinia bibendum nulla sed consectetur. 
Etiam porta sem malesuada magna mollis euismod. Fusce dapibus, tellus ac cursus 
commodo, tortor mauris condimentum nibh, ut fermentum massa justo sit amet 
risus.</p> -->
+          <!--   <ul> -->
+          <!--     <li>Praesent commodo cursus magna, vel scelerisque nisl 
consectetur et.</li> -->
+          <!--     <li>Donec id elit non mi porta gravida at eget metus.</li> 
-->
+          <!--     <li>Nulla vitae elit libero, a pharetra augue.</li> -->
+          <!--   </ul> -->
+          <!--   <p>Donec ullamcorper nulla non metus auctor fringilla. Nulla 
vitae elit libero, a pharetra augue.</p> -->
+          <!--   <ol> -->
+          <!--     <li>Vestibulum id ligula porta felis euismod semper.</li> 
-->
+          <!--     <li>Cum sociis natoque penatibus et magnis dis parturient 
montes, nascetur ridiculus mus.</li> -->
+          <!--     <li>Maecenas sed diam eget risus varius blandit sit amet 
non magna.</li> -->
+          <!--   </ol> -->
+          <!--   <p>Cras mattis consectetur purus sit amet fermentum. Sed 
posuere consectetur est at lobortis.</p> -->
+          <!-- </div><\!-- /.blog-post -\-> -->
+
+        </div>
+
+      </div><!-- /.row -->
+
+      
+        
+    </div><!-- /.container -->
+
+    <!-- Bootstrap core JavaScript
+    ================================================== -->
+    <!-- Placed at the end of the document so the pages load faster -->
+    <script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js"></script>
+    <script src="../../dist/js/bootstrap.min.js"></script>
+    <!-- <script src="../../assets/js/docs.min.js"></script> -->
+    <!-- IE10 viewport hack for Surface/desktop Windows 8 bug -->
+    <!-- <script 
src="../../assets/js/ie10-viewport-bug-workaround.js"></script>
+    -->
+
+    <!-- Start of StatCounter Code for Default Guide -->
+    <script type="text/javascript">
+      var sc_project=8264132; 
+      var sc_invisible=1; 
+      var sc_security="4b97fe2d"; 
+    </script>
+    <script type="text/javascript" src="http://www.statcounter.com/counter/counter.js"></script>
+    <noscript>
+      <div class="statcounter">
+        <a title="hit counter joomla" 
+           href="http://statcounter.com/joomla/"
+           target="_blank">
+          <img class="statcounter"
+               src="http://c.statcounter.com/8264132/0/4b97fe2d/1/"
+               alt="hit counter joomla" />
+        </a>
+      </div>
+    </noscript>
+    <!-- End of StatCounter Code for Default Guide -->
+  </body>
+</html>
+

http://git-wip-us.apache.org/repos/asf/incubator-joshua-site/blob/53cc3005/6.0/advanced.md
----------------------------------------------------------------------
diff --git a/6.0/advanced.md b/6.0/advanced.md
deleted file mode 100644
index 4997e73..0000000
--- a/6.0/advanced.md
+++ /dev/null
@@ -1,7 +0,0 @@
----
-layout: default6
-category: links
-title: Advanced features
----
-
-

http://git-wip-us.apache.org/repos/asf/incubator-joshua-site/blob/53cc3005/6.0/bundle.html
----------------------------------------------------------------------
diff --git a/6.0/bundle.html b/6.0/bundle.html
new file mode 100644
index 0000000..1f0ee11
--- /dev/null
+++ b/6.0/bundle.html
@@ -0,0 +1,297 @@
+<!DOCTYPE html>
+<html lang="en">
+  <head>
+    <meta charset="utf-8">
+    <meta http-equiv="X-UA-Compatible" content="IE=edge">
+    <meta name="viewport" content="width=device-width, initial-scale=1">
+    <meta name="description" content="">
+    <meta name="author" content="">
+    <link rel="icon" href="../../favicon.ico">
+
+    <title>Joshua Documentation | Building a language pack</title>
+
+    <!-- Bootstrap core CSS -->
+    <link href="/dist/css/bootstrap.min.css" rel="stylesheet">
+
+    <!-- Custom styles for this template -->
+    <link href="/joshua6.css" rel="stylesheet">
+  </head>
+
+  <body>
+
+    <div class="blog-masthead">
+      <div class="container">
+        <nav class="blog-nav">
+          <!-- <a class="blog-nav-item active" href="#">Joshua</a> -->
+          <a class="blog-nav-item" href="/">Joshua</a>
+          <!-- <a class="blog-nav-item" href="/6.0/whats-new.html">New 
features</a> -->
+          <a class="blog-nav-item" href="/language-packs/">Language packs</a>
+          <a class="blog-nav-item" href="/data/">Datasets</a>
+          <a class="blog-nav-item" href="/support/">Support</a>
+          <a class="blog-nav-item" href="/contributors.html">Contributors</a>
+        </nav>
+      </div>
+    </div>
+
+    <div class="container">
+
+      <div class="row">
+
+        <div class="col-sm-2">
+          <div class="sidebar-module">
+            <!-- <h4>About</h4> -->
+            <center>
+            <img src="/images/joshua-logo-small.png" />
+            <p>Joshua machine translation toolkit</p>
+            </center>
+          </div>
+          <hr>
+          <center>
+            <a href="/releases/current/" target="_blank"><button 
class="button">Download Joshua 6.0.5</button></a>
+            <br />
+            <a href="/releases/runtime/" target="_blank"><button 
class="button">Runtime only version</button></a>
+            <p>Released November 5, 2015</p>
+          </center>
+          <hr>
+          <!-- <div class="sidebar-module"> -->
+          <!--   <span id="download"> -->
+          <!--     <a href="http://joshua-decoder.org/downloads/joshua-6.0.tgz">Download</a> -->
+          <!--   </span> -->
+          <!-- </div> -->
+          <div class="sidebar-module">
+            <h4>Using Joshua</h4>
+            <ol class="list-unstyled">
+              <li><a href="/6.0/install.html">Installation</a></li>
+              <li><a href="/6.0/quick-start.html">Quick Start</a></li>
+            </ol>
+          </div>
+          <hr>
+          <div class="sidebar-module">
+            <h4>Building new models</h4>
+            <ol class="list-unstyled">
+              <li><a href="/6.0/pipeline.html">Pipeline</a></li>
+              <li><a href="/6.0/tutorial.html">Tutorial</a></li>
+              <li><a href="/6.0/faq.html">FAQ</a></li>
+            </ol>
+          </div>
+<!--
+          <div class="sidebar-module">
+            <h4>Phrase-based</h4>
+            <ol class="list-unstyled">
+              <li><a href="/6.0/phrase.html">Training</a></li>
+            </ol>
+          </div>
+-->
+          <hr>
+          <div class="sidebar-module">
+            <h4>Advanced</h4>
+            <ol class="list-unstyled">
+              <li><a href="/6.0/bundle.html">Building language packs</a></li>
+              <li><a href="/6.0/decoder.html">Decoder options</a></li>
+              <li><a href="/6.0/file-formats.html">File formats</a></li>
+              <li><a href="/6.0/packing.html">Packing TMs</a></li>
+              <li><a href="/6.0/large-lms.html">Building large LMs</a></li>
+            </ol>
+          </div>
+
+          <hr> 
+          <div class="sidebar-module">
+            <h4>Developer</h4>
+            <ol class="list-unstyled">              
+               <li><a href="https://github.com/joshua-decoder/joshua">Github</a></li>
+               <li><a href="http://cs.jhu.edu/~post/joshua-docs">Javadoc</a></li>
+               <li><a href="https://groups.google.com/forum/?fromgroups#!forum/joshua_developers">Mailing list</a></li>
+            </ol>
+          </div>
+
+        </div><!-- /.blog-sidebar -->
+
+        
+        <div class="col-sm-8 blog-main">
+        
+
+          <div class="blog-title">
+            <h2>Building a language pack</h2>
+          </div>
+          
+          <div class="blog-post">
+
+            <p><em>The information on this page applies to Joshua 6.0.3 and later</em>.</p>
+
+<p>Joshua distributes <a href="/language-packs">language packs</a>, which are models
+that have been trained and tuned for particular language pairs. After you
+have trained and tuned a model, you can create your own language pack with
+the provided <code class="highlighter-rouge">$JOSHUA/scripts/support/run-bundler.py</code>
+script, which gathers files from a pipeline training directory and bundles
+them together for easy distribution and release.</p>
+
+<p>The script takes just two mandatory arguments in the following order:</p>
+
+<ol>
+  <li>The path to the Joshua configuration file to base the bundle
+on. This file should contain the tuned weights from the tuning run, so
+you can use either the final tuned file from the tuning run
+(<code class="highlighter-rouge">tune/joshua.config.final</code>) or the config file from the test run
+(<code class="highlighter-rouge">test/model/joshua.config</code>).</li>
+  <li>The directory to place the language pack in. If this directory
+already exists, the script will die unless you also pass
+<code class="highlighter-rouge">--force</code>.</li>
+</ol>
+
+<p>In addition, there are a number of other arguments that may be important.</p>
+
+<ul>
+  <li>
+    <p><code class="highlighter-rouge">--root /path/to/root</code>. If file paths in the Joshua config file are
+ not absolute, you need to provide the root directory that they are relative to. If you specify a
+ tuned pipeline file (such as <code class="highlighter-rouge">tune/joshua.config.final</code> above), the
+ paths should all be absolute. If you instead provide a config file
+ from a previous run bundle (e.g., <code class="highlighter-rouge">test/model/joshua.config</code>), the
+ bundle directory containing that config file (here, <code class="highlighter-rouge">test/model</code>) is the relative root.</p>
+  </li>
+  <li>
+    <p>The config file options that are used in the pipeline are likely not
+the ones you want if you release a model. For example, the tuning
+configuration file contains options that tell Joshua to output 300
+translation candidates for each sentence (<code class="highlighter-rouge">-top-n 300</code>) and to
+include lots of detail about each translation (<code class="highlighter-rouge">-output-format '%i ||| %s ||| %f ||| %c'</code>).
+Because of this, you will want to tell the
+run bundler to change many of the config file options to be more
+geared towards human-readable output. The default copy-config options
+(set with <code class="highlighter-rouge">--copy-config-options</code>) are
+<code class="highlighter-rouge">-top-n 0 -output-format %S -mark-oovs false</code>,
+which accomplish exactly this (human readability).</p>
+  </li>
+  <li>
+    <p>A very important issue has to do with the translation model (the
+“TM”, also sometimes called the grammar or phrase table). The
+translation model can be so large that it takes a long time to
+load and to <a href="packing.html">pack</a>. To reduce this time during model
+training, the translation model is filtered against the tuning and
+testing data in the pipeline, and these filtered models are the ones
+listed in the source config files. However, when exporting a
+model for use as a language pack, you need to export the full model
+instead of the filtered one so as to maximize your coverage on new
+test data. The <code class="highlighter-rouge">--tm</code> parameter is used to accomplish this; it takes
+an argument specifying the path to the full model. If you would
+additionally like the large model to be <a href="packing.html">packed</a> (this
+is recommended; it reformats the TM so that it can be quickly loaded
+at run time), you can use <code class="highlighter-rouge">--pack-tm</code> instead. You can only pack one
+TM (but typically there is only one TM anyway). Multiple <code class="highlighter-rouge">--tm</code>
+parameters can be passed; they will replace TMs found in the config
+file in the order they are found.</p>
+  </li>
+</ul>
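+
+<p>The path handling described above can be illustrated with a minimal Python
+sketch. This is not the code of <code class="highlighter-rouge">run-bundler.py</code>; the
+function names <code class="highlighter-rouge">resolve</code> and <code class="highlighter-rouge">replace_tms</code>
+and the shape of the <code class="highlighter-rouge">tm = ... -path ...</code> config lines are assumptions made for
+illustration only. It shows relative paths being interpreted against a root, and
+replacement TMs being substituted in the order the TM lines are found.</p>

```python
import os
import re

# Assumed (hypothetical) shape of a TM line in a Joshua config file:
#   tm = thrax -owner pt -maxspan 20 -path tune/grammar.filtered.gz
TM_LINE = re.compile(r"^(tm\s*=.*-path\s+)(\S+)(.*)$")


def resolve(path, root):
    """Return path unchanged if it is absolute, else interpret it relative to root."""
    return path if os.path.isabs(path) else os.path.join(root, path)


def replace_tms(config_lines, new_tms):
    """Replace the path of the i-th TM line with the i-th entry of new_tms.

    TM lines beyond the end of new_tms are left untouched.
    """
    remaining = list(new_tms)
    out = []
    for line in config_lines:
        match = TM_LINE.match(line)
        if match and remaining:
            line = match.group(1) + remaining.pop(0) + match.group(3)
        out.append(line)
    return out


# Illustrative config contents (not taken from a real pipeline run).
config = [
    "tm = thrax -owner pt -maxspan 20 -path tune/grammar.filtered.gz",
    "tm = thrax -owner glue -maxspan -1 -path tune/grammar.glue",
    "top-n = 300",
]

print(resolve("tune/grammar.filtered.gz", "/path/to/rundir"))
for line in replace_tms(config, ["/path/to/rundir/grammar.gz"]):
    print(line)
```

+<p>With a single replacement path, only the first TM line changes; the second TM
+line and all non-TM options are passed through as they are.</p>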
+
+<p>Here is an example invocation for packing a hierarchical model using
+the final tuned Joshua config file:</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>./run-bundler.py \
+  --force --verbose \
+  /path/to/rundir/tune/joshua.config.final \
+  language-pack-YYYY-MM-DD \
+  --root /path/to/rundir \
+  --pack-tm /path/to/rundir/grammar.gz \
+  --copy-config-options \
+    '-top-n 0 -output-format %S -mark-oovs false' \
+  --server-port 5674
+</code></pre>
+</div>
+
+<p>The copy-config options tell the decoder to present just the
+single-best (<code class="highlighter-rouge">-top-n 0</code>) translated output string that has been
+heuristically capitalized (<code class="highlighter-rouge">-output-format %S</code>) and to not append
+<code class="highlighter-rouge">_OOV</code> to OOVs (<code class="highlighter-rouge">-mark-oovs false</code>).
+The <code class="highlighter-rouge">--pack-tm</code> argument tells the bundler to use the translation model
+<code class="highlighter-rouge">/path/to/rundir/grammar.gz</code> as the main translation model, packing it
+before placing it in the bundle. Note that these arguments to
+<code class="highlighter-rouge">--copy-config-options</code> are the default, so you could leave this option off entirely.
+See <a href="decoder.html">this page</a> for a longer list of decoder options.</p>
+
+<p>The following command is a slight variation for phrase-based models that
+instead takes the test-set Joshua config (the result is the same):</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>./run-bundler.py \
+  --force --verbose \
+  /path/to/rundir/test/model/joshua.config \
+  --root /path/to/rundir/test/model \
+  language-pack-YYYY-MM-DD \
+  --pack-tm /path/to/rundir/model/phrase-table.gz \
+  --server-port 5674
+</code></pre>
+</div>
+
+<p>In both cases, a new directory <code class="highlighter-rouge">language-pack-YYYY-MM-DD</code> will be
+created along with a README and a number of support files.</p>
+
+
+
+          <!--   <h4 class="blog-post-title">Welcome to Joshua!</h4> -->
+
+          <!--   <p>This blog post shows a few different types of content 
that's supported and styled with Bootstrap. Basic typography, images, and code 
are all supported.</p> -->
+          <!--   <hr> -->
+          <!--   <p>Cum sociis natoque penatibus et magnis <a href="#">dis 
parturient montes</a>, nascetur ridiculus mus. Aenean eu leo quam. Pellentesque 
ornare sem lacinia quam venenatis vestibulum. Sed posuere consectetur est at 
lobortis. Cras mattis consectetur purus sit amet fermentum.</p> -->
+          <!--   <blockquote> -->
+          <!--     <p>Curabitur blandit tempus porttitor. <strong>Nullam quis 
risus eget urna mollis</strong> ornare vel eu leo. Nullam id dolor id nibh 
ultricies vehicula ut id elit.</p> -->
+          <!--   </blockquote> -->
+          <!--   <p>Etiam porta <em>sem malesuada magna</em> mollis euismod. 
Cras mattis consectetur purus sit amet fermentum. Aenean lacinia bibendum nulla 
sed consectetur.</p> -->
+          <!--   <h2>Heading</h2> -->
+          <!--   <p>Vivamus sagittis lacus vel augue laoreet rutrum faucibus 
dolor auctor. Duis mollis, est non commodo luctus, nisi erat porttitor ligula, 
eget lacinia odio sem nec elit. Morbi leo risus, porta ac consectetur ac, 
vestibulum at eros.</p> -->
+          <!--   <h3>Sub-heading</h3> -->
+          <!--   <p>Cum sociis natoque penatibus et magnis dis parturient 
montes, nascetur ridiculus mus.</p> -->
+          <!--   <pre><code>Example code block</code></pre> -->
+          <!--   <p>Aenean lacinia bibendum nulla sed consectetur. Etiam porta 
sem malesuada magna mollis euismod. Fusce dapibus, tellus ac cursus commodo, 
tortor mauris condimentum nibh, ut fermentum massa.</p> -->
+          <!--   <h3>Sub-heading</h3> -->
+          <!--   <p>Cum sociis natoque penatibus et magnis dis parturient 
montes, nascetur ridiculus mus. Aenean lacinia bibendum nulla sed consectetur. 
Etiam porta sem malesuada magna mollis euismod. Fusce dapibus, tellus ac cursus 
commodo, tortor mauris condimentum nibh, ut fermentum massa justo sit amet 
risus.</p> -->
+          <!--   <ul> -->
+          <!--     <li>Praesent commodo cursus magna, vel scelerisque nisl 
consectetur et.</li> -->
+          <!--     <li>Donec id elit non mi porta gravida at eget metus.</li> 
-->
+          <!--     <li>Nulla vitae elit libero, a pharetra augue.</li> -->
+          <!--   </ul> -->
+          <!--   <p>Donec ullamcorper nulla non metus auctor fringilla. Nulla 
vitae elit libero, a pharetra augue.</p> -->
+          <!--   <ol> -->
+          <!--     <li>Vestibulum id ligula porta felis euismod semper.</li> 
-->
+          <!--     <li>Cum sociis natoque penatibus et magnis dis parturient 
montes, nascetur ridiculus mus.</li> -->
+          <!--     <li>Maecenas sed diam eget risus varius blandit sit amet 
non magna.</li> -->
+          <!--   </ol> -->
+          <!--   <p>Cras mattis consectetur purus sit amet fermentum. Sed 
posuere consectetur est at lobortis.</p> -->
+          <!-- </div><\!-- /.blog-post -\-> -->
+
+        </div>
+
+      </div><!-- /.row -->
+
+      
+        
+    </div><!-- /.container -->
+
+    <!-- Bootstrap core JavaScript
+    ================================================== -->
+    <!-- Placed at the end of the document so the pages load faster -->
+    <script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js"></script>
+    <script src="../../dist/js/bootstrap.min.js"></script>
+    <!-- <script src="../../assets/js/docs.min.js"></script> -->
+    <!-- IE10 viewport hack for Surface/desktop Windows 8 bug -->
+    <!-- <script 
src="../../assets/js/ie10-viewport-bug-workaround.js"></script>
+    -->
+
+    <!-- Start of StatCounter Code for Default Guide -->
+    <script type="text/javascript">
+      var sc_project=8264132; 
+      var sc_invisible=1; 
+      var sc_security="4b97fe2d"; 
+    </script>
+    <script type="text/javascript" src="http://www.statcounter.com/counter/counter.js"></script>
+    <noscript>
+      <div class="statcounter">
+        <a title="hit counter joomla" 
+           href="http://statcounter.com/joomla/"
+           target="_blank">
+          <img class="statcounter"
+               src="http://c.statcounter.com/8264132/0/4b97fe2d/1/"
+               alt="hit counter joomla" />
+        </a>
+      </div>
+    </noscript>
+    <!-- End of StatCounter Code for Default Guide -->
+  </body>
+</html>
+

http://git-wip-us.apache.org/repos/asf/incubator-joshua-site/blob/53cc3005/6.0/bundle.md
----------------------------------------------------------------------
diff --git a/6.0/bundle.md b/6.0/bundle.md
deleted file mode 100644
index f433172..0000000
--- a/6.0/bundle.md
+++ /dev/null
@@ -1,100 +0,0 @@
----
-layout: default6
-category: links
-title: Building a language pack
----
-
-*The information in this page applies to Joshua 6.0.3 and greater*.
-
-Joshua distributes [language packs](/language-packs), which are models
-that have been trained and tuned for particular language pairs. You
-can easily create your own language pack after you have trained and
-tuned a model using the provided
-`$JOSHUA/scripts/support/run-bundler.py` script, which gathers files
-from a pipeline training directory and bundles them together for easy
-distribution and release.
-
-The script takes just two mandatory arguments in the following order:
-
-1.  The path to the Joshua configuration file to base the bundle
-    on. This file should contain the tuned weights from the tuning run, so
-    you can use either the final tuned file from the tuning run
-    (`tune/joshua.config.final`) or from the test run
-    (`test/model/joshua.config`).
-1.  The directory to place the language pack in. If this directory
-    already exists, the script will die, unless you also pass `--force`.
-
-In addition, there are a number of other arguments that may be important.
-
-- `--root /path/to/root`. If file paths in the Joshua config file are
-   not absolute, you need to provide relative root. If you specify a
-   tuned pipeline file (such as `tune/joshua.config.final` above), the
-   paths should all be absolute. If you instead provide a config file
-   from a previous run bundle (e.g., `test/model/joshua.config`), the
-   bundle directory above is the relative root.
-
-- The config file options that are used in the pipeline are likely not
-  the ones you want if you release a model. For example, the tuning
-  configuration file contains options that tell Joshua to output 300
-  translation candidates for each sentence (`-top-n 300`) and to
-  include lots of detail about each translation (`-output-format '%i
-  ||| %s ||| %f ||| %c'`).  Because of this, you will want to tell the
-  run bundler to change many of the config file options to be more
-  geared towards human-readable output. The default copy-config
-  options are options are `-top-n 0 -output-format %S -mark-oovs
-  false`, which accomplishes exactly this (human readability).
-  
-- A very important issue has to do with the translation model (the
-  "TM", also sometimes called the grammar or phrase table). The
-  translation model can be very large, so that it takes a long time to
-  load and to [pack](packing.html). To reduce this time during model
-  training, the translation model is filtered against the tuning and
-  testing data in the pipeline, and these filtered models will be what
-  is listed in the source config files. However, when exporting a
-  model for use as a language pack, you need to export the full model
-  instead of the filtered one so as to maximize your coverage on new
-  test data. The `--tm` parameter is used to accomplish this; it takes
-  an argument specifying the path to the full model. If you would
-  additionally like the large model to be [packed](packing.html) (this
-  is recommended; it reformats the TM so that it can be quickly loaded
-  at run time), you can use `--pack-tm` instead. You can only pack one
-  TM (but typically there is only TM anyway). Multiple `--tm`
-  parameters can be passed; they will replace TMs found in the config
-  file in the order they are found.
-
-Here is an example invocation for packing a hierarchical model using
-the final tuned Joshua config file:
-
-    ./run-bundler.py \
-      --force --verbose \
-      /path/to/rundir/tune/joshua.config.final \
-      language-pack-YYYY-MM-DD \
-      --root /path/to/rundir \
-      --pack-tm /path/to/rundir/grammar.gz \
-      --copy-config-options \ 
-        '-top-n 1 -output-format %S -mark-oovs false' \
-      --server-port 5674
-
-The copy config options tell the decoder to present just the
-single-best (`-top-n 0`) translated output string that has been
-heuristically capitalized (`-output-format %S`), to not append `_OOV`
-to OOVs (`-mark-oovs false`), and to use the translation model
-`/path/to/rundir/grammar.gz` as the main translation model, packing it
-before placing it in the bundle. Note that these arguments to
-`--copy-config` are the default, so you could leave this off entirely.
-See [this page](decoder.html) for a longer list of decoder options.
-
-This command is a slight variation used for phrase-based models, which
-instead takes the test-set Joshua config (the result is the same):
-
-    ./run-bundler.py \
-      --force --verbose \
-      /path/to/rundir/test/model/joshua.config \
-      --root /path/to/rundir/test/model \
-      language-pack-YYYY-MM-DD \
-      --pack-tm /path/to/rundir/model/phrase-table.gz \
-      --server-port 5674
-
-In both cases, a new directory `language-pack-YYYY-MM-DD` will be
-created along with a README and a number of support files.
-

http://git-wip-us.apache.org/repos/asf/incubator-joshua-site/blob/53cc3005/6.0/decoder.html
----------------------------------------------------------------------
diff --git a/6.0/decoder.html b/6.0/decoder.html
new file mode 100644
index 0000000..45d238b
--- /dev/null
+++ b/6.0/decoder.html
@@ -0,0 +1,671 @@
+<!DOCTYPE html>
+<html lang="en">
+  <head>
+    <meta charset="utf-8">
+    <meta http-equiv="X-UA-Compatible" content="IE=edge">
+    <meta name="viewport" content="width=device-width, initial-scale=1">
+    <meta name="description" content="">
+    <meta name="author" content="">
+    <link rel="icon" href="../../favicon.ico">
+
+    <title>Joshua Documentation | Decoder configuration parameters</title>
+
+    <!-- Bootstrap core CSS -->
+    <link href="/dist/css/bootstrap.min.css" rel="stylesheet">
+
+    <!-- Custom styles for this template -->
+    <link href="/joshua6.css" rel="stylesheet">
+  </head>
+
+  <body>
+
+    <div class="blog-masthead">
+      <div class="container">
+        <nav class="blog-nav">
+          <!-- <a class="blog-nav-item active" href="#">Joshua</a> -->
+          <a class="blog-nav-item" href="/">Joshua</a>
+          <!-- <a class="blog-nav-item" href="/6.0/whats-new.html">New 
features</a> -->
+          <a class="blog-nav-item" href="/language-packs/">Language packs</a>
+          <a class="blog-nav-item" href="/data/">Datasets</a>
+          <a class="blog-nav-item" href="/support/">Support</a>
+          <a class="blog-nav-item" href="/contributors.html">Contributors</a>
+        </nav>
+      </div>
+    </div>
+
+    <div class="container">
+
+      <div class="row">
+
+        <div class="col-sm-2">
+          <div class="sidebar-module">
+            <!-- <h4>About</h4> -->
+            <center>
+            <img src="/images/joshua-logo-small.png" />
+            <p>Joshua machine translation toolkit</p>
+            </center>
+          </div>
+          <hr>
+          <center>
+            <a href="/releases/current/" target="_blank"><button 
class="button">Download Joshua 6.0.5</button></a>
+            <br />
+            <a href="/releases/runtime/" target="_blank"><button 
class="button">Runtime only version</button></a>
+            <p>Released November 5, 2015</p>
+          </center>
+          <hr>
+          <!-- <div class="sidebar-module"> -->
+          <!--   <span id="download"> -->
+          <!--     <a 
href="http://joshua-decoder.org/downloads/joshua-6.0.tgz">Download</a> -->
+          <!--   </span> -->
+          <!-- </div> -->
+          <div class="sidebar-module">
+            <h4>Using Joshua</h4>
+            <ol class="list-unstyled">
+              <li><a href="/6.0/install.html">Installation</a></li>
+              <li><a href="/6.0/quick-start.html">Quick Start</a></li>
+            </ol>
+          </div>
+          <hr>
+          <div class="sidebar-module">
+            <h4>Building new models</h4>
+            <ol class="list-unstyled">
+              <li><a href="/6.0/pipeline.html">Pipeline</a></li>
+              <li><a href="/6.0/tutorial.html">Tutorial</a></li>
+              <li><a href="/6.0/faq.html">FAQ</a></li>
+            </ol>
+          </div>
+<!--
+          <div class="sidebar-module">
+            <h4>Phrase-based</h4>
+            <ol class="list-unstyled">
+              <li><a href="/6.0/phrase.html">Training</a></li>
+            </ol>
+          </div>
+-->
+          <hr>
+          <div class="sidebar-module">
+            <h4>Advanced</h4>
+            <ol class="list-unstyled">
+              <li><a href="/6.0/bundle.html">Building language packs</a></li>
+              <li><a href="/6.0/decoder.html">Decoder options</a></li>
+              <li><a href="/6.0/file-formats.html">File formats</a></li>
+              <li><a href="/6.0/packing.html">Packing TMs</a></li>
+              <li><a href="/6.0/large-lms.html">Building large LMs</a></li>
+            </ol>
+          </div>
+
+          <hr> 
+          <div class="sidebar-module">
+            <h4>Developer</h4>
+            <ol class="list-unstyled">              
+               <li><a 
href="https://github.com/joshua-decoder/joshua">Github</a></li>
+               <li><a 
href="http://cs.jhu.edu/~post/joshua-docs">Javadoc</a></li>
+               <li><a 
href="https://groups.google.com/forum/?fromgroups#!forum/joshua_developers">Mailing
 list</a></li>              
+            </ol>
+          </div>
+
+        </div><!-- /.blog-sidebar -->
+
+        
+        <div class="col-sm-8 blog-main">
+        
+
+          <div class="blog-title">
+            <h2>Decoder configuration parameters</h2>
+          </div>
+          
+          <div class="blog-post">
+
+            <p>Joshua configuration parameters affect the runtime behavior of 
the decoder itself.  This page
+lists all of these parameters and describes how to invoke 
the decoder manually.</p>
+
+<p>To run the decoder, a convenience script is provided that loads the 
necessary Java libraries.
+Assuming you have set the environment variable <code 
class="highlighter-rouge">$JOSHUA</code> to point to the root of your 
installation,
+its syntax is:</p>
+
+<div class="highlighter-rouge"><pre 
class="highlight"><code>$JOSHUA/bin/decoder [-m memory-amount] [-c config-file 
other-joshua-options ...]
+</code></pre>
+</div>
+
+<p>The <code class="highlighter-rouge">-m</code> argument, if present, must 
come first, and the memory specification is in Java format
+(e.g., 400m, 4g, 50g).  Most notably, the suffixes “m” and “g” are 
used for “megabytes” and
+“gigabytes”, and there cannot be a space between the number and the unit.  
The value of this
+argument is passed to Java itself in the invocation of the decoder, and the 
remaining options are
+passed to Joshua.  The <code class="highlighter-rouge">-c</code> parameter has 
special import because it specifies the location of the
+configuration file.</p>
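+<p>As a rough illustration of the memory format, the following Python sketch checks a value before it is passed to <code class="highlighter-rouge">-m</code> and builds an argument list with <code class="highlighter-rouge">-m</code> first. The helper names <code class="highlighter-rouge">is_valid_memory_spec</code> and <code class="highlighter-rouge">decoder_command</code> are hypothetical and not part of Joshua, and the pattern accepts only the “m” and “g” suffixes mentioned above, which is an assumption of this sketch.</p>

```python
import os
import re
from typing import List

# Hypothetical pattern: a positive integer immediately followed by "m" or "g".
# Only the suffixes named in the text are accepted; the JVM may allow others.
MEMORY_SPEC = re.compile(r"[1-9][0-9]*[mg]")


def is_valid_memory_spec(spec: str) -> bool:
    """Return True for values such as "400m" or "4g" (no space before the unit)."""
    return MEMORY_SPEC.fullmatch(spec) is not None


def decoder_command(memory: str, config_file: str) -> List[str]:
    """Build an argument list in which -m comes first, as the script requires."""
    if not is_valid_memory_spec(memory):
        raise ValueError(f"bad memory amount: {memory!r}")
    # Assumes $JOSHUA is set; falls back to the literal string otherwise.
    decoder = os.path.join(os.environ.get("JOSHUA", "$JOSHUA"), "bin", "decoder")
    return [decoder, "-m", memory, "-c", config_file]


print(decoder_command("4g", "joshua.config"))
```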
+
+<p>The Joshua decoder works by reading from STDIN and printing translations to 
STDOUT as they are
+received, according to a number of <a href="#output">output options</a>.  If 
no run-time parameters are
+specified (e.g., no translation model), sentences are simply pushed through 
untranslated.  Blank
+lines are similarly pushed through as blank lines, so as to maintain 
parallelism with the input.</p>
+
+<p>Parameters can be provided to Joshua via a configuration file and from the 
command
+line.  Command-line arguments override values found in the configuration file. 
 The format for
+configuration file parameters is</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>parameter = value
+</code></pre>
+</div>
+
+<p>Command-line options are specified in the following format</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>-parameter value
+</code></pre>
+</div>
+
+<p>Values are one of four types (which we list here mostly to call attention 
to the boolean format):</p>
+
+<ul>
+  <li>STRING, an arbitrary string (no spaces)</li>
+  <li>FLOAT, a floating-point value</li>
+  <li>INT, an integer</li>
+  <li>
+    <p>BOOLEAN, a boolean value.  For booleans, <code 
class="highlighter-rouge">true</code> evaluates to true, and all other values 
evaluate
+to false.  For command-line options, the value may be omitted, in which case 
it evaluates to
+true.  For example, the following are equivalent:</p>
+
+    <div class="highlighter-rouge"><pre 
class="highlight"><code>$JOSHUA/bin/decoder -mark-oovs true
+$JOSHUA/bin/decoder -mark-oovs
+</code></pre>
+    </div>
+  </li>
+</ul>
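+<p>The boolean rule can be sketched in Python as follows. This is an illustration of the behavior described above, not Joshua's actual argument parser: the function names are hypothetical, and comparing against the exact lowercase string <code class="highlighter-rouge">true</code> is an assumption.</p>

```python
from typing import Dict, List


def parse_bool(value: str) -> bool:
    """Only the string "true" is true; every other value is false (assumed exact match)."""
    return value == "true"


def looks_like_option(token: str) -> bool:
    """"-mark-oovs" is an option name; "-1" or "-0.2" is a negative value."""
    return len(token) > 1 and token[0] == "-" and token[1].isalpha()


def parse_command_line(args: List[str]) -> Dict[str, str]:
    """Map option names to string values; an option with no value becomes "true"."""
    options: Dict[str, str] = {}
    i = 0
    while i < len(args):
        name = args[i].lstrip("-")
        has_value = i + 1 < len(args) and not looks_like_option(args[i + 1])
        options[name] = args[i + 1] if has_value else "true"
        i += 2 if has_value else 1
    return options


# Both invocations from the text yield the same setting.
print(parse_command_line(["-mark-oovs", "true"]) == parse_command_line(["-mark-oovs"]))
```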
+
+<h2 id="joshua-configuration-file">Joshua configuration file</h2>
+
+<p>In addition to the decoder parameters described below, the configuration 
file contains the model
+feature weights.  These weights are distinguished from runtime parameters in 
that they are delimited
+by a space instead of an equals sign. They take the following
+format, and by convention are placed at the end of the configuration file:</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>lm_0 4.23
+tm_pt_0 -0.2
+OOVPenalty -100
+</code></pre>
+</div>
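+<p>To make the distinction between the two line types concrete, here is a minimal Python sketch that separates <code class="highlighter-rouge">parameter = value</code> lines from space-delimited weights. The function name <code class="highlighter-rouge">parse_config</code> is hypothetical, and skipping lines that start with <code class="highlighter-rouge">#</code> is an assumption of this sketch rather than something this page states.</p>

```python
from typing import Dict, Iterable, List, Tuple


def parse_config(lines: Iterable[str]) -> Tuple[List[Tuple[str, str]], Dict[str, float]]:
    """Split config lines into (name, value) parameters and feature weights."""
    params: List[Tuple[str, str]] = []  # a list, since names such as "tm" may repeat
    weights: Dict[str, float] = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):  # assumed comment syntax
            continue
        if "=" in line:
            # Runtime parameter: delimited by an equals sign.
            name, value = line.split("=", 1)
            params.append((name.strip(), value.strip()))
        else:
            # Feature weight: delimited by whitespace.
            name, value = line.split()
            weights[name] = float(value)
    return params, weights


example = ["top-n = 300", "", "lm_0 4.23", "tm_pt_0 -0.2", "OOVPenalty -100"]
print(parse_config(example))
```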
+
+<p>Joshua can make use of thousands of features, which are described in 
further detail in the
+<a href="features.html">feature file</a>.</p>
+
+<h2 id="joshua-decoder-parameters">Joshua decoder parameters</h2>
+
+<p>This section contains a list of the Joshua run-time parameters.  An 
important note about the
+parameters is that they are collapsed to canonical form, in which dashes (-) 
and underscores (_) are
+removed and case is converted to lowercase.  For example, the following 
parameter forms are
+equivalent (either in the configuration file or from the command line):</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code><span 
class="p">{</span><span class="err">top-n,</span><span class="w"> </span><span 
class="err">topN,</span><span class="w"> </span><span 
class="err">top_n,</span><span class="w"> </span><span 
class="err">TOP_N,</span><span class="w"> </span><span 
class="err">t-o-p-N</span><span class="p">}</span><span class="w">
+</span><span class="p">{</span><span class="err">poplimit,</span><span 
class="w"> </span><span class="err">pop-limit,</span><span class="w"> 
</span><span class="err">pop_limit,</span><span class="w"> </span><span 
class="err">popLimit,PoPlImIt</span><span class="p">}</span><span class="w">
+</span></code></pre>
+</div>
+
+<p>This defines equivalence classes of parameter names, and relieves you 
of the task of having to
+remember the exact spelling of each parameter.</p>
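+<p>The canonical form can be sketched in a few lines of Python. The function name <code class="highlighter-rouge">canonical</code> is hypothetical; the code simply applies the rule stated above (drop dashes and underscores, then lowercase) and is not taken from the Joshua source.</p>

```python
def canonical(name: str) -> str:
    """Collapse a parameter name: drop '-' and '_', then lowercase the rest."""
    return name.replace("-", "").replace("_", "").lower()


forms = ["top-n", "topN", "top_n", "TOP_N", "t-o-p-N"]
# Every spelling collapses to the same canonical key.
print({canonical(form) for form in forms})
```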
+
+<p>In what follows, we group the configuration parameters in the following 
groups:</p>
+
+<ul>
+  <li><a href="#general">General options</a></li>
+  <li><a href="#pruning">Pruning</a></li>
+  <li><a href="#tm">Translation model options</a></li>
+  <li><a href="#lm">Language model options</a></li>
+  <li><a href="#output">Output options</a></li>
+  <li><a href="#modes">Alternate modes of operation</a></li>
+</ul>
+
+<p><a id="general"></a></p>
+
+<h3 id="general-decoder-options">General decoder options</h3>
+
+<ul>
+  <li>
+    <p><code class="highlighter-rouge">c</code>, <code 
class="highlighter-rouge">config</code> — <em>NULL</em></p>
+
+    <p>Specifies the configuration file from which Joshua options are loaded.  
This parameter is unique in
+ that it must be specified on the command line, since it names the file from which the other parameters are read.</p>
+  </li>
+  <li>
+    <p><code class="highlighter-rouge">amortize</code> — <em>true</em></p>
+
+    <p>When true, specifies that sorting of the rule lists at each trie node 
in the grammar should be
+delayed until the trie node is accessed. When false, all such nodes are sorted 
before decoding
+even begins. Setting to true results in slower per-sentence decoding, but 
allows the decoder to
+begin translating almost immediately (especially with large grammars).</p>
+  </li>
+  <li>
+    <p><code class="highlighter-rouge">server-port</code> — <em>0</em></p>
+
+    <p>If set to a nonzero value, Joshua will start a multithreaded TCP/IP 
server on the specified
+port. Clients can connect to it directly through programming APIs or 
command-line tools like
+<code class="highlighter-rouge">telnet</code> or <code 
class="highlighter-rouge">nc</code>.</p>
+
+    <div class="highlighter-rouge"><pre class="highlight"><code>$ 
$JOSHUA/bin/decoder -m 30g -c /path/to/config/file -server-port 8723
+...
+$ cat input.txt | nc localhost 8723 &gt; results.txt
+</code></pre>
+    </div>
+  </li>
+  <li>
+    <p><code class="highlighter-rouge">maxlen</code> — <em>200</em></p>
+
+    <p>Input sentences longer than this are truncated.</p>
+  </li>
+  <li>
+    <p><code class="highlighter-rouge">feature-function</code></p>
+
+    <p>Enables a particular feature function. See the <a 
href="features.html">feature function page</a> for more information.</p>
+  </li>
+  <li>
+    <p><code class="highlighter-rouge">oracle-file</code> — <em>NULL</em></p>
+
+    <p>The location of a set of oracle reference translations, parallel to the 
input.  When present,
+after producing the hypergraph by decoding the input sentence, the oracle is 
used to rescore the
+translation forest with a BLEU approximation in order to extract the 
oracle-translation from the
+forest.  This is useful for obtaining an (approximation to an) upper bound on 
your translation
+model under particular search settings.</p>
+  </li>
+  <li>
+    <p><code class="highlighter-rouge">default-nonterminal</code> — 
<em>“X”</em></p>
+
+    <p>This is the nonterminal symbol assigned to out-of-vocabulary (OOV) 
items. Joshua assigns this
+ label to every word of the input, in fact, so that even known words can be 
translated as OOVs, if
+ the model prefers them. Usually, a very low weight on the <code 
class="highlighter-rouge">OOVPenalty</code> feature discourages their
+ use unless necessary.</p>
+  </li>
+  <li>
+    <p><code class="highlighter-rouge">goal-symbol</code> — 
<em>“GOAL”</em></p>
+
+    <p>This is the symbol whose presence in the chart over the whole input 
span denotes a successful
+ parse (translation).  It should match the LHS nonterminal in your glue 
grammar.  Internally,
+ Joshua represents nonterminals enclosed in square brackets (e.g., 
“[GOAL]”), which you can
+ optionally supply in the configuration file.</p>
+  </li>
+  <li>
+    <p><code class="highlighter-rouge">true-oovs-only</code> — 
<em>false</em></p>
+
+    <p>By default, Joshua creates an OOV entry for every word in the source 
sentence, regardless of
+whether it is found in the grammar.  This allows every word to be pushed 
through untranslated
+(although potentially incurring a high cost based on the <code 
class="highlighter-rouge">OOVPenalty</code> feature).  If this option is
+set, then only true OOVs are entered into the chart as OOVs. To determine 
“true” OOVs, Joshua
+examines the first level of the grammar trie for each word of the input (this 
isn’t a perfect
+heuristic, since a word could be present only in deeper levels of the 
trie).</p>
+  </li>
+  <li>
+    <p><code class="highlighter-rouge">threads</code>, <code 
class="highlighter-rouge">num-parallel-decoders</code> — <em>1</em></p>
+
+    <p>This determines how many simultaneous decoding threads to launch.  </p>
+
+    <p>Outputs are assembled in order, and Joshua has to hold on to each 
complete target hypergraph until
+it is ready to be processed for output. With many simultaneous threads, one 
long sentence can
+cause many finished sentences to queue up behind it, increasing memory usage.  We 
have run Joshua
+with as many as 64 threads without any problems of this kind, but it is 
worth keeping in
+mind.</p>
+  </li>
+  <li>
+    <p><code class="highlighter-rouge">weights-file</code> — NULL</p>
+
+    <p>Weights are appended to the end of the Joshua configuration file, by 
convention. If you prefer to
+put them in a separate file, you can do so, and point to the file with this 
parameter.</p>
+  </li>
+</ul>
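+<p>For the <code class="highlighter-rouge">server-port</code> option described above, a client written against a programming API might look like the following Python sketch. It mirrors the <code class="highlighter-rouge">nc</code> example: send the input, signal end of input, and read until the server closes the connection. The function name <code class="highlighter-rouge">translate_via_server</code> is hypothetical, and the assumption that the server replies and then closes the connection once the client stops sending is not confirmed by this page.</p>

```python
import socket


def translate_via_server(text: str, host: str = "localhost", port: int = 8723,
                         timeout: float = 30.0) -> str:
    """Send newline-separated sentences to a listening decoder and return its reply."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(text.encode("utf-8"))
        # Signal end of input, much as nc does when its stdin is exhausted.
        conn.shutdown(socket.SHUT_WR)
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # the server closed its side (assumed behavior)
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8")
```

+<p>With a decoder started as in the <code class="highlighter-rouge">server-port</code> example, calling <code class="highlighter-rouge">translate_via_server(open("input.txt").read())</code> would play the role of the <code class="highlighter-rouge">nc</code> command.</p>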
+
+<h3 id="pruning-options-a-idpruning-">Pruning options <a id="pruning"></a></h3>
+
+<ul>
+  <li>
+    <p><code class="highlighter-rouge">pop-limit</code> — <em>100</em></p>
+
+    <p>The number of cube-pruning hypotheses that are popped from the 
candidates list for each span of
+the input.  Higher values result in a larger portion of the search space being 
explored at the
+cost of an increased search time. For exhaustive search, set <code 
class="highlighter-rouge">pop-limit</code> to 0.</p>
+  </li>
+  <li>
+    <p><code class="highlighter-rouge">filter-grammar</code> — false</p>
+
+    <p>Set to true, this enables dynamic sentence-level filtering. For each 
sentence, each grammar is
+filtered at runtime down to rules that can be applied to the sentence under 
consideration. This
+takes some time (which we haven’t thoroughly quantified), but can result in 
the removal of many
+rules that are only partially applicable to the sentence.</p>
+  </li>
+  <li><code class="highlighter-rouge">constrain-parse</code> — 
<em>false</em></li>
+  <li>
+    <p><code class="highlighter-rouge">use_pos_labels</code> — 
<em>false</em></p>
+
+    <p><em>These features are not documented.</em></p>
+  </li>
+</ul>
+
+<h3 id="translation-model-options-a-idtm-">Translation model options <a 
id="tm"></a></h3>
+
+<p>Joshua supports any number of translation models. Conventionally, two are 
supplied: the main grammar
+containing translation rules, and the glue grammar for patching things 
together. Internally, Joshua
+doesn’t distinguish between the roles of these grammars; they are treated 
differently only in that
+they typically have different span limits (the maximum input width they can be 
applied to).</p>
+
+<p>Grammars are instantiated with config file lines of the following form:</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>tm = TYPE OWNER 
SPAN_LIMIT FILE
+</code></pre>
+</div>
+
+<ul>
+  <li><code class="highlighter-rouge">TYPE</code> is the grammar type, which 
must be set to “thrax”. </li>
+  <li><code class="highlighter-rouge">OWNER</code> is the grammar’s owner, 
which defines the set of <a href="features.html">feature weights</a> that
+apply to the weights found in each line of the grammar (using different owners 
allows each grammar
+to have different sets and numbers of weights, while sharing owners allows 
weights to be shared
+across grammars).</li>
+  <li><code class="highlighter-rouge">SPAN_LIMIT</code> is the maximum span of 
the input that rules from this grammar can be applied to. A
+span limit of 0 means “no limit”, while a span limit of -1 means that 
rules from this grammar must
+be anchored to the left side of the sentence (index 0).</li>
+  <li><code class="highlighter-rouge">FILE</code> is the path to the file 
containing the grammar. If the file is a directory, it is assumed
+to be <a href="packed.html">packed</a>. Only one packed grammar can currently 
be used at a time.</li>
+</ul>
+
+<p>For reference, the following two translation model lines are used by the <a 
href="pipeline.html">pipeline</a>:</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>tm = thrax pt 20 
/path/to/packed/grammar
+tm = thrax glue -1 /path/to/glue/grammar
+</code></pre>
+</div>
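+<p>The four fields of a <code class="highlighter-rouge">tm</code> line can be pulled apart as in the Python sketch below. The names <code class="highlighter-rouge">GrammarSpec</code>, <code class="highlighter-rouge">parse_tm_line</code> and <code class="highlighter-rouge">describe_span_limit</code> are hypothetical and only illustrate the format described above.</p>

```python
from dataclasses import dataclass


@dataclass
class GrammarSpec:
    grammar_type: str  # TYPE, e.g. "thrax"
    owner: str         # OWNER, selects the set of feature weights
    span_limit: int    # SPAN_LIMIT
    path: str          # FILE (a directory is assumed to be a packed grammar)


def parse_tm_line(line: str) -> GrammarSpec:
    """Parse a line of the form "tm = TYPE OWNER SPAN_LIMIT FILE"."""
    name, _, value = line.partition("=")
    if name.strip() != "tm":
        raise ValueError(f"not a tm line: {line!r}")
    grammar_type, owner, span_limit, path = value.split(None, 3)
    return GrammarSpec(grammar_type, owner, int(span_limit), path.strip())


def describe_span_limit(limit: int) -> str:
    """Spell out the special span-limit values 0 and -1."""
    if limit == 0:
        return "no limit"
    if limit == -1:
        return "anchored to the left side of the sentence"
    return f"at most {limit} input words"


for line in ["tm = thrax pt 20 /path/to/packed/grammar",
             "tm = thrax glue -1 /path/to/glue/grammar"]:
    spec = parse_tm_line(line)
    print(spec.owner, describe_span_limit(spec.span_limit))
```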
+
+<h3 id="language-model-options-a-idlm-">Language model options <a 
id="lm"></a></h3>
+
+<p>Joshua supports any number of language models. With Joshua 6.0, these
+are just regular feature functions:</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>feature-function = 
LanguageModel -lm_file /path/to/lm/file -lm_order N -lm_type TYPE
+feature-function = StateMinimizingLanguageModel -lm_file /path/to/lm/file 
-lm_order N -lm_type TYPE
+</code></pre>
+</div>
+
+<p><code class="highlighter-rouge">LanguageModel</code> is a generic language 
model, supporting types ‘kenlm’
+(the default) and ‘berkeleylm’. <code 
class="highlighter-rouge">StateMinimizingLanguageModel</code>
+implements LM state minimization to reduce the size of context n-grams
+where appropriate
+(<a href="http://www.aclweb.org/anthology/W08-0402.pdf">Li and Khudanpur, 
2008</a>;
+<a href="https://aclweb.org/anthology/N/N13/N13-1116.pdf">Heafield et al., 
2013</a>). This
+is currently only supported by KenLM, so the <code 
class="highlighter-rouge">-lm_type</code> option is not
+available here.</p>
+
+<p>The other key/value pairs are defined as follows:</p>
+
+<ul>
+  <li><code class="highlighter-rouge">lm_type</code>: one of “kenlm” 
or “berkeleylm”</li>
+  <li><code class="highlighter-rouge">lm_order</code>: the order of the 
language model</li>
+  <li><code class="highlighter-rouge">lm_file</code>: the path to the language 
model file.  All language model
+ types support the standard ARPA format.  Additionally, if the LM
+ type is “kenlm”, this file can be compiled into KenLM’s compiled
+ format (using the program at <code 
class="highlighter-rouge">$JOSHUA/bin/build_binary</code>); if the
+ LM type is “berkeleylm”, it can be compiled by following the
+ directions in
+ <code 
class="highlighter-rouge">$JOSHUA/src/joshua/decoder/ff/lm/berkeley_lm/README</code>.
 The
+ <a href="pipeline.html">pipeline</a> will automatically compile either 
type.</li>
+</ul>
+
+<p>For each language model, you need to specify a feature weight in the 
following format:</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>lm_0 WEIGHT
+lm_1 WEIGHT
+...
+</code></pre>
+</div>
+
+<p>where the indices correspond to the order of the language model declaration 
lines.</p>
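+<p>Because each weight is tied to its declaration by position, it can help to generate both together. The Python sketch below does this for generic <code class="highlighter-rouge">LanguageModel</code> declarations; the function name <code class="highlighter-rouge">lm_config_lines</code> and the file paths in the example are hypothetical.</p>

```python
from typing import List, Sequence, Tuple


def lm_config_lines(models: Sequence[Tuple[str, int, str]],
                    weights: Sequence[float]) -> List[str]:
    """Emit one declaration per LM, then lm_0, lm_1, ... weights in the same order."""
    if len(models) != len(weights):
        raise ValueError("need exactly one weight per language model")
    lines = [
        f"feature-function = LanguageModel -lm_file {path} -lm_order {order} -lm_type {lm_type}"
        for path, order, lm_type in models
    ]
    # The index i matches the position of the declaration above.
    lines += [f"lm_{i} {weight}" for i, weight in enumerate(weights)]
    return lines


models = [("/path/to/lm/one", 5, "kenlm"), ("/path/to/lm/two", 3, "berkeleylm")]
print("\n".join(lm_config_lines(models, [4.23, 1.0])))
```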
+
+<h3 id="output-options-a-idoutput-">Output options <a id="output"></a></h3>
+
+<ul>
+  <li>
+    <p><code class="highlighter-rouge">output-format</code> <em>New in 
5.0</em></p>
+
+    <p>Joshua prints a lot of information to STDERR (making this more granular 
is on the TODO
+list). Output to STDOUT is reserved for decoder translations, and is 
controlled by the following format specifiers:</p>
+
+    <ul>
+      <li>
+        <p><code class="highlighter-rouge">%i</code>: the sentence number 
(0-indexed)</p>
+      </li>
+      <li>
+        <p><code class="highlighter-rouge">%e</code>: the source sentence</p>
+      </li>
+      <li>
+        <p><code class="highlighter-rouge">%s</code>: the translated 
sentence</p>
+      </li>
+      <li>
+        <p><code class="highlighter-rouge">%S</code>: the translated sentence, 
with some basic capitalization and denormalization, e.g.,</p>
+
+        <div class="highlighter-rouge"><pre class="highlight"><code>$ echo "¿ 
who you lookin' at , mr. ?" | $JOSHUA/bin/decoder -output-format "%S" 
-mark-oovs false 2&gt; /dev/null 
+¿Who you lookin' at, Mr.? 
+</code></pre>
+        </div>
+      </li>
+      <li>
+        <p><code class="highlighter-rouge">%t</code>: the target-side tree 
projection, all printed on one line (PTB style)</p>
+      </li>
+      <li>
+        <p><code class="highlighter-rouge">%d</code>: the synchronous 
derivation, with each rule printed indented on its own line</p>
+      </li>
+      <li>
+        <p><code class="highlighter-rouge">%f</code>: the list of feature 
values (as name=value pairs)</p>
+      </li>
+      <li>
+        <p><code class="highlighter-rouge">%c</code>: the model cost</p>
+      </li>
+      <li>
+        <p><code class="highlighter-rouge">%w</code>: the weight vector 
(unimplemented)</p>
+      </li>
+      <li>
+        <p><code class="highlighter-rouge">%a</code>: the alignments between 
source and target words (currently broken for hierarchical mode)</p>
+      </li>
+    </ul>
+
+    <p>The default value is</p>
+
+    <div class="highlighter-rouge"><pre class="highlight"><code>output-format 
= %i ||| %s ||| %f ||| %c
+</code></pre>
+    </div>
+
+    <p>i.e.,</p>
+
+    <div class="highlighter-rouge"><pre class="highlight"><code>input ID ||| 
translation ||| model scores ||| score
+</code></pre>
+    </div>
+  </li>
+  <li>
+    <p><code class="highlighter-rouge">top-n</code> — <em>300</em></p>
+
+    <p>The number of translation hypotheses to output, sorted in decreasing 
order of model score.</p>
+  </li>
+  <li>
+    <p><code class="highlighter-rouge">use-unique-nbest</code> — 
<em>true</em></p>
+
+    <p>When constructing the n-best list for a sentence, skip hypotheses whose 
string has already been
+output.</p>
+  </li>
+  <li>
+    <p><code class="highlighter-rouge">escape-trees</code> — 
<em>false</em></p>
+  </li>
+  <li>
+    <p><code class="highlighter-rouge">include-align-index</code> — 
<em>false</em></p>
+
+    <p>Output the indices of the source words that each target word aligns to.</p>
+  </li>
+  <li>
+    <p><code class="highlighter-rouge">mark-oovs</code> — <em>false</em></p>
+
+    <p>If <code class="highlighter-rouge">true</code>, this causes the text 
“_OOV” to be appended to each untranslated word in the output.</p>
+  </li>
+  <li>
+    <p><code class="highlighter-rouge">visualize-hypergraph</code> — 
<em>false</em></p>
+
+    <p>If set to true, a visualization of the hypergraph will be displayed, 
though you will have to
+explicitly include the relevant jar files.  See the example usage in
+<code class="highlighter-rouge">$JOSHUA/examples/tree_visualizer/</code>, 
which contains a demonstration of a source sentence,
+translation, and synchronous derivation.</p>
+  </li>
+  <li>
+    <p><code class="highlighter-rouge">dump-hypergraph</code> — “”</p>
+
+    <p>This option writes the hypergraph to disk for 
each input sentence. If
+set, the value should contain the string “%d”, which is replaced with the 
sentence number. For
+example,</p>
+
+    <div class="highlighter-rouge"><pre class="highlight"><code>cat input.txt 
| $JOSHUA/bin/decoder -dump-hypergraph hgs/%d.txt
+</code></pre>
+    </div>
+
+    <p>Note that the output directory must exist.</p>
+
+    <p>TODO: revive the
+<a 
href="http://aclweb.org/aclwiki/index.php?title=Hypergraph_Format">discussion 
on a common hypergraph format</a>
+on the ACL Wiki and support that format.</p>
+  </li>
+</ul>
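+<p>Downstream tools usually need to split the default output format (<code class="highlighter-rouge">%i ||| %s ||| %f ||| %c</code>) back into its fields. A minimal Python sketch follows; the names <code class="highlighter-rouge">Hypothesis</code> and <code class="highlighter-rouge">parse_output_line</code> are hypothetical, the sample line is made up for illustration, and it is an assumption that the <code class="highlighter-rouge">name=value</code> pairs are separated by spaces.</p>

```python
from typing import Dict, NamedTuple


class Hypothesis(NamedTuple):
    sentence_id: int            # %i
    translation: str            # %s
    features: Dict[str, float]  # %f
    cost: float                 # %c


def parse_output_line(line: str) -> Hypothesis:
    """Parse one line in the default "%i ||| %s ||| %f ||| %c" format."""
    sentence_id, translation, features, cost = line.rstrip("\n").split(" ||| ")
    values: Dict[str, float] = {}
    for pair in features.split():  # assumed: pairs are space-separated
        name, _, value = pair.partition("=")
        values[name] = float(value)
    return Hypothesis(int(sentence_id), translation, values, float(cost))


# Illustrative line only, not real decoder output.
print(parse_output_line("0 ||| the house ||| lm_0=-7.5 tm_pt_0=-2.0 ||| -12.25"))
```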
+
+<h3 id="lattice-decoding">Lattice decoding</h3>
+
+<p>In addition to regular sentences, Joshua can decode weighted lattices 
encoded in
+<a href="http://www.statmt.org/moses/?n=Moses.WordLattices">the PLF 
format</a>, except that path costs should
+be listed as <b>log probabilities</b> instead of probabilities.  Lattice 
decoding was originally
added by Lane Schwartz and <a href="http://www.cs.cmu.edu/~cdyer/">Chris 
Dyer</a>.</p>
+
+<p>Joshua will automatically detect whether the input sentence is a regular 
sentence (the usual case)
+or a lattice.  If a lattice, a feature will be activated that accumulates the 
cost of different
+paths through the lattice.  In this case, you need to ensure that a weight for 
this feature is
+present in <a href="decoder.html">your model file</a>. The <a 
href="pipeline.html">pipeline</a> will handle this
+automatically, or if you are doing this manually, you can add the line</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>SourcePath COST
+</code></pre>
+</div>
+
+<p>to your Joshua configuration file.    </p>
+
+<p>Lattices must be listed one per line.</p>
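+<p>If your lattices were produced with plain probabilities on the arcs, they need converting before Joshua can read them. The Python sketch below does this for one PLF line, relying on the fact that PLF is written as nested Python-style tuples of <code class="highlighter-rouge">(word, score, jump)</code>. The function names are hypothetical, and the use of the natural logarithm is an assumption, since this page does not state the base. A probability of 0 cannot be converted and raises an error.</p>

```python
import ast
import math


def quote(word: str) -> str:
    """Single-quote a word, escaping backslashes and quotes."""
    return "'" + word.replace("\\", "\\\\").replace("'", "\\'") + "'"


def plf_probs_to_logprobs(plf: str) -> str:
    """Rewrite one PLF lattice so that arc scores are log probabilities."""
    lattice = ast.literal_eval(plf)  # nested tuples: nodes -> arcs -> (word, prob, jump)
    nodes = []
    for node in lattice:
        arcs = "".join(
            f"({quote(word)},{math.log(prob)!r},{jump}),"  # natural log (assumed base)
            for word, prob, jump in node
        )
        nodes.append(f"({arcs}),")
    return "(" + "".join(nodes) + ")"


print(plf_probs_to_logprobs("((('a',1.0,1),),(('b',0.5,1),('c',0.5,1),),)"))
```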
+
+<h3 id="alternate-modes-of-operation-a-idmodes-">Alternate modes of operation 
<a id="modes"></a></h3>
+
+<p>In addition to decoding input sentences in the standard way, Joshua 
supports both <em>constrained
+decoding</em> and <em>synchronous parsing</em>. In both settings, both the 
source and target sides are provided
+as input, and the decoder finds a derivation between them.</p>
+
+<h4 id="constrained-decoding">Constrained decoding</h4>
+
+<p>To enable constrained decoding, simply append the desired target string as 
part of the input, in
+the following format:</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>source sentence 
||| target sentence
+</code></pre>
+</div>
+
+<p>Joshua will translate the source sentence constrained to the target 
sentence. There are a few
+caveats:</p>
+
+<ul>
+  <li>
+    <p>Left-state minimization cannot be enabled for the language model.</p>
+  </li>
+  <li>
+    <p>A heuristic is used to constrain the derivation (the LM state must 
match against the
+input). This is not a perfect heuristic, and sometimes results in analyses 
that are not
+perfectly constrained to the input, but have extra words.</p>
+  </li>
+</ul>
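+<p>Input in the <code class="highlighter-rouge">source ||| target</code> format is easy to assemble from two parallel files. The Python sketch below pairs lines and refuses mismatched lengths; the function name <code class="highlighter-rouge">constrained_input</code> and the example sentences are hypothetical.</p>

```python
from itertools import zip_longest
from typing import Iterable, Iterator

_MISSING = object()  # marks the shorter side when the inputs differ in length


def constrained_input(sources: Iterable[str], targets: Iterable[str]) -> Iterator[str]:
    """Yield "source ||| target" lines for constrained decoding."""
    for source, target in zip_longest(sources, targets, fillvalue=_MISSING):
        if source is _MISSING or target is _MISSING:
            raise ValueError("source and target sides differ in length")
        yield f"{source.strip()} ||| {target.strip()}"


for line in constrained_input(["el gato\n", "la casa\n"], ["the cat\n", "the house\n"]):
    print(line)
```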
+
+<h4 id="synchronous-parsing">Synchronous parsing</h4>
+
+<p>Joshua supports synchronous parsing as a two-step sequence of monolingual 
parses, as described in
+Dyer (NAACL 2010) (<a 
href="http://www.aclweb.org/anthology/N10-1033.pdf">PDF</a>). To enable 
this:</p>
+
+<ul>
+  <li>
+    <p>Set the configuration parameter <code class="highlighter-rouge">parse = 
true</code>.</p>
+  </li>
+  <li>
+    <p>Remove all language models from the configuration file.</p>
+  </li>
+  <li>
+    <p>Provide input in the following format:</p>
+
+    <div class="highlighter-rouge"><pre class="highlight"><code> source 
sentence ||| target sentence
+</code></pre>
+    </div>
+  </li>
+</ul>
+
+<p>You may also wish to display the synchronous parse tree (<code 
class="highlighter-rouge">-output-format %t</code>) and the alignment
+(<code class="highlighter-rouge">-show-align-index</code>).</p>
+
+
+
+
+        </div>
+
+      </div><!-- /.row -->
+
+      
+        
+    </div><!-- /.container -->
+
+    <!-- Bootstrap core JavaScript
+    ================================================== -->
+    <!-- Placed at the end of the document so the pages load faster -->
+    <script 
src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js"></script>
+    <script src="../../dist/js/bootstrap.min.js"></script>
+    <!-- <script src="../../assets/js/docs.min.js"></script> -->
+    <!-- IE10 viewport hack for Surface/desktop Windows 8 bug -->
+    <!-- <script 
src="../../assets/js/ie10-viewport-bug-workaround.js"></script>
+    -->
+
+    <!-- Start of StatCounter Code for Default Guide -->
+    <script type="text/javascript">
+      var sc_project=8264132; 
+      var sc_invisible=1; 
+      var sc_security="4b97fe2d"; 
+    </script>
+    <script type="text/javascript" 
src="http://www.statcounter.com/counter/counter.js"></script>
+    <noscript>
+      <div class="statcounter">
+        <a title="hit counter joomla" 
+           href="http://statcounter.com/joomla/"
+           target="_blank">
+          <img class="statcounter"
+               src="http://c.statcounter.com/8264132/0/4b97fe2d/1/"
+               alt="hit counter joomla" />
+        </a>
+      </div>
+    </noscript>
+    <!-- End of StatCounter Code for Default Guide -->
+  </body>
+</html>
+

http://git-wip-us.apache.org/repos/asf/incubator-joshua-site/blob/53cc3005/6.0/decoder.md
----------------------------------------------------------------------
diff --git a/6.0/decoder.md b/6.0/decoder.md
deleted file mode 100644
index e8dc8c9..0000000
--- a/6.0/decoder.md
+++ /dev/null
@@ -1,385 +0,0 @@
----
-layout: default6
-category: links
-title: Decoder configuration parameters
----
-
-Joshua configuration parameters affect the runtime behavior of the decoder 
itself.  This page
-describes the complete list of these parameters and describes how to invoke 
the decoder manually.
-
-To run the decoder, a convenience script is provided that loads the necessary 
Java libraries.
-Assuming you have set the environment variable `$JOSHUA` to point to the root 
of your installation,
-its syntax is:
-
-    $JOSHUA/bin/decoder [-m memory-amount] [-c config-file 
other-joshua-options ...]
-
-The `-m` argument, if present, must come first, and the memory specification 
is in Java format
-(e.g., 400m, 4g, 50g).  Most notably, the suffixes "m" and "g" are used for 
"megabytes" and
-"gigabytes", and there cannot be a space between the number and the unit.  The 
value of this
-argument is passed to Java itself in the invocation of the decoder, and the 
remaining options are
-passed to Joshua.  The `-c` parameter has special import because it specifies 
the location of the
-configuration file.
-
-The Joshua decoder works by reading from STDIN and printing translations to 
STDOUT as they are
-received, according to a number of [output options](#output).  If no run-time 
parameters are
-specified (e.g., no translation model), sentences are simply pushed through 
untranslated.  Blank
-lines are similarly pushed through as blank lines, so as to maintain 
parallelism with the input.
-
-Parameters can be provided to Joshua via a configuration file and from the 
command
-line.  Command-line arguments override values found in the configuration file. 
 The format for
-configuration file parameters is
-
-    parameter = value
-
-Command-line options are specified in the following format
-
-    -parameter value
-
-Values are one of four types (which we list here mostly to call attention to 
the boolean format):
-
-- STRING, an arbitrary string (no spaces)
-- FLOAT, a floating-point value
-- INT, an integer
-- BOOLEAN, a boolean value.  For booleans, `true` evaluates to true, and all 
other values evaluate
-  to false.  For command-line options, the value may be omitted, in which case 
it evaluates to
-  true.  For example, the following are equivalent:
-
-      $JOSHUA/bin/decoder -mark-oovs true
-      $JOSHUA/bin/decoder -mark-oovs
-
-## Joshua configuration file
-
-In addition to the decoder parameters described below, the configuration file 
contains the model
-feature weights.  These weights are distinguished from runtime parameters in 
that they are delimited
-by a space instead of an equals sign. They take the following
-format, and by convention are placed at the end of the configuration file:
-
-    lm_0 4.23
-    tm_pt_0 -0.2
-    OOVPenalty -100
-   
-Joshua can make use of thousands of features, which are described in further 
detail in the
-[feature file](features.html).
-
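As a rough illustration of the two line formats described above (`parameter = value` for runtime parameters, `name value` for feature weights), here is a minimal Python sketch of a reader for such a file. It is not Joshua's own parser: the function names are hypothetical, and treating `#` as a comment marker is an assumption.

```python
def as_bool(value):
    """Boolean rule from the text: only the string 'true' is true."""
    return value == "true"


def parse_joshua_config(text):
    """Split config text into runtime parameters and feature weights.

    Illustrative sketch only; not Joshua's actual implementation.
    """
    params = {}   # parameter name -> list of raw string values
    weights = {}  # feature name -> float weight
    for raw in text.splitlines():
        line = raw.strip()
        # Assumption: blank lines and lines starting with '#' are ignored.
        if not line or line.startswith("#"):
            continue
        if "=" in line:
            # Runtime parameter, e.g. "mark-oovs = true".
            key, value = line.split("=", 1)
            # Parameters such as `tm` may appear more than once, so keep a list.
            params.setdefault(key.strip(), []).append(value.strip())
        else:
            # Feature weight, e.g. "lm_0 4.23" (whitespace-delimited).
            name, value = line.split()
            weights[name] = float(value)
    return params, weights
```

Called on a file containing `mark-oovs = true` followed by `lm_0 4.23`, this returns the parameter under `params["mark-oovs"]` and the weight under `weights["lm_0"]`.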
-## Joshua decoder parameters
-
-This section contains a list of the Joshua run-time parameters.  An important note about the
-parameters is that they are collapsed to canonical form, in which dashes (-) and underscores (_) are
-removed and case is converted to lowercase.  For example, the following parameter forms are
-equivalent (either in the configuration file or from the command line):
-
-    {top-n, topN, top_n, TOP_N, t-o-p-N}
-    {poplimit, pop-limit, pop_limit, popLimit, PoPlImIt}
-
-This basically defines equivalence classes of parameters, and relieves you of 
the task of having to
-remember the exact format of each parameter.
-
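The canonicalization rule described above (remove dashes and underscores, then lowercase) can be sketched in a few lines of Python. The function name `canonical` is hypothetical and this is not taken from the Joshua source.

```python
def canonical(name):
    """Collapse a parameter name: drop '-' and '_', then lowercase."""
    return name.replace("-", "").replace("_", "").lower()


# Every spelling in the first equivalence class collapses to one key.
forms = ["top-n", "topN", "top_n", "TOP_N", "t-o-p-N"]
assert len({canonical(f) for f in forms}) == 1
```

Looking options up by their canonical key is what lets any of these spellings be used interchangeably.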
-In what follows, we group the configuration parameters in the following groups:
-
-- [General options](#general)
-- [Pruning](#pruning)
-- [Translation model options](#tm)
-- [Language model options](#lm)
-- [Output options](#output)
-- [Alternate modes of operation](#modes)
-
-<a id="general" />
-
-### General decoder options
-
-- `c`, `config` --- *NULL*
-
-   Specifies the configuration file from which Joshua options are loaded.  
This feature is unique in
-   that it must be specified from the command line (obviously).
-
-- `amortize` --- *true*
-
-  When true, specifies that sorting of the rule lists at each trie node in the 
grammar should be
-  delayed until the trie node is accessed. When false, all such nodes are 
sorted before decoding
-  even begins. Setting to true results in slower per-sentence decoding, but 
allows the decoder to
-  begin translating almost immediately (especially with large grammars).
-
-- `server-port` --- *0*
-
-  If set to a nonzero value, Joshua will start a multithreaded TCP/IP server 
on the specified
-  port. Clients can connect to it directly through programming APIs or 
command-line tools like
-  `telnet` or `nc`.
-  
-      $ $JOSHUA/bin/decoder -m 30g -c /path/to/config/file -server-port 8723
-      ...
-      $ cat input.txt | nc localhost 8723 > results.txt
-
-- `maxlen` --- *200*
-
-  Input sentences longer than this are truncated.
-
-- `feature-function`
-
-  Enables a particular feature function. See the [feature function 
page](features.html) for more information.
-
-- `oracle-file` --- *NULL*
-
-  The location of a set of oracle reference translations, parallel to the 
input.  When present,
-  after producing the hypergraph by decoding the input sentence, the oracle is 
used to rescore the
-  translation forest with a BLEU approximation in order to extract the 
oracle-translation from the
-  forest.  This is useful for obtaining an (approximation to an) upper bound 
on your translation
-  model under particular search settings.
-
-- `default-nonterminal` --- *"X"*
-
-   This is the nonterminal symbol assigned to out-of-vocabulary (OOV) items. 
Joshua assigns this
-   label to every word of the input, in fact, so that even known words can be 
translated as OOVs, if
-   the model prefers them. Usually, a very low weight on the `OOVPenalty` 
feature discourages their
-   use unless necessary.
-
-- `goal-symbol` --- *"GOAL"*
-
-   This is the symbol whose presence in the chart over the whole input span 
denotes a successful
-   parse (translation).  It should match the LHS nonterminal in your glue 
grammar.  Internally,
-   Joshua represents nonterminals enclosed in square brackets (e.g., 
"[GOAL]"), which you can
-   optionally supply in the configuration file.
-
-- `true-oovs-only` --- *false*
-
-  By default, Joshua creates an OOV entry for every word in the source 
sentence, regardless of
-  whether it is found in the grammar.  This allows every word to be pushed 
through untranslated
-  (although potentially incurring a high cost based on the `OOVPenalty` 
feature).  If this option is
-  set, then only true OOVs are entered into the chart as OOVs. To determine 
"true" OOVs, Joshua
-  examines the first level of the grammar trie for each word of the input 
(this isn't a perfect
-  heuristic, since a word could be present only in deeper levels of the trie).
-
-- `threads`, `num-parallel-decoders` --- *1*
-
-  This determines how many simultaneous decoding threads to launch.  
-
-  Outputs are assembled in order and Joshua has to hold on to the complete 
target hypergraph until
-  it is ready to be processed for output, so too many simultaneous threads 
could result in lots of
-  memory usage if a long sentence results in many sentences being queued up.  
We have run Joshua
-  with as many as 64 threads without any problems of this kind, but it's 
useful to keep in the back
-  of your mind.
-  
-- `weights-file` --- *NULL*
-
-  Weights are appended to the end of the Joshua configuration file, by 
convention. If you prefer to
-  put them in a separate file, you can do so, and point to the file with this 
parameter.
-
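For the `server-port` option above, a client can also be written directly against the socket API instead of going through `nc`. The following Python sketch makes several assumptions that the text does not confirm: that the server accepts one sentence per line, and that it sends its replies and then closes the connection once the client has finished sending. The function name and the default port (taken from the example above) are illustrative.

```python
import socket


def translate_via_server(sentences, host="localhost", port=8723):
    """Send one sentence per line and return the server's reply lines."""
    payload = "".join(s + "\n" for s in sentences).encode("utf-8")
    with socket.create_connection((host, port)) as sock:
        sock.sendall(payload)
        # Half-close the connection to signal end of input.
        sock.shutdown(socket.SHUT_WR)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closed its side
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8").splitlines()
```

For large inputs a real client would read replies while still sending, so that neither side blocks on a full buffer; this sketch sends everything first for simplicity.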
-### Pruning options <a id="pruning" />
-
-- `pop-limit` --- *100*
-
-  The number of cube-pruning hypotheses that are popped from the candidates 
list for each span of
-  the input.  Higher values result in a larger portion of the search space 
being explored at the
-  cost of an increased search time. For exhaustive search, set `pop-limit` to 
0.
-
-- `filter-grammar` --- *false*
-
-  Set to true, this enables dynamic sentence-level filtering. For each 
sentence, each grammar is
-  filtered at runtime down to rules that can be applied to the sentence under 
consideration. This
-  takes some time (which we haven't thoroughly quantified), but can result in 
the removal of many
-  rules that are only partially applicable to the sentence.
-
-- `constrain-parse` --- *false*
-- `use_pos_labels` --- *false*
-
-  *These features are not documented.*
-
-### Translation model options <a id="tm" />
-
-Joshua supports any number of translation models. Conventionally, two are 
supplied: the main grammar
-containing translation rules, and the glue grammar for patching things 
together. Internally, Joshua
-doesn't distinguish between the roles of these grammars; they are treated 
differently only in that
-they typically have different span limits (the maximum input width they can be 
applied to).
-
-Grammars are instantiated with config file lines of the following form:
-
-    tm = TYPE OWNER SPAN_LIMIT FILE
-
-* `TYPE` is the grammar type, which must be set to "thrax". 
-* `OWNER` is the grammar's owner, which defines the set of [feature 
weights](features.html) that
-  apply to the weights found in each line of the grammar (using different 
owners allows each grammar
-  to have different sets and numbers of weights, while sharing owners allows 
weights to be shared
-  across grammars).
-* `SPAN_LIMIT` is the maximum span of the input that rules from this grammar 
can be applied to. A
-  span limit of 0 means "no limit", while a span limit of -1 means that rules 
from this grammar must
-  be anchored to the left side of the sentence (index 0).
-* `FILE` is the path to the file containing the grammar. If the file is a 
directory, it is assumed
-  to be [packed](packed.html). Only one packed grammar can currently be used 
at a time.
-
-For reference, the following two translation model lines are used by the 
[pipeline](pipeline.html):
-
-    tm = thrax pt 20 /path/to/packed/grammar
-    tm = thrax glue -1 /path/to/glue/grammar
-
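To make the `tm = TYPE OWNER SPAN_LIMIT FILE` layout concrete, here is a small Python sketch that splits the right-hand side of such a line into its four fields and spells out the special span-limit values described above. The names `GrammarSpec`, `parse_tm_value`, and `describe_span_limit` are hypothetical and do not come from Joshua.

```python
from typing import NamedTuple


class GrammarSpec(NamedTuple):
    kind: str        # TYPE, e.g. "thrax"
    owner: str       # OWNER, selects the set of feature weights
    span_limit: int  # SPAN_LIMIT
    path: str        # FILE (a directory means a packed grammar)


def parse_tm_value(value):
    """Parse the right-hand side of a `tm = ...` line."""
    kind, owner, span_limit, path = value.split(None, 3)
    return GrammarSpec(kind, owner, int(span_limit), path)


def describe_span_limit(limit):
    """Explain a span limit using the rules given in the text."""
    if limit == 0:
        return "no limit"
    if limit == -1:
        return "anchored to the left side of the sentence (index 0)"
    return f"applies to spans of at most {limit} words"
```

For the pipeline's glue grammar line, `parse_tm_value("thrax glue -1 /path/to/glue/grammar")` yields a span limit of -1, which `describe_span_limit` reports as left-anchored.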
-### Language model options <a id="lm" />
-
-Joshua supports any number of language models. With Joshua 6.0, these
-are just regular feature functions:
-
-    feature-function = LanguageModel -lm_file /path/to/lm/file -lm_order N 
-lm_type TYPE
-    feature-function = StateMinimizingLanguageModel -lm_file /path/to/lm/file 
-lm_order N -lm_type TYPE
-
-`LanguageModel` is a generic language model, supporting types 'kenlm'
-(the default) and 'berkeleylm'. `StateMinimizingLanguageModel`
-implements LM state minimization to reduce the size of context n-grams
-where appropriate
-([Li and Khudanpur, 2008](http://www.aclweb.org/anthology/W08-0402.pdf);
-[Heafield et al., 2013](https://aclweb.org/anthology/N/N13/N13-1116.pdf)). This
-is currently only supported by KenLM, so the `-lm_type` option is not
-available here.
-
-The other key/value pairs are defined as follows:
-
-* `lm_type`: one of "kenlm" or "berkeleylm"
-* `lm_order`: the order of the language model
-* `lm_file`: the path to the language model file.  All language model
-   types support the standard ARPA format.  Additionally, if the LM
-   type is "kenlm", this file can be compiled into KenLM's compiled
-   format (using the program at `$JOSHUA/bin/build_binary`); if the
-   LM type is "berkeleylm", it can be compiled by following the
-   directions in
-   `$JOSHUA/src/joshua/decoder/ff/lm/berkeley_lm/README`. The
-   [pipeline](pipeline.html) will automatically compile either type.
-
-For each language model, you need to specify a feature weight in the following 
format:
-
-    lm_0 WEIGHT
-    lm_1 WEIGHT
-    ...
-
-where the indices correspond to the order of the language model declaration 
lines.
-
-### Output options <a id="output" />
-
-- `output-format` *New in 5.0*
-
-  Joshua prints a lot of information to STDERR (making this more granular is on the TODO
-  list). Output to STDOUT is reserved for decoder translations, and is controlled by the
-  `output-format` parameter, which accepts the following format specifiers:
-
-   - `%i`: the sentence number (0-indexed)
-
-   - `%e`: the source sentence
-
-   - `%s`: the translated sentence
-
-   - `%S`: the translated sentence, with some basic capitalization and denormalization, e.g.,
-
-         $ echo "¿ who you lookin' at , mr. ?" | $JOSHUA/bin/decoder 
-output-format "%S" -mark-oovs false 2> /dev/null 
-         ¿Who you lookin' at, Mr.? 
-
-   - `%t`: the target-side tree projection, all printed on one line (PTB style)
-   
-   - `%d`: the synchronous derivation, with each rule printed indented on its own line
-
-   - `%f`: the list of feature values (as name=value pairs)
-
-   - `%c`: the model cost
-
-   - `%w`: the weight vector (unimplemented)
-
-   - `%a`: the alignments between source and target words (currently broken 
for hierarchical mode)
-
-  The default value is
-
-      output-format = %i ||| %s ||| %f ||| %c
-      
-  i.e.,
-
-      input ID ||| translation ||| model scores ||| score
-
-- `top-n` --- *300*
-
-  The number of translation hypotheses to output, sorted in decreasing order 
of model score
-
-- `use-unique-nbest` --- *true*
-
-  When constructing the n-best list for a sentence, skip hypotheses whose 
string has already been
-  output.
-
-- `escape-trees` --- *false*
-
-- `include-align-index` --- *false*
-
-  Output the source words indices that each target word aligns to.
-
-- `mark-oovs` --- *false*
-
-  If `true`, this causes the text "_OOV" to be appended to each untranslated word in the output.
-
-- `visualize-hypergraph` --- *false*
-
-  If set to true, a visualization of the hypergraph will be displayed, though 
you will have to
-  explicitly include the relevant jar files.  See the example usage in
-  `$JOSHUA/examples/tree_visualizer/`, which contains a demonstration of a 
source sentence,
-  translation, and synchronous derivation.
-
-- `dump-hypergraph` --- *""*
-
-  This feature directs that the hypergraph should be written to disk for each 
input sentence. If
-  set, the value should contain the string "%d", which is replaced with the 
sentence number. For
-  example,
-  
-      cat input.txt | $JOSHUA/bin/decoder -dump-hypergraph hgs/%d.txt
-
-  Note that the output directory must exist.
-
-  TODO: revive the
-  [discussion on a common hypergraph 
format](http://aclweb.org/aclwiki/index.php?title=Hypergraph_Format)
-  on the ACL Wiki and support that format.
-
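Downstream scripts often need to read the decoder's output back in. The sketch below parses one line in the default `output-format` (`%i ||| %s ||| %f ||| %c`) described above. It assumes that the feature list is a whitespace-separated sequence of `name=value` pairs and that the translation itself never contains `|||`; neither detail is confirmed by the text, and the function name is hypothetical.

```python
def parse_output_line(line):
    """Parse one line in the default format: %i ||| %s ||| %f ||| %c."""
    sent_id, translation, feats, cost = [f.strip() for f in line.split("|||")]
    features = {}
    for pair in feats.split():
        # Assumption: each feature is written as name=value with no spaces.
        name, value = pair.split("=", 1)
        features[name] = float(value)
    return int(sent_id), translation, features, float(cost)
```

With `top-n` greater than 1, several lines share the same sentence number, so a caller would typically group the parsed tuples by their first element.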
-### Lattice decoding
-
-In addition to regular sentences, Joshua can decode weighted lattices encoded 
in
-[the PLF format](http://www.statmt.org/moses/?n=Moses.WordLattices), except 
that path costs should
-be listed as <b>log probabilities</b> instead of probabilities.  Lattice 
decoding was originally
-added by Lane Schwartz and [Chris Dyer](http://www.cs.cmu.edu/~cdyer/).
-
-Joshua will automatically detect whether the input sentence is a regular 
sentence (the usual case)
-or a lattice.  If a lattice, a feature will be activated that accumulates the 
cost of different
-paths through the lattice.  In this case, you need to ensure that a weight for 
this feature is
-present in [your model file](decoder.html). The [pipeline](pipeline.html) will 
handle this
-automatically, or if you are doing this manually, you can add the line
-
-    SourcePath COST
-    
-to your Joshua configuration file.    
-
-Lattices must be listed one per line.
-
-### Alternate modes of operation <a id="modes" />
-
-In addition to decoding input sentences in the standard way, Joshua supports 
both *constrained
-decoding* and *synchronous parsing*. In both settings, both the source and 
target sides are provided
-as input, and the decoder finds a derivation between them.
-
-#### Constrained decoding
-
-To enable constrained decoding, simply append the desired target string as 
part of the input, in
-the following format:
-
-    source sentence ||| target sentence
-
-Joshua will translate the source sentence constrained to the target sentence. 
There are a few
-caveats:
-
-   * Left-state minimization cannot be enabled for the language model
-
-   * A heuristic is used to constrain the derivation (the LM state must match 
against the
-     input). This is not a perfect heuristic, and sometimes results in 
analyses that are not
-     perfectly constrained to the input, but have extra words.
-
-#### Synchronous parsing
-
-Joshua supports synchronous parsing as a two-step sequence of monolingual 
parses, as described in
-Dyer (NAACL 2010) ([PDF](http://www.aclweb.org/anthology/N10-1033.pdf)). To enable this:
-
-   - Set the configuration parameter `parse = true`.
-
-   - Remove all language models from the input file 
-
-   - Provide input in the following format:
-
-          source sentence ||| target sentence
-
-You may also wish to display the synchronous parse tree (`-output-format %t`) and the alignment
-(`-show-align-index`).
-
