Modified: lucy/site/trunk/content/docs/test/Lucy/Docs/DevGuide.mdtext
URL: 
http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/test/Lucy/Docs/DevGuide.mdtext?rev=1732471&r1=1732470&r2=1732471&view=diff
==============================================================================
--- lucy/site/trunk/content/docs/test/Lucy/Docs/DevGuide.mdtext (original)
+++ lucy/site/trunk/content/docs/test/Lucy/Docs/DevGuide.mdtext Fri Feb 26 
12:52:25 2016
@@ -3,32 +3,46 @@ Title: Lucy::Docs::DevGuide - Apache Luc
 <div>
 <a name='___top' class='dummyTopAnchor' ></a>
 
-<h1><a class='u' href='#___top' title='click to go to top of document'
+<h2><a class='u'
 name="NAME"
->NAME</a></h1>
+>NAME</a></h2>
 
 <p>Lucy::Docs::DevGuide - Quick-start guide to hacking on Apache Lucy.</p>
 
-<h1><a class='u' href='#___top' title='click to go to top of document'
+<h2><a class='u'
 name="DESCRIPTION"
->DESCRIPTION</a></h1>
+>DESCRIPTION</a></h2>
 
 <p>The Apache Lucy code base is organized into roughly four layers:</p>
 
-<pre>   * Charmonizer - compiler and OS configuration probing.
-   * Clownfish - header files.
-   * C - implementation files.
-   * Host - binding language.</pre>
+<ul>
+<li>Charmonizer - compiler and OS configuration probing.</li>
 
-<p>Charmonizer is a configuration prober which writes a single header file, 
&#34;charmony.h&#34;, describing the build environment and facilitating 
cross-platform development. It&#39;s similar to Autoconf or Metaconfig, but 
written in pure C.</p>
+<li>Clownfish - header files.</li>
 
-<p>The &#34;.cfh&#34; files within the Lucy core are Clownfish header files. 
Clownfish is a purpose-built, declaration-only language which superimposes a 
single-inheritance object model on top of C which is specifically designed to 
co-exist happily with a variety of &#34;host&#34; languages and to allow limited 
run-time dynamic subclassing. For more information see the Clownfish docs, but 
if there&#39;s one thing you should know about Clownfish OO before you start 
hacking, it&#39;s that method calls are differentiated from functions by 
capitalization:</p>
+<li>C - implementation files.</li>
+
+<li>Host - binding language.</li>
+</ul>
+
+<p>Charmonizer is a configuration prober which writes a single header file,
+&#8220;charmony.h&#8221;,
+describing the build environment and facilitating cross-platform development.
+It&#8217;s similar to Autoconf or Metaconfig,
+but written in pure C.</p>
+
+<p>The &#8220;.cfh&#8221; files within the Lucy core are Clownfish header 
files.
+Clownfish is a purpose-built,
+declaration-only language which superimposes a single-inheritance object model 
on top of C which is specifically designed to co-exist happily with a variety of 
&#8220;host&#8221; languages and to allow limited run-time dynamic subclassing.
+For more information see the Clownfish docs,
+but if there&#8217;s one thing you should know about Clownfish OO before you 
start hacking,
+it&#8217;s that method calls are differentiated from functions by 
capitalization:</p>
 
 <pre>    Indexer_Add_Doc   &#60;-- Method, typically uses dynamic dispatch.
     Indexer_add_doc   &#60;-- Function, always a direct invocation.</pre>
 
-<p>The C files within the Lucy core are where most of Lucy&#39;s low-level 
functionality lies. They implement the interface defined by the Clownfish 
header files.</p>
+<p>The C files within the Lucy core are where most of Lucy&#8217;s low-level 
functionality lies. They implement the interface defined by the Clownfish 
header files.</p>
 
-<p>The C core is intentionally left incomplete, however; to be usable, it must 
be bound to a &#34;host&#34; language. (In this context, even C is considered a 
&#34;host&#34; which must implement the missing pieces and be &#34;bound&#34; 
to the core.) Some of the binding code is autogenerated by Clownfish on a spec 
customized for each language. Other pieces are hand-coded in either C (using 
the host&#39;s C API) or the host language itself.</p>
+<p>The C core is intentionally left incomplete, however; to be usable, it must 
be bound to a &#8220;host&#8221; language. (In this context, even C is 
considered a &#8220;host&#8221; which must implement the missing pieces and be 
&#8220;bound&#8221; to the core.) Some of the binding code is autogenerated by 
Clownfish on a spec customized for each language. Other pieces are hand-coded 
in either C (using the host&#8217;s C API) or the host language itself.</p>
 
 </div>
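
The capitalization convention in the DevGuide hunk above (capitalized name for a dynamically dispatched method, lowercase name for a direct function call) can be illustrated by analogy. The Python sketch below is not Clownfish code: the class names, the `{}` placeholder document, and the returned strings are all invented for illustration, and it says nothing about how Clownfish implements dispatch internally.

```python
class Indexer:
    def add_doc(self, doc):
        return "Indexer implementation"


class CustomIndexer(Indexer):
    # Hypothetical host-language subclass that overrides the method.
    def add_doc(self, doc):
        return "CustomIndexer override"


def Indexer_Add_Doc(indexer, doc):
    """Analog of a method call: resolved through the object's class at run time."""
    return indexer.add_doc(doc)


def Indexer_add_doc(indexer, doc):
    """Analog of a function call: always runs Indexer's own implementation."""
    return Indexer.add_doc(indexer, doc)


obj = CustomIndexer()
dynamic_result = Indexer_Add_Doc(obj, {})   # picks up the subclass override
direct_result = Indexer_add_doc(obj, {})    # bypasses the override
```

The point of the analogy is only that the two spellings can give different results once a subclass overrides the method.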

Modified: lucy/site/trunk/content/docs/test/Lucy/Docs/DocIDs.mdtext
URL: 
http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/test/Lucy/Docs/DocIDs.mdtext?rev=1732471&r1=1732470&r2=1732471&view=diff
==============================================================================
--- lucy/site/trunk/content/docs/test/Lucy/Docs/DocIDs.mdtext (original)
+++ lucy/site/trunk/content/docs/test/Lucy/Docs/DocIDs.mdtext Fri Feb 26 
12:52:25 2016
@@ -3,19 +3,19 @@ Title: Lucy::Docs::DocIDs - Apache Lucy
 <div>
 <a name='___top' class='dummyTopAnchor' ></a>
 
-<h1><a class='u' href='#___top' title='click to go to top of document'
+<h2><a class='u'
 name="NAME"
->NAME</a></h1>
+>NAME</a></h2>
 
 <p>Lucy::Docs::DocIDs - Characteristics of Apache Lucy document ids.</p>
 
-<h1><a class='u' href='#___top' title='click to go to top of document'
+<h2><a class='u'
 name="DESCRIPTION"
->DESCRIPTION</a></h1>
+>DESCRIPTION</a></h2>
 
-<h2><a class='u' href='#___top' title='click to go to top of document'
+<h3><a class='u'
 name="Document_ids_are_signed_32-bit_integers"
->Document ids are signed 32-bit integers</a></h2>
+>Document ids are signed 32-bit integers</a></h3>
 
 <p>Document ids in Apache Lucy start at 1.
 Because 0 is never a valid doc id,
@@ -25,9 +25,9 @@ we can use it as a sentinel value:</p>
         ...
     }</pre>
 
-<h2><a class='u' href='#___top' title='click to go to top of document'
+<h3><a class='u'
 name="Document_ids_are_ephemeral"
->Document ids are ephemeral</a></h2>
+>Document ids are ephemeral</a></h3>
 
 <p>The document ids used by Lucy are associated with a single index snapshot. 
The moment an index is updated, the mapping of document ids to documents is 
subject to change.</p>
 

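The sentinel convention in the DocIDs hunk above (doc ids start at 1, so 0 can signal "no more documents") can be sketched as follows. This is a Python illustration under stated assumptions: `DocIdIterator` and `collect` are hypothetical names, not part of the Lucy API.

```python
class DocIdIterator:
    """Hypothetical iterator over doc ids; next() returns 0 when exhausted."""

    def __init__(self, doc_ids):
        self._ids = iter(doc_ids)

    def next(self):
        # Valid doc ids are >= 1, so 0 is free to act as the sentinel.
        return next(self._ids, 0)


def collect(iterator):
    """Drain the iterator, stopping at the 0 sentinel."""
    hits = []
    while (doc_id := iterator.next()) != 0:
        hits.append(doc_id)
    return hits
```

Because 0 never identifies a real document, the loop needs no separate "has next" call or exception handling.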
Modified: lucy/site/trunk/content/docs/test/Lucy/Docs/FileFormat.mdtext
URL: 
http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/test/Lucy/Docs/FileFormat.mdtext?rev=1732471&r1=1732470&r2=1732471&view=diff
==============================================================================
--- lucy/site/trunk/content/docs/test/Lucy/Docs/FileFormat.mdtext (original)
+++ lucy/site/trunk/content/docs/test/Lucy/Docs/FileFormat.mdtext Fri Feb 26 
12:52:25 2016
@@ -3,15 +3,15 @@ Title: Lucy::Docs::FileFormat - Apache L
 <div>
 <a name='___top' class='dummyTopAnchor' ></a>
 
-<h1><a class='u' href='#___top' title='click to go to top of document'
+<h2><a class='u'
 name="NAME"
->NAME</a></h1>
+>NAME</a></h2>
 
-<p>Lucy::Docs::FileFormat - Overview of index file format.</p>
+<p>Lucy::Docs::FileFormat - Overview of index file format</p>
 
-<h1><a class='u' href='#___top' title='click to go to top of document'
-name="OVERVIEW"
->OVERVIEW</a></h1>
+<h2><a class='u'
+name="DESCRIPTION"
+>DESCRIPTION</a></h2>
 
 <p>It is not necessary to understand the current implementation details of the 
index file format in order to use Apache Lucy effectively,
 but it may be helpful if you are interested in tweaking for high performance,
@@ -20,7 +20,7 @@ or debugging and development.</p>
 
 <p>On a file system,
 an index is a directory.
-The files inside have a hierarchical relationship: an index is made up of 
&#34;segments&#34;,
+The files inside have a hierarchical relationship: an index is made up of 
&#8220;segments&#8221;,
 each of which is an independent inverted index with its own subdirectory; each 
segment is made up of several component parts.</p>
 
 <pre>    [index]--|
@@ -50,72 +50,72 @@ each of which is an independent inverted
              |
              |--[...]--| </pre>
 
-<h1><a class='u' href='#___top' title='click to go to top of document'
+<h3><a class='u'
 name="Write-once_philosophy"
->Write-once philosophy</a></h1>
+>Write-once philosophy</a></h3>
 
-<p>All segment directory names consist of the string &#34;seg_&#34; followed 
by a number in base 36: seg_1, seg_5m, seg_p9s2 and so on, with higher numbers 
indicating more recent segments. Once a segment is finished and committed, its 
name is never re-used and its files are never modified.</p>
+<p>All segment directory names consist of the string &#8220;seg_&#8221; 
followed by a number in base 36: seg_1, seg_5m, seg_p9s2 and so on, with higher 
numbers indicating more recent segments. Once a segment is finished and 
committed, its name is never re-used and its files are never modified.</p>
 
 <p>Old segments become obsolete and can be removed when their data has been 
consolidated into new segments during the process of segment merging and 
optimization. A fully-optimized index has only one segment.</p>
 
-<h1><a class='u' href='#___top' title='click to go to top of document'
+<h3><a class='u'
 name="Top-level_entries"
->Top-level entries</a></h1>
+>Top-level entries</a></h3>
 
-<p>There are a handful of &#34;top-level&#34; files and directories which 
belong to the entire index rather than to a particular segment.</p>
+<p>There are a handful of &#8220;top-level&#8221; files and directories which 
belong to the entire index rather than to a particular segment.</p>
 
-<h2><a class='u' href='#___top' title='click to go to top of document'
+<h4><a class='u'
 name="snapshot_XXX.json"
->snapshot_XXX.json</a></h2>
+>snapshot_XXX.json</a></h4>
 
-<p>A &#34;snapshot&#34; file, e.g. <code>snapshot_m7p.json</code>, is a list of 
index files and directories. Because index files, once written, are never 
modified, the list of entries in a snapshot defines a point-in-time view of the 
data in an index.</p>
+<p>A &#8220;snapshot&#8221; file, e.g. <code>snapshot_m7p.json</code>, is a list 
of index files and directories. Because index files, once written, are never 
modified, the list of entries in a snapshot defines a point-in-time view of the 
data in an index.</p>
 
-<p>Like segment directories, snapshot files also utilize the 
unique-base-36-number naming convention; the higher the number, the more recent 
the file. The appearance of a new snapshot file within the index directory 
constitutes an index update. While a new segment is being written new files may 
be added to the index directory, but until a new snapshot file gets written, a 
Searcher opening the index for reading won&#39;t know about them.</p>
+<p>Like segment directories, snapshot files also utilize the 
unique-base-36-number naming convention; the higher the number, the more recent 
the file. The appearance of a new snapshot file within the index directory 
constitutes an index update. While a new segment is being written new files may 
be added to the index directory, but until a new snapshot file gets written, a 
Searcher opening the index for reading won&#8217;t know about them.</p>
 
-<h2><a class='u' href='#___top' title='click to go to top of document'
+<h4><a class='u'
 name="schema_XXX.json"
->schema_XXX.json</a></h2>
+>schema_XXX.json</a></h4>
 
-<p>The schema file is a Schema object describing the index&#39;s format, 
serialized as JSON. It, too, is versioned, and a given snapshot file will 
reference one and only one schema file.</p>
+<p>The schema file is a Schema object describing the index&#8217;s format, 
serialized as JSON. It, too, is versioned, and a given snapshot file will 
reference one and only one schema file.</p>
 
-<h2><a class='u' href='#___top' title='click to go to top of document'
+<h4><a class='u'
 name="locks"
->locks</a></h2>
+>locks</a></h4>
 
 <p>By default, only one indexing process may safely modify the index at any 
given time. Processes reserve an index by laying claim to the 
<code>write.lock</code> file within the <code>locks/</code> directory. A 
smattering of other lock files may be used from time to time, as well.</p>
 
-<h1><a class='u' href='#___top' title='click to go to top of document'
-name="A_segment&#39;s_component_parts"
->A segment&#39;s component parts</a></h1>
+<h3><a class='u'
+name="A_segment(8217)s_component_parts"
+>A segment&#8217;s component parts</a></h3>
 
-<p>By default, each segment has up to five logical components: lexicon, 
postings, document storage, highlight data, and deletions. Binary data from 
these components gets stored in virtual files within the &#34;cf.dat&#34; 
compound file; metadata is stored in a shared &#34;segmeta.json&#34; file.</p>
+<p>By default, each segment has up to five logical components: lexicon, 
postings, document storage, highlight data, and deletions. Binary data from 
these components gets stored in virtual files within the &#8220;cf.dat&#8221; 
compound file; metadata is stored in a shared &#8220;segmeta.json&#8221; 
file.</p>
 
-<h2><a class='u' href='#___top' title='click to go to top of document'
+<h4><a class='u'
 name="segmeta.json"
->segmeta.json</a></h2>
+>segmeta.json</a></h4>
 
 <p>The segmeta.json file is a central repository for segment metadata. In 
addition to information such as document counts and field numbers, it also 
warehouses arbitrary metadata on behalf of individual index components.</p>
 
-<h2><a class='u' href='#___top' title='click to go to top of document'
+<h4><a class='u'
 name="Lexicon"
->Lexicon</a></h2>
+>Lexicon</a></h4>
 
-<p>Each indexed field gets its own lexicon in each segment. The exact files 
involved depend on the field&#39;s type, but generally speaking there will be 
two parts. First, there&#39;s a primary <code>lexicon-XXX.dat</code> file which 
houses a complete term list associating terms with corpus frequency statistics, 
postings file locations, etc. Second, one or more &#34;lexicon index&#34; files 
may be present which contain periodic samples from the primary lexicon file to 
facilitate fast lookups.</p>
+<p>Each indexed field gets its own lexicon in each segment. The exact files 
involved depend on the field&#8217;s type, but generally speaking there will be 
two parts. First, there&#8217;s a primary <code>lexicon-XXX.dat</code> file 
which houses a complete term list associating terms with corpus frequency 
statistics, postings file locations, etc. Second, one or more &#8220;lexicon 
index&#8221; files may be present which contain periodic samples from the 
primary lexicon file to facilitate fast lookups.</p>
 
-<h2><a class='u' href='#___top' title='click to go to top of document'
+<h4><a class='u'
 name="Postings"
->Postings</a></h2>
+>Postings</a></h4>
 
-<p>&#34;Posting&#34; is a technical term from the field of <a 
href="../../Lucy/Docs/IRTheory.html" class="podlinkpod"
->information retrieval</a>, defined as a single instance of one term 
indexing one document. If you are looking at the index in the back of a book, 
and you see that &#34;freedom&#34; is referenced on pages 8, 86, and 240, that 
would be three postings, which taken together form a &#34;posting list&#34;. 
The same terminology applies to an index in electronic form.</p>
+<p>&#8220;Posting&#8221; is a technical term from the field of <a 
href="../../Lucy/Docs/IRTheory.html" class="podlinkpod"
+>information retrieval</a>, defined as a single instance of one term 
indexing one document. If you are looking at the index in the back of a book, 
and you see that &#8220;freedom&#8221; is referenced on pages 8, 86, and 240, 
that would be three postings, which taken together form a &#8220;posting 
list&#8221;. The same terminology applies to an index in electronic form.</p>
 
 <p>Each segment has one postings file per indexed field. When a search is 
performed for a single term, first that term is looked up in the lexicon. If 
the term exists in the segment, the record in the lexicon will contain 
information about which postings file to look at and where to look.</p>
 
-<p>The first thing any posting record tells you is a document id. By iterating 
over all the postings associated with a term, you can find all the documents 
that match that term, a process which is analogous to looking up page numbers 
in a book&#39;s index. However, each posting record typically contains other 
information in addition to document id, e.g. the positions at which the term 
occurs within the field.</p>
+<p>The first thing any posting record tells you is a document id. By iterating 
over all the postings associated with a term, you can find all the documents 
that match that term, a process which is analogous to looking up page numbers 
in a book&#8217;s index. However, each posting record typically contains other 
information in addition to document id, e.g. the positions at which the term 
occurs within the field.</p>
 
-<h2><a class='u' href='#___top' title='click to go to top of document'
+<h4><a class='u'
 name="Documents"
->Documents</a></h2>
+>Documents</a></h4>
 
 <p>The document storage section is a simple database, organized into two 
files:</p>
 
@@ -125,48 +125,48 @@ name="Documents"
 <li><b>documents.ix</b> - Document storage index, a solid array of 64-bit 
integers where each integer location corresponds to a document id, and the 
value at that location points at a file position in the documents.dat file.</li>
 </ul>
 
-<h2><a class='u' href='#___top' title='click to go to top of document'
+<h4><a class='u'
 name="Highlight_data"
->Highlight data</a></h2>
+>Highlight data</a></h4>
 
 <p>The files which store data used for excerpting and highlighting are 
organized similarly to the files used to store documents.</p>
 
 <ul>
 <li><b>highlight.dat</b> - Chunks of serialized highlight data, one per doc 
id.</li>
 
-<li><b>highlight.ix</b> - Highlight data index -- as with the 
<code>documents.ix</code> file, a solid array of 64-bit file pointers.</li>
+<li><b>highlight.ix</b> - Highlight data index &#8211; as with the 
<code>documents.ix</code> file, a solid array of 64-bit file pointers.</li>
 </ul>
 
-<h2><a class='u' href='#___top' title='click to go to top of document'
+<h4><a class='u'
 name="Deletions"
->Deletions</a></h2>
+>Deletions</a></h4>
 
-<p>When a document is &#34;deleted&#34; from a segment, it is not actually 
purged right away; it is merely marked as &#34;deleted&#34; via a deletions 
file. Deletions files contain bit vectors with one bit for each document in 
the segment; if bit #254 is set then document 254 is deleted, and if that 
document turns up in a search it will be masked out.</p>
+<p>When a document is &#8220;deleted&#8221; from a segment, it is not actually 
purged right away; it is merely marked as &#8220;deleted&#8221; via a deletions 
file. Deletions files contain bit vectors with one bit for each document in 
the segment; if bit #254 is set then document 254 is deleted, and if that 
document turns up in a search it will be masked out.</p>
 
-<p>It is only when a segment&#39;s contents are rewritten to a new segment 
during the segment-merging process that deleted documents truly go away.</p>
+<p>It is only when a segment&#8217;s contents are rewritten to a new segment 
during the segment-merging process that deleted documents truly go away.</p>
 
-<h1><a class='u' href='#___top' title='click to go to top of document'
+<h3><a class='u'
 name="Compound_Files"
->Compound Files</a></h1>
+>Compound Files</a></h3>
 
-<p>If you peer inside an index directory, you won&#39;t actually find any 
files named &#34;documents.dat&#34;, &#34;highlight.ix&#34;, etc. unless there 
is an indexing process underway. What you will find instead is one 
&#34;cf.dat&#34; and one &#34;cfmeta.json&#34; file per segment.</p>
+<p>If you peer inside an index directory, you won&#8217;t actually find any 
files named &#8220;documents.dat&#8221;, &#8220;highlight.ix&#8221;, etc. 
unless there is an indexing process underway. What you will find instead is one 
&#8220;cf.dat&#8221; and one &#8220;cfmeta.json&#8221; file per segment.</p>
 
-<p>To minimize the need for file descriptors at search-time, all per-segment 
binary data files are concatenated together in &#34;cf.dat&#34; at the close of 
each indexing session. Information about where each file begins and ends is 
stored in <code>cfmeta.json</code>. When the segment is opened for reading, a 
single file descriptor per &#34;cf.dat&#34; file can be shared among several 
readers.</p>
+<p>To minimize the need for file descriptors at search-time, all per-segment 
binary data files are concatenated together in &#8220;cf.dat&#8221; at the 
close of each indexing session. Information about where each file begins and 
ends is stored in <code>cfmeta.json</code>. When the segment is opened for 
reading, a single file descriptor per &#8220;cf.dat&#8221; file can be shared 
among several readers.</p>
 
-<h1><a class='u' href='#___top' title='click to go to top of document'
+<h3><a class='u'
 name="A_Typical_Search"
->A Typical Search</a></h1>
+>A Typical Search</a></h3>
 
-<p>Here&#39;s a simplified narrative, dramatizing how a search for 
&#34;freedom&#34; against a given segment plays out:</p>
+<p>Here&#8217;s a simplified narrative, dramatizing how a search for 
&#8220;freedom&#8221; against a given segment plays out:</p>
 
-<ol>
-<li>The searcher asks the relevant Lexicon Index, &#34;Do you know anything 
about &#39;freedom&#39;?&#34; Lexicon Index replies, &#34;Can&#39;t say for 
sure, but if the main Lexicon file does, &#39;freedom&#39; is probably 
somewhere around byte 21008&#34;.</li>
+<ul>
+<li>The searcher asks the relevant Lexicon Index, &#8220;Do you know anything 
about &#8216;freedom&#8217;?&#8221; Lexicon Index replies, &#8220;Can&#8217;t 
say for sure, but if the main Lexicon file does, &#8216;freedom&#8217; is 
probably somewhere around byte 21008&#8221;.</li>
 
-<li>The main Lexicon tells the searcher &#34;One moment, let me scan our 
records... Yes, we have 2 documents which contain &#39;freedom&#39;. You&#39;ll 
find them in seg_6/postings-4.dat starting at byte 66991.&#34;</li>
+<li>The main Lexicon tells the searcher &#8220;One moment, let me scan our 
records&#8230; Yes, we have 2 documents which contain &#8216;freedom&#8217;. 
You&#8217;ll find them in seg_6/postings-4.dat starting at byte 
66991.&#8221;</li>
 
-<li>The Postings file says &#34;Yep, we have &#39;freedom&#39;, all right! 
Document id 40 has 1 &#39;freedom&#39;, and document 44 has 8. If you need to 
know more, like if any &#39;freedom&#39; is part of the phrase &#39;freedom of 
speech&#39;, ask me about positions!&#34;</li>
+<li>The Postings file says &#8220;Yep, we have &#8216;freedom&#8217;, all 
right! Document id 40 has 1 &#8216;freedom&#8217;, and document 44 has 8. If 
you need to know more, like if any &#8216;freedom&#8217; is part of the phrase 
&#8216;freedom of speech&#8217;, ask me about positions!&#8221;</li>
 
-<li>If the searcher is only looking for &#39;freedom&#39; in isolation, 
that&#39;s where it stops. It now knows enough to assign the documents scores 
against &#34;freedom&#34;, with the 8-freedom document likely ranking higher 
than the single-freedom document.</li>
-</ol>
+<li>If the searcher is only looking for &#8216;freedom&#8217; in isolation, 
that&#8217;s where it stops. It now knows enough to assign the documents scores 
against &#8220;freedom&#8221;, with the 8-freedom document likely ranking 
higher than the single-freedom document.</li>
+</ul>
 
 </div>
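
The base-36 naming convention for segments and snapshots (seg_1, seg_5m, seg_p9s2, with higher numbers meaning more recent) can be sketched as below. This is a Python illustration, not Lucy code: the helper names are hypothetical, and lowercase digits are assumed because that is what the examples in the text use.

```python
import string

# Digits 0-9 followed by a-z, matching the lowercase names shown in the text.
DIGITS = string.digits + string.ascii_lowercase


def to_base36(n):
    """Render a non-negative integer in base 36."""
    if n < 0:
        raise ValueError("segment numbers are non-negative")
    if n == 0:
        return "0"
    out = []
    while n:
        n, rem = divmod(n, 36)
        out.append(DIGITS[rem])
    return "".join(reversed(out))


def segment_number(name):
    """Extract the numeric value from a name such as 'seg_5m'."""
    prefix = "seg_"
    if not name.startswith(prefix):
        raise ValueError(f"not a segment directory name: {name!r}")
    return int(name[len(prefix):], 36)


def newest_segment(names):
    """Pick the most recent segment by numeric value, not by string order."""
    return max(names, key=segment_number)
```

Comparing the decoded numbers matters: as plain strings, "seg_9" sorts after "seg_10", even though "10" in base 36 is 36 and therefore the more recent segment.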

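Two of the on-disk structures described in the FileFormat hunks, the `documents.ix` array of 64-bit file positions and the deletions bit vector, reduce to simple arithmetic on a doc id. The Python sketch below makes several assumptions that the text does not confirm: big-endian byte order, slot 0 of the index being unused because doc ids start at 1, and bit `n` living in byte `n // 8` at mask `1 << (n % 8)`. All function names are hypothetical.

```python
import struct


def build_offset_index(offsets):
    """Pack file positions as consecutive unsigned 64-bit integers (big-endian assumed)."""
    return struct.pack(f">{len(offsets)}Q", *offsets)


def lookup_offset(index_bytes, doc_id):
    """Return the file position stored in the slot belonging to doc_id."""
    (position,) = struct.unpack_from(">Q", index_bytes, doc_id * 8)
    return position


def is_deleted(bit_vector, doc_id):
    """Report whether the bit for doc_id is set; ids past the end count as live."""
    byte_index, bit = divmod(doc_id, 8)
    if byte_index >= len(bit_vector):
        return False
    return bool(bit_vector[byte_index] & (1 << bit))


# Slot 0 is a placeholder; doc 1 starts at byte 0, doc 2 at 17, doc 3 at 112.
index_bytes = build_offset_index([0, 0, 17, 112])

# Mark document 254 as deleted, as in the example from the text.
deletions = bytearray(32)
deletions[254 // 8] |= 1 << (254 % 8)
```

Since every slot has the same width, finding a document's data costs one seek into the index and one seek into the data file, regardless of how many documents the segment holds.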
Modified: lucy/site/trunk/content/docs/test/Lucy/Docs/FileLocking.mdtext
URL: 
http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/test/Lucy/Docs/FileLocking.mdtext?rev=1732471&r1=1732470&r2=1732471&view=diff
==============================================================================
--- lucy/site/trunk/content/docs/test/Lucy/Docs/FileLocking.mdtext (original)
+++ lucy/site/trunk/content/docs/test/Lucy/Docs/FileLocking.mdtext Fri Feb 26 
12:52:25 2016
@@ -3,26 +3,78 @@ Title: Lucy::Docs::FileLocking - Apache
 <div>
 <a name='___top' class='dummyTopAnchor' ></a>
 
-<h1><a class='u' href='#___top' title='click to go to top of document'
+<h2><a class='u'
 name="NAME"
->NAME</a></h1>
+>NAME</a></h2>
 
 <p>Lucy::Docs::FileLocking - Manage indexes on shared volumes.</p>
 
-<h1><a class='u' href='#___top' title='click to go to top of document'
-name="SYNOPSIS"
->SYNOPSIS</a></h1>
+<h2><a class='u'
+name="DESCRIPTION"
+>DESCRIPTION</a></h2>
+
+<p>Normally,
+index locking is an invisible process.
+Exclusive write access is controlled via lockfiles within the index directory 
and problems only arise if multiple processes attempt to acquire the write lock 
simultaneously; search-time processes do not ordinarily require locking at 
all.</p>
+
+<p>On shared volumes,
+however,
+the default locking mechanism fails,
+and manual intervention becomes necessary.</p>
+
+<p>Both read and write applications accessing an index on a shared volume need 
to identify themselves with a unique <code>host</code> id,
+e.g.
+hostname or IP address.
+Knowing the host id makes it possible to tell which lockfiles belong to other 
machines and therefore must not be removed when the lockfile&#8217;s pid number 
appears not to correspond to an active process.</p>
+
+<p>At index-time,
+the danger is that multiple indexing processes from different machines which 
fail to specify a unique <code>host</code> id can delete each other&#8217;s 
lockfiles and then attempt to modify the index at the same time,
+causing index corruption.
+The search-time problem is more complex.</p>
+
+<p>Once an index file is no longer listed in the most recent snapshot,
+Indexer attempts to delete it as part of a post-<a href="lucy:Indexer.Commit" 
class="podlinkurl"
+>lucy:Indexer.Commit</a> cleanup routine.
+It is possible that at the moment an Indexer is deleting files which it 
believes are no longer needed,
+a Searcher referencing an earlier snapshot is in fact using them.
+The more often that an index is either updated or searched,
+the more likely it is that this conflict will arise from time to time.</p>
+
+<p>Ordinarily,
+the deletion attempts are not a problem.
+On a typical unix volume,
+the files will be deleted in name only: any process which holds an open 
filehandle against a given file will continue to have access,
+and the file won&#8217;t actually get vaporized until the last filehandle is 
cleared.
+Thanks to &#8220;delete on last close semantics&#8221;,
+an Indexer can&#8217;t truly delete the file out from underneath an active 
Searcher.
+On Windows,
+where file deletion fails whenever any process holds an open handle,
+the situation is different but still workable: Indexer just keeps retrying 
after each commit until deletion finally succeeds.</p>
+
+<p>On NFS,
+however,
+the system breaks,
+because NFS allows files to be deleted out from underneath active processes.
+Should this happen,
+the unlucky read process will crash with a &#8220;Stale NFS filehandle&#8221; 
exception.</p>
+
+<p>Under normal circumstances,
+it is neither necessary nor desirable for IndexReaders to secure read locks 
against an index,
+but for NFS we have to make an exception.
+LockFactory&#8217;s <a href="lucy:LockFactory.Make_Shared_Lock" 
class="podlinkurl"
+>lucy:LockFactory.Make_Shared_Lock</a> method exists for this reason; 
supplying an IndexManager instance to IndexReader&#8217;s constructor activates 
an internal locking mechanism using <a href="lucy:LockFactory.Make_Shared_Lock" 
class="podlinkurl"
+>lucy:LockFactory.Make_Shared_Lock</a> which prevents concurrent indexing 
processes from deleting files that are needed by active readers.</p>
 
 <pre>    use Sys::Hostname qw( hostname );
     my $hostname = hostname() or die &#34;Can&#39;t get unique hostname&#34;;
     my $manager = Lucy::Index::IndexManager-&#62;new( host =&#62; $hostname );
-
+    
     # Index time:
     my $indexer = Lucy::Index::Indexer-&#62;new(
         index   =&#62; &#39;/path/to/index&#39;,
         manager =&#62; $manager,
     );
-
+    
     # Search time:
     my $reader = Lucy::Index::IndexReader-&#62;open(
         index   =&#62; &#39;/path/to/index&#39;,
@@ -30,26 +82,6 @@ name="SYNOPSIS"
     );
     my $searcher = Lucy::Search::IndexSearcher-&#62;new( index =&#62; $reader 
);</pre>
 
-<h1><a class='u' href='#___top' title='click to go to top of document'
-name="DESCRIPTION"
->DESCRIPTION</a></h1>
-
-<p>Normally, index locking is an invisible process. Exclusive write access is 
controlled via lockfiles within the index directory and problems only arise if 
multiple processes attempt to acquire the write lock simultaneously; 
search-time processes do not ordinarily require locking at all.</p>
-
-<p>On shared volumes, however, the default locking mechanism fails, and manual 
intervention becomes necessary.</p>
-
-<p>Both read and write applications accessing an index on a shared volume need 
to identify themselves with a unique <code>host</code> id, e.g. hostname or IP 
address. Knowing the host id makes it possible to tell which lockfiles belong 
to other machines and therefore must not be removed when the lockfile&#39;s pid 
number appears not to correspond to an active process.</p>
-
-<p>At index-time, the danger is that multiple indexing processes from 
different machines which fail to specify a unique <code>host</code> id can 
delete each other&#39;s lockfiles and then attempt to modify the index at the 
same time, causing index corruption. The search-time problem is more 
complex.</p>
-
-<p>Once an index file is no longer listed in the most recent snapshot, Indexer 
attempts to delete it as part of a post-commit() cleanup routine. It is 
possible that at the moment an Indexer is deleting files which it believes are no 
longer needed, a Searcher referencing an earlier snapshot is in fact using 
them. The more often that an index is either updated or searched, the more 
likely it is that this conflict will arise from time to time.</p>
-
-<p>Ordinarily, the deletion attempts are not a problem. On a typical unix 
volume, the files will be deleted in name only: any process which holds an open 
filehandle against a given file will continue to have access, and the file 
won&#39;t actually get vaporized until the last filehandle is cleared. Thanks 
to &#34;delete on last close semantics&#34;, an Indexer can&#39;t truly delete 
the file out from underneath an active Searcher. On Windows, where file 
deletion fails whenever any process holds an open handle, the situation is 
different but still workable: Indexer just keeps retrying after each commit 
until deletion finally succeeds.</p>
-
-<p>On NFS, however, the system breaks, because NFS allows files to be deleted 
out from underneath active processes. Should this happen, the unlucky read 
process will crash with a &#34;Stale NFS filehandle&#34; exception.</p>
-
-<p>Under normal circumstances, it is neither necessary nor desirable for 
IndexReaders to secure read locks against an index, but for NFS we have to make 
an exception. LockFactory&#39;s make_shared_lock() method exists for this 
reason; supplying an IndexManager instance to IndexReader&#39;s constructor 
activates an internal locking mechanism using make_shared_lock() which prevents 
concurrent indexing processes from deleting files that are needed by active 
readers.</p>
-
-<p>Since shared locks are implemented using lockfiles located in the index 
directory (as are exclusive locks), reader applications must have write access 
for read locking to work. Stale lock files from crashed processes are 
ordinarily cleared away the next time the same machine -- as identified by the 
<code>host</code> parameter -- opens another IndexReader. (The classic 
technique of timing out lock files is not feasible because search processes may 
lie dormant indefinitely.) However, please be aware that if the last thing a 
given machine does is crash, lock files belonging to it may persist, preventing 
deletion of obsolete index data.</p>
+<p>Since shared locks are implemented using lockfiles located in the index 
directory (as are exclusive locks), reader applications must have write access 
for read locking to work. Stale lock files from crashed processes are 
ordinarily cleared away the next time the same machine &#8211; as identified by 
the <code>host</code> parameter &#8211; opens another IndexReader. (The classic 
technique of timing out lock files is not feasible because search processes may 
lie dormant indefinitely.) However, please be aware that if the last thing a 
given machine does is crash, lock files belonging to it may persist, preventing 
deletion of obsolete index data.</p>
 
 </div>

Modified: lucy/site/trunk/content/docs/test/Lucy/Docs/IRTheory.mdtext
URL: 
http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/test/Lucy/Docs/IRTheory.mdtext?rev=1732471&r1=1732470&r2=1732471&view=diff
==============================================================================
--- lucy/site/trunk/content/docs/test/Lucy/Docs/IRTheory.mdtext (original)
+++ lucy/site/trunk/content/docs/test/Lucy/Docs/IRTheory.mdtext Fri Feb 26 
12:52:25 2016
@@ -3,25 +3,25 @@ Title: Lucy::Docs::IRTheory - Apache Luc
 <div>
 <a name='___top' class='dummyTopAnchor' ></a>
 
-<h1><a class='u' href='#___top' title='click to go to top of document'
+<h2><a class='u'
 name="NAME"
->NAME</a></h1>
+>NAME</a></h2>
 
-<p>Lucy::Docs::IRTheory - Crash course in information retrieval.</p>
+<p>Lucy::Docs::IRTheory - Crash course in information retrieval</p>
 
-<h1><a class='u' href='#___top' title='click to go to top of document'
-name="ABSTRACT"
->ABSTRACT</a></h1>
+<h2><a class='u'
+name="DESCRIPTION"
+>DESCRIPTION</a></h2>
 
 <p>Just enough Information Retrieval theory to find your way around Apache 
Lucy.</p>
 
-<h1><a class='u' href='#___top' title='click to go to top of document'
+<h3><a class='u'
 name="Terminology"
->Terminology</a></h1>
+>Terminology</a></h3>
 
 <p>Lucy uses some terminology from the field of information retrieval which 
may be unfamiliar to many users.
-&#34;Document&#34; and &#34;term&#34; mean pretty much what you&#39;d expect 
them to,
-but others such as &#34;posting&#34; and &#34;inverted index&#34; need a 
formal introduction:</p>
+&#8220;Document&#8221; and &#8220;term&#8221; mean pretty much what 
you&#8217;d expect them to,
+but others such as &#8220;posting&#8221; and &#8220;inverted index&#8221; need 
a formal introduction:</p>
 
 <ul>
 <li><i>document</i> - An atomic unit of retrieval.</li>
@@ -41,23 +41,21 @@ but others such as &#34;posting&#34; and
 it loads these abstract,
 distilled definitions down with useful traits.
 For instance,
-a &#34;posting&#34; in its most rarefied form is simply a term-document 
pairing; in Lucy,
-the class <a href="../../Lucy/Index/Posting/MatchPosting.html" 
class="podlinkpod"
->Lucy::Index::Posting::MatchPosting</a> fills this role.
+a &#8220;posting&#8221; in its most rarefied form is simply a term-document 
pairing; in Lucy,
+the class MatchPosting fills this role.
 However,
 by associating additional information with a posting like the number of times 
the term occurs in the document,
-we can turn it into a <a href="../../Lucy/Index/Posting/ScorePosting.html" 
class="podlinkpod"
->ScorePosting</a>,
+we can turn it into a ScorePosting,
 making it possible to rank documents by relevance rather than just list 
documents which happen to match in no particular order.</p>
 
-<h1><a class='u' href='#___top' title='click to go to top of document'
+<h3><a class='u'
 name="TF/IDF_ranking_algorithm"
->TF/IDF ranking algorithm</a></h1>
+>TF/IDF ranking algorithm</a></h3>
 
-<p>Lucy uses a variant of the well-established &#34;Term Frequency / Inverse 
Document Frequency&#34; weighting scheme.
+<p>Lucy uses a variant of the well-established &#8220;Term Frequency / Inverse 
Document Frequency&#8221; weighting scheme.
 A thorough treatment of TF/IDF is too ambitious for our present purposes,
 but in a nutshell,
-it means that...</p>
+it means that&#8230;</p>
 
 <ul>
 <li>in a search for <code>skate park</code>,
@@ -66,6 +64,6 @@ documents which score well for the compa
 <li>a 10-word text which has one occurrence each of both <code>skate</code> 
and <code>park</code> will rank higher than a 1000-word text which also 
contains one occurrence of each.</li>
 </ul>
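+
+<p>The second point can be illustrated with a minimal sketch of the textbook form of the weighting. This is not Lucy&#8217;s actual scoring code, and the collection statistics and the <code>tf_idf</code> helper are hypothetical values chosen only for illustration:</p>
+
+<pre>    use strict;
+    use warnings;
+
+    # Hypothetical collection statistics, for illustration only.
+    my $total_docs = 1000;    # documents in the collection
+    my $doc_freq   = 10;      # documents containing the term
+
+    # Textbook TF/IDF: term frequency relative to document length,
+    # multiplied by the log of the inverse document frequency.
+    sub tf_idf {
+        my ( $term_count, $doc_length ) = @_;
+        my $tf  = $term_count / $doc_length;
+        my $idf = log( $total_docs / $doc_freq );
+        return $tf * $idf;
+    }
+
+    my $short_score = tf_idf( 1, 10 );      # one occurrence in 10 words
+    my $long_score  = tf_idf( 1, 1000 );    # one occurrence in 1000 words
+    printf &#34;short: %.4f  long: %.4f\n&#34;, $short_score, $long_score;</pre>
+
+<p>Under these assumptions the shorter text receives the larger weight, because the same single occurrence makes up a bigger share of its words.</p>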
 
-<p>A web search for &#34;tf idf&#34; will turn up many excellent explanations 
of the algorithm.</p>
+<p>A web search for &#8220;tf idf&#8221; will turn up many excellent 
explanations of the algorithm.</p>
 
 </div>

Modified: lucy/site/trunk/content/docs/test/Lucy/Docs/Tutorial.mdtext
URL: 
http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/test/Lucy/Docs/Tutorial.mdtext?rev=1732471&r1=1732470&r2=1732471&view=diff
==============================================================================
--- lucy/site/trunk/content/docs/test/Lucy/Docs/Tutorial.mdtext (original)
+++ lucy/site/trunk/content/docs/test/Lucy/Docs/Tutorial.mdtext Fri Feb 26 
12:52:25 2016
@@ -3,80 +3,74 @@ Title: Lucy::Docs::Tutorial - Apache Luc
 <div>
 <a name='___top' class='dummyTopAnchor' ></a>
 
-<h1><a class='u' href='#___top' title='click to go to top of document'
+<h2><a class='u'
 name="NAME"
->NAME</a></h1>
+>NAME</a></h2>
 
 <p>Lucy::Docs::Tutorial - Step-by-step introduction to Apache Lucy.</p>
 
-<h1><a class='u' href='#___top' title='click to go to top of document'
-name="ABSTRACT"
->ABSTRACT</a></h1>
+<h2><a class='u'
+name="DESCRIPTION"
+>DESCRIPTION</a></h2>
 
-<p>Explore Apache Lucy&#39;s basic functionality by starting with a minimalist 
CGI search app based on <a href="../../Lucy/Simple.html" class="podlinkpod"
->Lucy::Simple</a> and transforming it,
+<p>Explore Apache Lucy&#8217;s basic functionality by starting with a 
minimalist CGI search app based on Lucy::Simple and transforming it,
 step by step,
-into an &#34;advanced search&#34; interface utilizing more flexible core 
modules like <a href="../../Lucy/Index/Indexer.html" class="podlinkpod"
->Lucy::Index::Indexer</a> and <a href="../../Lucy/Search/IndexSearcher.html" 
class="podlinkpod"
->Lucy::Search::IndexSearcher</a>.</p>
-
-<h1><a class='u' href='#___top' title='click to go to top of document'
-name="DESCRIPTION"
->DESCRIPTION</a></h1>
+into an &#8220;advanced search&#8221; interface utilizing more flexible core 
modules like <a href="../../Lucy/Index/Indexer.html" class="podlinkpod"
+>Indexer</a> and <a href="../../Lucy/Search/IndexSearcher.html" 
class="podlinkpod"
+>IndexSearcher</a>.</p>
 
-<h2><a class='u' href='#___top' title='click to go to top of document'
+<h3><a class='u'
 name="Chapters"
->Chapters</a></h2>
+>Chapters</a></h3>
 
 <ul>
-<li><a href="../../Lucy/Docs/Tutorial/Simple.html" class="podlinkpod"
->Lucy::Docs::Tutorial::Simple</a> - Build a bare-bones search app using <a 
href="../../Lucy/Simple.html" class="podlinkpod"
->Lucy::Simple</a>.</li>
+<li><a href="../../Lucy/Docs/Tutorial/SimpleTutorial.html" class="podlinkpod"
+>SimpleTutorial</a> - Build a bare-bones search app using Lucy::Simple.</li>
 
-<li><a href="../../Lucy/Docs/Tutorial/BeyondSimple.html" class="podlinkpod"
->Lucy::Docs::Tutorial::BeyondSimple</a> - Rebuild the app using core classes 
like <a href="../../Lucy/Index/Indexer.html" class="podlinkpod"
+<li><a href="../../Lucy/Docs/Tutorial/BeyondSimpleTutorial.html" 
class="podlinkpod"
+>BeyondSimpleTutorial</a> - Rebuild the app using core classes like <a 
href="../../Lucy/Index/Indexer.html" class="podlinkpod"
 >Indexer</a> and <a href="../../Lucy/Search/IndexSearcher.html" 
class="podlinkpod"
 >IndexSearcher</a> in place of Lucy::Simple.</li>
 
-<li><a href="../../Lucy/Docs/Tutorial/FieldType.html" class="podlinkpod"
->Lucy::Docs::Tutorial::FieldType</a> - Experiment with different field 
characteristics using subclasses of <a href="../../Lucy/Plan/FieldType.html" 
class="podlinkpod"
->Lucy::Plan::FieldType</a>.</li>
-
-<li><a href="../../Lucy/Docs/Tutorial/Analysis.html" class="podlinkpod"
->Lucy::Docs::Tutorial::Analysis</a> - Examine how the choice of <a 
href="../../Lucy/Analysis/Analyzer.html" class="podlinkpod"
->Lucy::Analysis::Analyzer</a> subclass affects search results.</li>
+<li><a href="../../Lucy/Docs/Tutorial/FieldTypeTutorial.html" 
class="podlinkpod"
+>FieldTypeTutorial</a> - Experiment with different field characteristics using 
subclasses of <a href="../../Lucy/Plan/FieldType.html" class="podlinkpod"
+>FieldType</a>.</li>
+
+<li><a href="../../Lucy/Docs/Tutorial/AnalysisTutorial.html" class="podlinkpod"
+>AnalysisTutorial</a> - Examine how the choice of <a 
href="../../Lucy/Analysis/Analyzer.html" class="podlinkpod"
+>Analyzer</a> subclass affects search results.</li>
 
-<li><a href="../../Lucy/Docs/Tutorial/Highlighter.html" class="podlinkpod"
->Lucy::Docs::Tutorial::Highlighter</a> - Augment search results with 
highlighted excerpts.</li>
+<li><a href="../../Lucy/Docs/Tutorial/HighlighterTutorial.html" 
class="podlinkpod"
+>HighlighterTutorial</a> - Augment search results with highlighted 
excerpts.</li>
 
-<li><a href="../../Lucy/Docs/Tutorial/QueryObjects.html" class="podlinkpod"
->Lucy::Docs::Tutorial::QueryObjects</a> - Unlock advanced search features by 
using Query objects instead of query strings.</li>
+<li><a href="../../Lucy/Docs/Tutorial/QueryObjectsTutorial.html" 
class="podlinkpod"
+>QueryObjectsTutorial</a> - Unlock advanced search features by using Query 
objects instead of query strings.</li>
 </ul>
 
-<h2><a class='u' href='#___top' title='click to go to top of document'
+<h3><a class='u'
 name="Source_materials"
->Source materials</a></h2>
+>Source materials</a></h3>
 
-<p>The source material used by the tutorial app -- a multi-text-file 
presentation of the United States constitution -- can be found in the 
<code>sample</code> directory at the root of the Lucy distribution,
+<p>The source material used by the tutorial app &#8211; a multi-text-file 
presentation of the United States constitution &#8211; can be found in the 
<code>sample</code> directory at the root of the Lucy distribution,
 along with finished indexing and search apps.</p>
 
 <pre>    sample/indexer.pl        # indexing app
     sample/search.cgi        # search app
     sample/us_constitution   # corpus</pre>
 
-<h2><a class='u' href='#___top' title='click to go to top of document'
+<h3><a class='u'
 name="Conventions"
->Conventions</a></h2>
+>Conventions</a></h3>
 
 <p>The user is expected to be familiar with OO Perl and basic CGI 
programming.</p>
 
 <p>The code in this tutorial assumes a Unix-flavored operating system and the 
Apache webserver, but will work with minor modifications on other setups.</p>
 
-<h1><a class='u' href='#___top' title='click to go to top of document'
-name="SEE_ALSO"
->SEE ALSO</a></h1>
+<h3><a class='u'
+name="See_also"
+>See also</a></h3>
 
 <p>More advanced and esoteric subjects are covered in <a 
href="../../Lucy/Docs/Cookbook.html" class="podlinkpod"
->Lucy::Docs::Cookbook</a>.</p>
+>Cookbook</a>.</p>
 
 </div>

Copied: 
lucy/site/trunk/content/docs/test/Lucy/Docs/Tutorial/AnalysisTutorial.mdtext 
(from r1730822, 
lucy/site/trunk/content/docs/test/Lucy/Docs/Tutorial/Analysis.mdtext)
URL: 
http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/test/Lucy/Docs/Tutorial/AnalysisTutorial.mdtext?p2=lucy/site/trunk/content/docs/test/Lucy/Docs/Tutorial/AnalysisTutorial.mdtext&p1=lucy/site/trunk/content/docs/test/Lucy/Docs/Tutorial/Analysis.mdtext&r1=1730822&r2=1732471&rev=1732471&view=diff
==============================================================================
--- lucy/site/trunk/content/docs/test/Lucy/Docs/Tutorial/Analysis.mdtext 
(original)
+++ 
lucy/site/trunk/content/docs/test/Lucy/Docs/Tutorial/AnalysisTutorial.mdtext 
Fri Feb 26 12:52:25 2016
@@ -1,19 +1,20 @@
-Title: Lucy::Docs::Tutorial::Analysis - Apache Lucy Documentation
+Title: Lucy::Docs::Tutorial::AnalysisTutorial - Apache Lucy Documentation
 
 <div>
 <a name='___top' class='dummyTopAnchor' ></a>
 
-<h1><a class='u' href='#___top' title='click to go to top of document'
+<h2><a class='u'
 name="NAME"
->NAME</a></h1>
+>NAME</a></h2>
 
-<p>Lucy::Docs::Tutorial::Analysis - How to choose and use Analyzers.</p>
+<p>Lucy::Docs::Tutorial::AnalysisTutorial - How to choose and use 
Analyzers.</p>
 
-<h1><a class='u' href='#___top' title='click to go to top of document'
+<h2><a class='u'
 name="DESCRIPTION"
->DESCRIPTION</a></h1>
+>DESCRIPTION</a></h2>
 
-<p>Try swapping out the EasyAnalyzer in our Schema for a StandardTokenizer:</p>
+<p>Try swapping out the EasyAnalyzer in our Schema for a <a 
href="../../../Lucy/Analysis/StandardTokenizer.html" class="podlinkpod"
+>StandardTokenizer</a>:</p>
 
 <pre>    my $tokenizer = Lucy::Analysis::StandardTokenizer-&#62;new;
     my $type = Lucy::Plan::FullTextType-&#62;new(
@@ -24,13 +25,15 @@ name="DESCRIPTION"
 
 <p>Under EasyAnalyzer, the results are identical for all three searches, but 
under StandardTokenizer, searches are case-sensitive, and the result sets for 
<code>Senate</code> and <code>Senator</code> are distinct.</p>
 
-<h2><a class='u' href='#___top' title='click to go to top of document'
+<h3><a class='u'
 name="EasyAnalyzer"
->EasyAnalyzer</a></h2>
+>EasyAnalyzer</a></h3>
 
-<p>What&#39;s happening is that EasyAnalyzer is performing more aggressive 
processing than StandardTokenizer. In addition to tokenizing, it&#39;s also 
converting all text to lower case so that searches are case-insensitive, and 
using a &#34;stemming&#34; algorithm to reduce related words to a common stem 
(<code>senat</code>, in this case).</p>
+<p>What&#8217;s happening is that <a 
href="../../../Lucy/Analysis/EasyAnalyzer.html" class="podlinkpod"
+>EasyAnalyzer</a> is performing more aggressive processing than 
StandardTokenizer. In addition to tokenizing, it&#8217;s also converting all 
text to lower case so that searches are case-insensitive, and using a 
&#8220;stemming&#8221; algorithm to reduce related words to a common stem 
(<code>senat</code>, in this case).</p>
 
-<p>EasyAnalyzer is actually multiple Analyzers wrapped up in a single package. 
In this case, it&#39;s three-in-one, since specifying a EasyAnalyzer with 
<code>language =&#62; &#39;en&#39;</code> is equivalent to this snippet:</p>
+<p>EasyAnalyzer is actually multiple Analyzers wrapped up in a single package. 
In this case, it&#8217;s three-in-one, since specifying an EasyAnalyzer with 
<code>language =&#62; &#39;en&#39;</code> is equivalent to this snippet 
creating a <a href="../../../Lucy/Analysis/PolyAnalyzer.html" class="podlinkpod"
+>PolyAnalyzer</a>:</p>
 
 <pre>    my $tokenizer    = Lucy::Analysis::StandardTokenizer-&#62;new;
     my $normalizer   = Lucy::Analysis::Normalizer-&#62;new;
@@ -39,7 +42,7 @@ name="EasyAnalyzer"
         analyzers =&#62; [ $tokenizer, $normalizer, $stemmer ],
     );</pre>
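+
+<p>To compare what different Analyzers do to the same text, you can print the tokens each one produces. The following is only a sketch: it assumes that Analyzer&#8217;s <code>split()</code> method is available in your Lucy version and returns an array reference of token texts, and the sample string is made up for illustration.</p>
+
+<pre>    use Lucy::Analysis::EasyAnalyzer;
+    use Lucy::Analysis::StandardTokenizer;
+
+    my $text = &#39;The Senate and a Senator&#39;;    # hypothetical sample text
+    my $easyanalyzer = Lucy::Analysis::EasyAnalyzer-&#62;new(
+        language =&#62; &#39;en&#39;,
+    );
+    my $tokenizer = Lucy::Analysis::StandardTokenizer-&#62;new;
+
+    # Assumption: split() returns an array reference of token texts.
+    print join( &#39; | &#39;, @{ $easyanalyzer-&#62;split($text) } ), &#34;\n&#34;;
+    print join( &#39; | &#39;, @{ $tokenizer-&#62;split($text) } ),    &#34;\n&#34;;</pre>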
 
-<p>You can add or subtract Analyzers from there if you like. Try adding a 
fourth Analyzer, a SnowballStopFilter for suppressing &#34;stopwords&#34; like 
&#34;the&#34;, &#34;if&#34;, and &#34;maybe&#34;.</p>
+<p>You can add or subtract Analyzers from there if you like. Try adding a 
fourth Analyzer, a SnowballStopFilter for suppressing &#8220;stopwords&#8221; 
like &#8220;the&#8221;, &#8220;if&#8221;, and &#8220;maybe&#8221;.</p>
 
 <pre>    my $stopfilter = Lucy::Analysis::SnowballStopFilter-&#62;new( 
         language =&#62; &#39;en&#39;,
@@ -56,22 +59,22 @@ name="EasyAnalyzer"
 
 <p>The original choice of a stock English EasyAnalyzer probably still yields 
the best results for this document collection, but you get the idea: sometimes 
you want a different Analyzer.</p>
 
-<h2><a class='u' href='#___top' title='click to go to top of document'
+<h3><a class='u'
 name="When_the_best_Analyzer_is_no_Analyzer"
->When the best Analyzer is no Analyzer</a></h2>
+>When the best Analyzer is no Analyzer</a></h3>
 
-<p>Sometimes you don&#39;t want an Analyzer at all. That was true for our 
&#34;url&#34; field because we didn&#39;t need it to be searchable, but 
it&#39;s also true for certain types of searchable fields. For instance, 
&#34;category&#34; fields are often set up to match exactly or not at all, as 
are fields like &#34;last_name&#34; (because you may not want to conflate 
results for &#34;Humphrey&#34; and &#34;Humphries&#34;).</p>
+<p>Sometimes you don&#8217;t want an Analyzer at all. That was true for our 
&#8220;url&#8221; field because we didn&#8217;t need it to be searchable, but 
it&#8217;s also true for certain types of searchable fields. For instance, 
&#8220;category&#8221; fields are often set up to match exactly or not at all, 
as are fields like &#8220;last_name&#8221; (because you may not want to 
conflate results for &#8220;Humphrey&#8221; and &#8220;Humphries&#8221;).</p>
 
 <p>To specify that there should be no analysis performed at all, use 
StringType:</p>
 
 <pre>    my $type = Lucy::Plan::StringType-&#62;new;
     $schema-&#62;spec_field( name =&#62; &#39;category&#39;, type =&#62; $type 
);</pre>
 
-<h2><a class='u' href='#___top' title='click to go to top of document'
+<h3><a class='u'
 name="Highlighting_up_next"
->Highlighting up next</a></h2>
+>Highlighting up next</a></h3>
 
-<p>In our next tutorial chapter, <a 
href="../../../Lucy/Docs/Tutorial/Highlighter.html" class="podlinkpod"
->Lucy::Docs::Tutorial::Highlighter</a>, we&#39;ll add highlighted excerpts 
from the &#34;content&#34; field to our search results.</p>
+<p>In our next tutorial chapter, <a 
href="../../../Lucy/Docs/Tutorial/HighlighterTutorial.html" class="podlinkpod"
+>HighlighterTutorial</a>, we&#8217;ll add highlighted excerpts from the 
&#8220;content&#8221; field to our search results.</p>
 
 </div>

Added: 
lucy/site/trunk/content/docs/test/Lucy/Docs/Tutorial/BeyondSimpleTutorial.mdtext
URL: 
http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/test/Lucy/Docs/Tutorial/BeyondSimpleTutorial.mdtext?rev=1732471&view=auto
==============================================================================
--- 
lucy/site/trunk/content/docs/test/Lucy/Docs/Tutorial/BeyondSimpleTutorial.mdtext
 (added)
+++ 
lucy/site/trunk/content/docs/test/Lucy/Docs/Tutorial/BeyondSimpleTutorial.mdtext
 Fri Feb 26 12:52:25 2016
@@ -0,0 +1,139 @@
+Title: Lucy::Docs::Tutorial::BeyondSimpleTutorial - Apache Lucy Documentation
+
+<div>
+<a name='___top' class='dummyTopAnchor' ></a>
+
+<h2><a class='u'
+name="NAME"
+>NAME</a></h2>
+
+<p>Lucy::Docs::Tutorial::BeyondSimpleTutorial - A more flexible app 
structure.</p>
+
+<h2><a class='u'
+name="DESCRIPTION"
+>DESCRIPTION</a></h2>
+
+<h3><a class='u'
+name="Goal"
+>Goal</a></h3>
+
+<p>In this tutorial chapter,
+we&#8217;ll refactor the apps we built in <a 
href="../../../Lucy/Docs/Tutorial/SimpleTutorial.html" class="podlinkpod"
+>SimpleTutorial</a> so that they look exactly the same from the end 
user&#8217;s point of view,
+but offer the developer greater possibilities for expansion.</p>
+
+<p>To achieve this,
+we&#8217;ll ditch Lucy::Simple and replace it with the classes that it uses 
internally:</p>
+
+<ul>
+<li><a href="../../../Lucy/Plan/Schema.html" class="podlinkpod"
+>Schema</a> - Plan out your index.</li>
+
+<li><a href="../../../Lucy/Plan/FullTextType.html" class="podlinkpod"
+>FullTextType</a> - Field type for full text search.</li>
+
+<li><a href="../../../Lucy/Analysis/EasyAnalyzer.html" class="podlinkpod"
+>EasyAnalyzer</a> - A one-size-fits-all parser/tokenizer.</li>
+
+<li><a href="../../../Lucy/Index/Indexer.html" class="podlinkpod"
+>Indexer</a> - Manipulate index content.</li>
+
+<li><a href="../../../Lucy/Search/IndexSearcher.html" class="podlinkpod"
+>IndexSearcher</a> - Search an index.</li>
+
+<li><a href="../../../Lucy/Search/Hits.html" class="podlinkpod"
+>Hits</a> - Iterate over hits returned by a Searcher.</li>
+</ul>
+
+<h3><a class='u'
+name="Adaptations_to_indexer.pl"
+>Adaptations to indexer.pl</a></h3>
+
+<p>After we load our modules&#8230;</p>
+
+<pre>    use Lucy::Plan::Schema;
+    use Lucy::Plan::FullTextType;
+    use Lucy::Analysis::EasyAnalyzer;
+    use Lucy::Index::Indexer;</pre>
+
+<p>&#8230; the first item we&#8217;re going to need is a <a 
href="../../../Lucy/Plan/Schema.html" class="podlinkpod"
+>Schema</a>.</p>
+
+<p>The primary job of a Schema is to specify what fields are available and how 
they&#8217;re defined. We&#8217;ll start off with three fields: title, content 
and url.</p>
+
+<pre>    # Create Schema.
+    my $schema = Lucy::Plan::Schema-&#62;new;
+    my $easyanalyzer = Lucy::Analysis::EasyAnalyzer-&#62;new(
+        language =&#62; &#39;en&#39;,
+    );
+    my $type = Lucy::Plan::FullTextType-&#62;new(
+        analyzer =&#62; $easyanalyzer,
+    );
+    $schema-&#62;spec_field( name =&#62; &#39;title&#39;,   type =&#62; $type 
);
+    $schema-&#62;spec_field( name =&#62; &#39;content&#39;, type =&#62; $type 
);
+    $schema-&#62;spec_field( name =&#62; &#39;url&#39;,     type =&#62; $type 
);</pre>
+
+<p>All of the fields are spec&#8217;d out using the <a 
href="../../../Lucy/Plan/FullTextType.html" class="podlinkpod"
+>FullTextType</a> FieldType, indicating that they will be searchable as 
&#8220;full text&#8221; &#8211; which means that they can be searched for 
individual words. The &#8220;analyzer&#8221;, which is unique to FullTextType 
fields, is what breaks up the text into searchable tokens.</p>
+
+<p>Next, we&#8217;ll swap our Lucy::Simple object out for an <a 
href="../../../Lucy/Index/Indexer.html" class="podlinkpod"
+>Indexer</a>. The substitution will be straightforward because Simple has 
merely been serving as a thin wrapper around an inner Indexer, and we&#8217;ll 
just be peeling away the wrapper.</p>
+
+<p>First, replace the constructor:</p>
+
+<pre>    # Create Indexer.
+    my $indexer = Lucy::Index::Indexer-&#62;new(
+        index    =&#62; $path_to_index,
+        schema   =&#62; $schema,
+        create   =&#62; 1,
+        truncate =&#62; 1,
+    );</pre>
+
+<p>Next, call <a 
href="../../../Lucy/Index/Indexer.html#add_doc" class="podlinkpod"
+>add_doc()</a> on the <code>indexer</code> object wherever the <code>lucy</code> object was 
adding the document before:</p>
+
+<pre>    foreach my $filename (@filenames) {
+        my $doc = parse_file($filename);
+        $indexer-&#62;add_doc($doc);
+    }</pre>
+
+<p>There&#8217;s only one extra step required: at the end of the app, you must 
call commit() explicitly to close the indexing session and commit your changes. 
(Lucy::Simple hides this detail, calling commit() implicitly when it needs 
to).</p>
+
+<pre>    $indexer-&#62;commit;</pre>
+
+<h3><a class='u'
+name="Adaptations_to_search.cgi"
+>Adaptations to search.cgi</a></h3>
+
+<p>In our search app as in our indexing app, Lucy::Simple has served as a thin 
wrapper &#8211; this time around <a 
href="../../../Lucy/Search/IndexSearcher.html" class="podlinkpod"
+>IndexSearcher</a> and <a href="../../../Lucy/Search/Hits.html" 
class="podlinkpod"
+>Hits</a>. Swapping out Simple for these two classes is also 
straightforward:</p>
+
+<pre>    use Lucy::Search::IndexSearcher;
+    
+    my $searcher = Lucy::Search::IndexSearcher-&#62;new( 
+        index =&#62; $path_to_index,
+    );
+    my $hits = $searcher-&#62;hits(    # returns a Hits object, not a hit count
+        query      =&#62; $q,
+        offset     =&#62; $offset,
+        num_wanted =&#62; $page_size,
+    );
+    my $hit_count = $hits-&#62;total_hits;  # get the hit count here
+    
+    ...
+    
+    while ( my $hit = $hits-&#62;next ) {
+        ...
+    }</pre>
+
+<h3><a class='u'
+name="Hooray!"
+>Hooray!</a></h3>
+
+<p>Congratulations! Your apps do the same thing as before&#8230; but now 
they&#8217;ll be easier to customize.</p>
+
+<p>In our next chapter, <a 
href="../../../Lucy/Docs/Tutorial/FieldTypeTutorial.html" class="podlinkpod"
+>FieldTypeTutorial</a>, we&#8217;ll explore how to assign different behaviors 
to different fields.</p>
+
+</div>

Added: 
lucy/site/trunk/content/docs/test/Lucy/Docs/Tutorial/FieldTypeTutorial.mdtext
URL: 
http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/test/Lucy/Docs/Tutorial/FieldTypeTutorial.mdtext?rev=1732471&view=auto
==============================================================================
--- 
lucy/site/trunk/content/docs/test/Lucy/Docs/Tutorial/FieldTypeTutorial.mdtext 
(added)
+++ 
lucy/site/trunk/content/docs/test/Lucy/Docs/Tutorial/FieldTypeTutorial.mdtext 
Fri Feb 26 12:52:25 2016
@@ -0,0 +1,63 @@
+Title: Lucy::Docs::Tutorial::FieldTypeTutorial - Apache Lucy Documentation
+
+<div>
+<a name='___top' class='dummyTopAnchor' ></a>
+
+<h2><a class='u'
+name="NAME"
+>NAME</a></h2>
+
+<p>Lucy::Docs::Tutorial::FieldTypeTutorial - Specify per-field properties and 
behaviors.</p>
+
+<h2><a class='u'
+name="DESCRIPTION"
+>DESCRIPTION</a></h2>
+
+<p>The Schema we used in the last chapter specifies three fields:</p>
+
+<pre>    my $type = Lucy::Plan::FullTextType-&#62;new(
+        analyzer =&#62; $easyanalyzer,
+    );
+    $schema-&#62;spec_field( name =&#62; &#39;title&#39;,   type =&#62; $type 
);
+    $schema-&#62;spec_field( name =&#62; &#39;content&#39;, type =&#62; $type 
);
+    $schema-&#62;spec_field( name =&#62; &#39;url&#39;,     type =&#62; $type 
);</pre>
+
+<p>Since they are all defined as &#8220;full text&#8221; fields, they are all 
searchable &#8211; including the <code>url</code> field, a dubious choice. Some 
URLs contain meaningful information, but these don&#8217;t, really:</p>
+
+<pre>    http://example.com/us_constitution/amend1.txt</pre>
+
+<p>We may as well not bother indexing the URL content. To achieve that we need 
to assign the <code>url</code> field to a different FieldType.</p>
+
+<h3><a class='u'
+name="StringType"
+>StringType</a></h3>
+
+<p>Instead of FullTextType, we&#8217;ll use a <a 
href="../../../Lucy/Plan/StringType.html" class="podlinkpod"
+>StringType</a>, which doesn&#8217;t use an Analyzer to break up text into 
individual tokens. Furthermore, we&#8217;ll mark this StringType as unindexed, 
so that its content won&#8217;t be searchable at all.</p>
+
+<pre>    my $url_type = Lucy::Plan::StringType-&#62;new( indexed =&#62; 0 );
+    $schema-&#62;spec_field( name =&#62; &#39;url&#39;, type =&#62; $url_type 
);</pre>
+
+<p>To observe the change in behavior, try searching for 
<code>us_constitution</code> both before and after changing the Schema and 
re-indexing.</p>
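+
+<p>One way to run that check outside the CGI app is a short script along these lines. It is a sketch only: the index path is a placeholder, and it assumes the index was built by the indexer.pl from the previous chapter.</p>
+
+<pre>    use Lucy::Search::IndexSearcher;
+
+    my $path_to_index = &#39;/path/to/index&#39;;    # placeholder; use your own path
+    my $searcher = Lucy::Search::IndexSearcher-&#62;new(
+        index =&#62; $path_to_index,
+    );
+    my $hits = $searcher-&#62;hits( query =&#62; &#39;us_constitution&#39; );
+    print &#34;Total hits: &#34;, $hits-&#62;total_hits, &#34;\n&#34;;</pre>
+
+<p>If the <code>url</code> field was the only place where the term matched, the count should drop once that field is no longer indexed.</p>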
+
+<h3><a class='u'
+name="Toggling_(8216)stored(8217)"
+>Toggling &#8216;stored&#8217;</a></h3>
+
+<p>For a taste of other FieldType possibilities, try turning off 
<code>stored</code> for one or more fields.</p>
+
+<pre>    my $content_type = Lucy::Plan::FullTextType-&#62;new(
+        analyzer =&#62; $easyanalyzer,
+        stored   =&#62; 0,
+    );</pre>
+
+<p>Turning off <code>stored</code> for either <code>title</code> or 
<code>url</code> mangles our results page, but since we&#8217;re not displaying 
<code>content</code>, turning it off for <code>content</code> has no effect 
&#8211; except on index size.</p>
+
+<h3><a class='u'
+name="Analyzers_up_next"
+>Analyzers up next</a></h3>
+
+<p>Analyzers play a crucial role in the behavior of FullTextType fields. In 
our next tutorial chapter, <a 
href="../../../Lucy/Docs/Tutorial/AnalysisTutorial.html" class="podlinkpod"
+>AnalysisTutorial</a>, we&#8217;ll see how changing up the Analyzer changes 
search results.</p>
+
+</div>

Copied: 
lucy/site/trunk/content/docs/test/Lucy/Docs/Tutorial/HighlighterTutorial.mdtext 
(from r1730822, 
lucy/site/trunk/content/docs/test/Lucy/Docs/Tutorial/Highlighter.mdtext)
URL: 
http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/test/Lucy/Docs/Tutorial/HighlighterTutorial.mdtext?p2=lucy/site/trunk/content/docs/test/Lucy/Docs/Tutorial/HighlighterTutorial.mdtext&p1=lucy/site/trunk/content/docs/test/Lucy/Docs/Tutorial/Highlighter.mdtext&r1=1730822&r2=1732471&rev=1732471&view=diff
==============================================================================
--- lucy/site/trunk/content/docs/test/Lucy/Docs/Tutorial/Highlighter.mdtext 
(original)
+++ 
lucy/site/trunk/content/docs/test/Lucy/Docs/Tutorial/HighlighterTutorial.mdtext 
Fri Feb 26 12:52:25 2016
@@ -1,41 +1,41 @@
-Title: Lucy::Docs::Tutorial::Highlighter - Apache Lucy Documentation
+Title: Lucy::Docs::Tutorial::HighlighterTutorial - Apache Lucy Documentation
 
 <div>
 <a name='___top' class='dummyTopAnchor' ></a>
 
-<h1><a class='u' href='#___top' title='click to go to top of document'
+<h2><a class='u'
 name="NAME"
->NAME</a></h1>
+>NAME</a></h2>
 
-<p>Lucy::Docs::Tutorial::Highlighter - Augment search results with highlighted 
excerpts.</p>
+<p>Lucy::Docs::Tutorial::HighlighterTutorial - Augment search results with 
highlighted excerpts.</p>
 
-<h1><a class='u' href='#___top' title='click to go to top of document'
+<h2><a class='u'
 name="DESCRIPTION"
->DESCRIPTION</a></h1>
+>DESCRIPTION</a></h2>
 
 <p>Adding relevant excerpts with highlighted search terms to your search 
results display makes it much easier for end users to scan the page and assess 
which hits look promising,
 dramatically improving their search experience.</p>
 
-<h2><a class='u' href='#___top' title='click to go to top of document'
+<h3><a class='u'
 name="Adaptations_to_indexer.pl"
->Adaptations to indexer.pl</a></h2>
+>Adaptations to indexer.pl</a></h3>
 
 <p><a href="../../../Lucy/Highlight/Highlighter.html" class="podlinkpod"
->Lucy::Highlight::Highlighter</a> uses information generated at index time.
+>Highlighter</a> uses information generated at index time.
 To save resources,
 highlighting is disabled by default and must be turned on for individual 
fields.</p>
 
 <pre>    my $highlightable = Lucy::Plan::FullTextType-&#62;new(
-        analyzer      =&#62; $polyanalyzer,
+        analyzer      =&#62; $easyanalyzer,
         highlightable =&#62; 1,
     );
     $schema-&#62;spec_field( name =&#62; &#39;content&#39;, type =&#62; 
$highlightable );</pre>
 
-<h2><a class='u' href='#___top' title='click to go to top of document'
+<h3><a class='u'
 name="Adaptations_to_search.cgi"
->Adaptations to search.cgi</a></h2>
+>Adaptations to search.cgi</a></h3>
 
-<p>To add highlighting and excerpting to the search.cgi sample app, create a 
<code>$highlighter</code> object outside the hits iterating loop...</p>
+<p>To add highlighting and excerpting to the search.cgi sample app, create a 
<code>$highlighter</code> object outside the hits iterating loop&#8230;</p>
 
 <pre>    my $highlighter = Lucy::Highlight::Highlighter-&#62;new(
         searcher =&#62; $searcher,
@@ -43,7 +43,7 @@ name="Adaptations_to_search.cgi"
         field    =&#62; &#39;content&#39;
     );</pre>
 
-<p>... then modify the loop and the per-hit display to generate and include 
the excerpt.</p>
+<p>&#8230; then modify the loop and the per-hit display to generate and 
include the excerpt.</p>
 
 <pre>    # Create result list.
     my $report = &#39;&#39;;
@@ -62,12 +62,12 @@ name="Adaptations_to_search.cgi"
         |;
     }</pre>
 
-<h2><a class='u' href='#___top' title='click to go to top of document'
+<h3><a class='u'
 name="Next_chapter:_Query_objects"
->Next chapter: Query objects</a></h2>
+>Next chapter: Query objects</a></h3>
 
-<p>Our next tutorial chapter, <a 
href="../../../Lucy/Docs/Tutorial/QueryObjects.html" class="podlinkpod"
->Lucy::Docs::Tutorial::QueryObjects</a>, illustrates how to build an 
&#34;advanced search&#34; interface using <a 
href="../../../Lucy/Search/Query.html" class="podlinkpod"
+<p>Our next tutorial chapter, <a 
href="../../../Lucy/Docs/Tutorial/QueryObjectsTutorial.html" class="podlinkpod"
+>QueryObjectsTutorial</a>, illustrates how to build an &#8220;advanced 
search&#8221; interface using <a href="../../../Lucy/Search/Query.html" 
class="podlinkpod"
 >Query</a> objects instead of query strings.</p>
 
 </div>

Copied: 
lucy/site/trunk/content/docs/test/Lucy/Docs/Tutorial/QueryObjectsTutorial.mdtext
 (from r1730822, 
lucy/site/trunk/content/docs/test/Lucy/Docs/Tutorial/QueryObjects.mdtext)
URL: 
http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/test/Lucy/Docs/Tutorial/QueryObjectsTutorial.mdtext?p2=lucy/site/trunk/content/docs/test/Lucy/Docs/Tutorial/QueryObjectsTutorial.mdtext&p1=lucy/site/trunk/content/docs/test/Lucy/Docs/Tutorial/QueryObjects.mdtext&r1=1730822&r2=1732471&rev=1732471&view=diff
==============================================================================
--- lucy/site/trunk/content/docs/test/Lucy/Docs/Tutorial/QueryObjects.mdtext 
(original)
+++ 
lucy/site/trunk/content/docs/test/Lucy/Docs/Tutorial/QueryObjectsTutorial.mdtext
 Fri Feb 26 12:52:25 2016
@@ -1,23 +1,23 @@
-Title: Lucy::Docs::Tutorial::QueryObjects - Apache Lucy Documentation
+Title: Lucy::Docs::Tutorial::QueryObjectsTutorial - Apache Lucy Documentation
 
 <div>
 <a name='___top' class='dummyTopAnchor' ></a>
 
-<h1><a class='u' href='#___top' title='click to go to top of document'
+<h2><a class='u'
 name="NAME"
->NAME</a></h1>
+>NAME</a></h2>
 
-<p>Lucy::Docs::Tutorial::QueryObjects - Use Query objects instead of query 
strings.</p>
+<p>Lucy::Docs::Tutorial::QueryObjectsTutorial - Use Query objects instead of 
query strings.</p>
 
-<h1><a class='u' href='#___top' title='click to go to top of document'
+<h2><a class='u'
 name="DESCRIPTION"
->DESCRIPTION</a></h1>
+>DESCRIPTION</a></h2>
 
 <p>Until now,
 our search app has had only a single search box.
 In this tutorial chapter,
-we&#39;ll move towards an &#34;advanced search&#34; interface,
-by adding a &#34;category&#34; drop-down menu.
+we&#8217;ll move towards an &#8220;advanced search&#8221; interface,
+by adding a &#8220;category&#8221; drop-down menu.
 Three new classes will be required:</p>
 
 <ul>
@@ -29,23 +29,23 @@ Three new classes will be required:</p>
 >TermQuery</a> - Query for a specific term within a specific field.</li>
 
 <li><a href="../../../Lucy/Search/ANDQuery.html" class="podlinkpod"
->ANDQuery</a> - &#34;AND&#34; together multiple Query objects to produce an 
intersected result set.</li>
+>ANDQuery</a> - &#8220;AND&#8221; together multiple Query objects to produce 
an intersected result set.</li>
 </ul>
 
-<h2><a class='u' href='#___top' title='click to go to top of document'
+<h3><a class='u'
 name="Adaptations_to_indexer.pl"
->Adaptations to indexer.pl</a></h2>
+>Adaptations to indexer.pl</a></h3>
 
-<p>Our new &#34;category&#34; field will be a StringType field rather than a 
FullTextType field,
+<p>Our new &#8220;category&#8221; field will be a StringType field rather than 
a FullTextType field,
 because we will only be looking for exact matches.
 It needs to be indexed,
-but since we won&#39;t display its value,
-it doesn&#39;t need to be stored.</p>
+but since we won&#8217;t display its value,
+it doesn&#8217;t need to be stored.</p>
 
 <pre>    my $cat_type = Lucy::Plan::StringType-&#62;new( stored =&#62; 0 );
     $schema-&#62;spec_field( name =&#62; &#39;category&#39;, type =&#62; 
$cat_type );</pre>
 
-<p>There will be three possible values: &#34;article&#34;, 
&#34;amendment&#34;, and &#34;preamble&#34;, which we&#39;ll hack out of the 
source file&#39;s name during our <code>parse_file</code> subroutine:</p>
+<p>There will be three possible values: &#8220;article&#8221;, 
&#8220;amendment&#8221;, and &#8220;preamble&#8221;, which we&#8217;ll hack out 
of the source file&#8217;s name during our <code>parse_file</code> 
subroutine:</p>
 
 <pre>    my $category
         = $filename =~ /art/      ? &#39;article&#39;
@@ -59,11 +59,11 @@ it doesn&#39;t need to be stored.</p>
         category =&#62; $category,
     };</pre>
 
-<h2><a class='u' href='#___top' title='click to go to top of document'
+<h3><a class='u'
 name="Adaptations_to_search.cgi"
->Adaptations to search.cgi</a></h2>
+>Adaptations to search.cgi</a></h3>
 
-<p>The &#34;category&#34; constraint will be added to our search interface 
using an HTML &#34;select&#34; element (this routine will need to be integrated 
into the HTML generation section of search.cgi):</p>
+<p>The &#8220;category&#8221; constraint will be added to our search interface 
using an HTML &#8220;select&#8221; element (this routine will need to be 
integrated into the HTML generation section of search.cgi):</p>
 
 <pre>    # Build up the HTML &#34;select&#34; object for the 
&#34;category&#34; field.
     sub generate_category_select {
@@ -80,7 +80,7 @@ name="Adaptations_to_search.cgi"
         return $select;
     }</pre>
 
-<p>We&#39;ll start off by loading our new modules and extracting our new CGI 
parameter.</p>
+<p>We&#8217;ll start off by loading our new modules and extracting our new CGI 
parameter.</p>
 
 <pre>    use Lucy::Search::QueryParser;
     use Lucy::Search::TermQuery;
@@ -90,7 +90,7 @@ name="Adaptations_to_search.cgi"
     
     my $category = decode( &#34;UTF-8&#34;, 
$cgi-&#62;param(&#39;category&#39;) || &#39;&#39; );</pre>
 
-<p>QueryParser&#39;s constructor requires a &#34;schema&#34; argument. We can 
get that from our IndexSearcher:</p>
+<p>QueryParser&#8217;s constructor requires a &#8220;schema&#8221; argument. 
We can get that from our IndexSearcher:</p>
 
 <pre>    # Create an IndexSearcher and a QueryParser.
     my $searcher = Lucy::Search::IndexSearcher-&#62;new( 
@@ -104,7 +104,7 @@ name="Adaptations_to_search.cgi"
 
 <pre>    my $query = $qparser-&#62;parse($q);</pre>
 
-<p>If the user has specified a category, we&#39;ll use an ANDQuery to join our 
parsed query together with a TermQuery representing the category.</p>
+<p>If the user has specified a category, we&#8217;ll use an ANDQuery to join 
our parsed query together with a TermQuery representing the category.</p>
 
 <pre>    if ($category) {
         my $category_query = Lucy::Search::TermQuery-&#62;new(
@@ -116,7 +116,7 @@ name="Adaptations_to_search.cgi"
         );
     }</pre>
 
-<p>Now when we execute the query...</p>
+<p>Now when we execute the query&#8230;</p>
 
 <pre>    # Execute the Query and get a Hits object.
     my $hits = $searcher-&#62;hits(
@@ -125,17 +125,54 @@ name="Adaptations_to_search.cgi"
         num_wanted =&#62; $page_size,
     );</pre>
 
-<p>... we&#39;ll get a result set which is the intersection of the parsed 
query and the category query.</p>
+<p>&#8230; we&#8217;ll get a result set which is the intersection of the 
parsed query and the category query.</p>
 
-<h1><a class='u' href='#___top' title='click to go to top of document'
+<h3><a class='u'
+name="Using_TermQuery_with_full_text_fields"
+>Using TermQuery with full text fields</a></h3>
+
+<p>When querying full text fields, the easiest way to create query objects is 
to use QueryParser. Sometimes, though, you want to create a TermQuery for a single 
term in a FullTextType field directly. In that case, you have to run the search term 
through the field&#8217;s analyzer to make sure it gets normalized in the same 
way as the field&#8217;s content.</p>
+
+<pre>    sub make_term_query {
+        my ($field, $term) = @_;
+    
+        my $token;
+        my $type = $schema-&#62;fetch_type($field);
+    
+        if ( $type-&#62;isa(&#39;Lucy::Plan::FullTextType&#39;) ) {
+            # Run the term through the full text analysis chain.
+            my $analyzer = $type-&#62;get_analyzer;
+            my $tokens   = $analyzer-&#62;split($term);
+    
+            if ( @$tokens != 1 ) {
+                # If the term expands to more than one token, or no
+                # tokens at all, it will never match a token in the
+                # full text field.
+                return Lucy::Search::NoMatchQuery-&#62;new;
+            }
+    
+            $token = $tokens-&#62;[0];
+        }
+        else {
+            # Exact match for other types.
+            $token = $term;
+        }
+    
+        return Lucy::Search::TermQuery-&#62;new(
+            field =&#62; $field,
+            term  =&#62; $token,
+        );
+    }</pre>
+
+<h3><a class='u'
 name="Congratulations!"
->Congratulations!</a></h1>
+>Congratulations!</a></h3>
 
-<p>You&#39;ve made it to the end of the tutorial.</p>
+<p>You&#8217;ve made it to the end of the tutorial.</p>
 
-<h1><a class='u' href='#___top' title='click to go to top of document'
-name="SEE_ALSO"
->SEE ALSO</a></h1>
+<h3><a class='u'
+name="See_Also"
+>See Also</a></h3>
 
 <p>For additional thematic documentation, see the Apache Lucy <a 
href="../../../Lucy/Docs/Cookbook.html" class="podlinkpod"
 >Cookbook</a>.</p>

Copied: 
lucy/site/trunk/content/docs/test/Lucy/Docs/Tutorial/SimpleTutorial.mdtext 
(from r1730822, 
lucy/site/trunk/content/docs/test/Lucy/Docs/Tutorial/Simple.mdtext)
URL: 
http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/test/Lucy/Docs/Tutorial/SimpleTutorial.mdtext?p2=lucy/site/trunk/content/docs/test/Lucy/Docs/Tutorial/SimpleTutorial.mdtext&p1=lucy/site/trunk/content/docs/test/Lucy/Docs/Tutorial/Simple.mdtext&r1=1730822&r2=1732471&rev=1732471&view=diff
==============================================================================
--- lucy/site/trunk/content/docs/test/Lucy/Docs/Tutorial/Simple.mdtext 
(original)
+++ lucy/site/trunk/content/docs/test/Lucy/Docs/Tutorial/SimpleTutorial.mdtext 
Fri Feb 26 12:52:25 2016
@@ -1,29 +1,33 @@
-Title: Lucy::Docs::Tutorial::Simple - Apache Lucy Documentation
+Title: Lucy::Docs::Tutorial::SimpleTutorial - Apache Lucy Documentation
 
 <div>
 <a name='___top' class='dummyTopAnchor' ></a>
 
-<h1><a class='u' href='#___top' title='click to go to top of document'
+<h2><a class='u'
 name="NAME"
->NAME</a></h1>
+>NAME</a></h2>
 
-<p>Lucy::Docs::Tutorial::Simple - Bare-bones search app.</p>
+<p>Lucy::Docs::Tutorial::SimpleTutorial - Bare-bones search app.</p>
 
-<h2><a class='u' href='#___top' title='click to go to top of document'
+<h2><a class='u'
+name="DESCRIPTION"
+>DESCRIPTION</a></h2>
+
+<h3><a class='u'
 name="Setup"
->Setup</a></h2>
+>Setup</a></h3>
 
-<p>Copy the text presentation of the US Constitution from the 
<code>sample</code> directory of the Apache Lucy distribution to the base level 
of your web server&#39;s <code>htdocs</code> directory.</p>
+<p>Copy the text presentation of the US Constitution from the 
<code>sample</code> directory of the Apache Lucy distribution to the base level 
of your web server&#8217;s <code>htdocs</code> directory.</p>
 
 <pre>    $ cp -R sample/us_constitution /usr/local/apache2/htdocs/</pre>
 
-<h2><a class='u' href='#___top' title='click to go to top of document'
+<h3><a class='u'
 name="Indexing:_indexer.pl"
->Indexing: indexer.pl</a></h2>
+>Indexing: indexer.pl</a></h3>
 
-<p>Our first task will be to create an application called 
<code>indexer.pl</code> which builds a searchable &#34;inverted index&#34; from 
a collection of documents.</p>
+<p>Our first task will be to create an application called 
<code>indexer.pl</code> which builds a searchable &#8220;inverted index&#8221; 
from a collection of documents.</p>
 
-<p>After we specify some configuration variables and load all necessary 
modules...</p>
+<p>After we specify some configuration variables and load all necessary 
modules&#8230;</p>
 
 <pre>    #!/usr/local/bin/perl
     use strict;
@@ -32,18 +36,19 @@ name="Indexing:_indexer.pl"
     # (Change configuration variables as needed.)
     my $path_to_index = &#39;/path/to/index&#39;;
     my $uscon_source  = &#39;/usr/local/apache2/htdocs/us_constitution&#39;;
-
+    
     use Lucy::Simple;
     use File::Spec::Functions qw( catfile );</pre>
 
-<p>... we&#39;ll start by creating a Lucy::Simple object, telling it where 
we&#39;d like the index to be located and the language of the source 
material.</p>
+<p>&#8230; we&#8217;ll start by creating a <a href="../../../Lucy/Simple.html" 
class="podlinkpod"
+>Lucy::Simple</a> object, telling it where we&#8217;d like the index to be 
located and the language of the source material.</p>
 
 <pre>    my $lucy = Lucy::Simple-&#62;new(
         path     =&#62; $path_to_index,
         language =&#62; &#39;en&#39;,
     );</pre>
 
-<p>Next, we&#39;ll add a subroutine which parses our sample documents.</p>
+<p>Next, we&#8217;ll add a subroutine which parses our sample documents.</p>
 
 <pre>    # Parse a file from our US Constitution collection and return a 
hashref with
     # the fields title, body, and url.
@@ -63,25 +68,25 @@ name="Indexing:_indexer.pl"
         };
     }</pre>
 
-<p>Add some elementary directory reading code...</p>
+<p>Add some elementary directory reading code&#8230;</p>
 
 <pre>    # Collect names of source files.
     opendir( my $dh, $uscon_source )
         or die &#34;Couldn&#39;t opendir &#39;$uscon_source&#39;: $!&#34;;
     my @filenames = grep { $_ =~ /\.txt/ } readdir $dh;</pre>
 
-<p>... and now we&#39;re ready for the meat of indexer.pl -- which occupies 
exactly one line of code.</p>
+<p>&#8230; and now we&#8217;re ready for the meat of indexer.pl &#8211; which 
occupies exactly one line of code.</p>
 
 <pre>    foreach my $filename (@filenames) {
         my $doc = parse_file($filename);
         $lucy-&#62;add_doc($doc);  # ta-da!
     }</pre>
 
-<h2><a class='u' href='#___top' title='click to go to top of document'
+<h3><a class='u'
 name="Search:_search.cgi"
->Search: search.cgi</a></h2>
+>Search: search.cgi</a></h3>
 
-<p>As with our indexing app, the bulk of the code in our search script 
won&#39;t be Lucy-specific.</p>
+<p>As with our indexing app, the bulk of the code in our search script 
won&#8217;t be Lucy-specific.</p>
 
 <p>The beginning is dedicated to CGI processing and configuration.</p>
 
@@ -91,7 +96,7 @@ name="Search:_search.cgi"
     
     # (Change configuration variables as needed.)
     my $path_to_index = &#39;/path/to/index&#39;;
-
+    
     use CGI;
     use List::Util qw( max min );
     use POSIX qw( ceil );
@@ -103,7 +108,7 @@ name="Search:_search.cgi"
     my $offset    = decode( &#34;UTF-8&#34;, $cgi-&#62;param(&#39;offset&#39;) 
|| 0 );
     my $page_size = 10;</pre>
 
-<p>Once that&#39;s out of the way, we create our Lucy::Simple object and feed 
it a query string.</p>
+<p>Once that&#8217;s out of the way, we create our Lucy::Simple object and 
feed it a query string.</p>
 
 <pre>    my $lucy = Lucy::Simple-&#62;new(
         path     =&#62; $path_to_index,
@@ -115,10 +120,13 @@ name="Search:_search.cgi"
         num_wanted =&#62; $page_size,
     );</pre>
 
-<p>The value returned by search() is the total number of documents in the 
collection which matched the query. We&#39;ll show this hit count to the user, 
and also use it in conjunction with the parameters <code>offset</code> and 
<code>num_wanted</code> to break up results into &#34;pages&#34; of manageable 
size.</p>
+<p>The value returned by <a href="../../../Lucy/Simple.html#search" 
class="podlinkpod"
+>search()</a> is the total number of documents in the collection which matched 
the query. We&#8217;ll show this hit count to the user, and also use it in 
conjunction with the parameters <code>offset</code> and <code>num_wanted</code> 
to break up results into &#8220;pages&#8221; of manageable size.</p>
 
-<p>Calling search() on our Simple object turns it into an iterator. Invoking 
next() now returns hits one at a time as <a 
href="../../../Lucy/Document/HitDoc.html" class="podlinkpod"
->Lucy::Document::HitDoc</a> objects, starting with the most relevant.</p>
+<p>Calling <a href="../../../Lucy/Simple.html#search" class="podlinkpod"
+>search()</a> on our Simple object turns it into an iterator. Invoking <a 
href="../../../Lucy/Simple.html#next" class="podlinkpod"
+>next()</a> now returns hits one at a time as <a 
href="../../../Lucy/Document/HitDoc.html" class="podlinkpod"
+>HitDoc</a> objects, starting with the most relevant.</p>
 
 <pre>    # Create result list.
     my $report = &#39;&#39;;
@@ -162,7 +170,7 @@ name="Search:_search.cgi"
             # Calculate the nums for the first and last hit to display.
             my $last_result = min( ( $offset + $page_size ), $total_hits );
             my $first_result = min( ( $offset + 1 ), $last_result );
-
+    
             # Display the result nums, start paging info.
             $paging_info = qq|
                 &#60;p&#62;
@@ -173,25 +181,25 @@ name="Search:_search.cgi"
                 &#60;p&#62;
                     Results Page:
                 |;
-
+    
             # Calculate first and last hits pages to display / link to.
             my $current_page = int( $first_result / $page_size ) + 1;
             my $last_page    = ceil( $total_hits / $page_size );
             my $first_page   = max( 1, ( $current_page - 9 ) );
             $last_page = min( $last_page, ( $current_page + 10 ) );
-
+    
             # Create a url for use in paging links.
             my $href = $cgi-&#62;url( -relative =&#62; 1 );
             $href .= &#34;?q=&#34; . CGI::escape($query_string);
             $href .= &#34;;offset=&#34; . CGI::escape($offset);
-
+    
             # Generate the &#34;Prev&#34; link.
             if ( $current_page &#62; 1 ) {
                 my $new_offset = ( $current_page - 2 ) * $page_size;
                 $href =~ s/(?&#60;=offset=)\d+/$new_offset/;
                 $paging_info .= qq|&#60;a href=&#34;$href&#34;&#62;&#38;lt;= 
Prev&#60;/a&#62;\n|;
             }
-
+    
             # Generate paging links.
             for my $page_num ( $first_page .. $last_page ) {
                 if ( $page_num == $current_page ) {
@@ -203,21 +211,21 @@ name="Search:_search.cgi"
                     $paging_info .= qq|&#60;a 
href=&#34;$href&#34;&#62;$page_num&#60;/a&#62;\n|;
                 }
             }
-
+    
             # Generate the &#34;Next&#34; link.
             if ( $current_page != $last_page ) {
                 my $new_offset = $current_page * $page_size;
                 $href =~ s/(?&#60;=offset=)\d+/$new_offset/;
                 $paging_info .= qq|&#60;a href=&#34;$href&#34;&#62;Next 
=&#38;gt;&#60;/a&#62;\n|;
             }
-
+    
             # Close tag.
             $paging_info .= &#34;&#60;/p&#62;\n&#34;;
         }
-
+    
         return $paging_info;
     }
-
+    
     # Print content to output.
     sub blast_out_content {
         my ( $query_string, $hit_list, $paging_info ) = @_;
@@ -269,13 +277,13 @@ name="Search:_search.cgi"
     |;
     }</pre>
 
-<h2><a class='u' href='#___top' title='click to go to top of document'
-name="OK..._now_what?"
->OK... now what?</a></h2>
+<h3><a class='u'
+name="OK(8230)_now_what?"
+>OK&#8230; now what?</a></h3>
 
-<p>Lucy::Simple is perfectly adequate for some tasks, but it&#39;s not very 
flexible. Many people find that it doesn&#39;t do at least one or two things 
they can&#39;t live without.</p>
+<p>Lucy::Simple is perfectly adequate for some tasks, but it&#8217;s not very 
flexible. Many people find that it doesn&#8217;t do at least one or two things 
they can&#8217;t live without.</p>
 
-<p>In our next tutorial chapter, <a 
href="../../../Lucy/Docs/Tutorial/BeyondSimple.html" class="podlinkpod"
->BeyondSimple</a>, we&#39;ll rewrite our indexing and search scripts using the 
classes that Lucy::Simple hides from view, opening up the possibilities for 
expansion; then, we&#39;ll spend the rest of the tutorial chapters exploring 
these possibilities.</p>
+<p>In our next tutorial chapter, <a 
href="../../../Lucy/Docs/Tutorial/BeyondSimpleTutorial.html" class="podlinkpod"
+>BeyondSimpleTutorial</a>, we&#8217;ll rewrite our indexing and search scripts 
using the classes that Lucy::Simple hides from view, opening up the 
possibilities for expansion; then, we&#8217;ll spend the rest of the tutorial 
chapters exploring these possibilities.</p>
 
 </div>

Modified: lucy/site/trunk/content/docs/test/Lucy/Document/Doc.mdtext
URL: 
http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/test/Lucy/Document/Doc.mdtext?rev=1732471&r1=1732470&r2=1732471&view=diff
==============================================================================
--- lucy/site/trunk/content/docs/test/Lucy/Document/Doc.mdtext (original)
+++ lucy/site/trunk/content/docs/test/Lucy/Document/Doc.mdtext Fri Feb 26 
12:52:25 2016
@@ -3,15 +3,15 @@ Title: Lucy::Document::Doc - Apache Lucy
 <div>
 <a name='___top' class='dummyTopAnchor' ></a>
 
-<h1><a class='u' href='#___top' title='click to go to top of document'
+<h2><a class='u'
 name="NAME"
->NAME</a></h1>
+>NAME</a></h2>
 
 <p>Lucy::Document::Doc - A document.</p>
 
-<h1><a class='u' href='#___top' title='click to go to top of document'
+<h2><a class='u'
 name="SYNOPSIS"
->SYNOPSIS</a></h1>
+>SYNOPSIS</a></h2>
 
 <pre>    my $doc = Lucy::Document::Doc-&#62;new(
         fields =&#62; { foo =&#62; &#39;foo foo&#39;, bar =&#62; &#39;bar 
bar&#39; },
@@ -23,55 +23,101 @@ name="SYNOPSIS"
 <pre>    $doc-&#62;{foo} = &#39;new value for field &#34;foo&#34;&#39;;
     print &#34;foo: $doc-&#62;{foo}\n&#34;;</pre>
 
-<h1><a class='u' href='#___top' title='click to go to top of document'
+<h2><a class='u'
 name="DESCRIPTION"
->DESCRIPTION</a></h1>
+>DESCRIPTION</a></h2>
 
 <p>A Doc object is akin to a row in a database, in that it is made up of one 
or more fields, each of which has a value.</p>
 
-<h1><a class='u' href='#___top' title='click to go to top of document'
+<h2><a class='u'
 name="CONSTRUCTORS"
->CONSTRUCTORS</a></h1>
+>CONSTRUCTORS</a></h2>
 
-<h2><a class='u' href='#___top' title='click to go to top of document'
-name="new(_[labeled_params]_)"
->new( <i>[labeled params]</i> )</a></h2>
+<h3><a class='u'
+name="new"
+>new</a></h3>
 
 <pre>    my $doc = Lucy::Document::Doc-&#62;new(
         fields =&#62; { foo =&#62; &#39;foo foo&#39;, bar =&#62; &#39;bar 
bar&#39; },
     );</pre>
 
+<p>Create a new Document.</p>
+
 <ul>
 <li><b>fields</b> - Field-value pairs.</li>
 
 <li><b>doc_id</b> - Internal Lucy document id. Default of 0 (an invalid doc 
id).</li>
 </ul>
 
-<h1><a class='u' href='#___top' title='click to go to top of document'
+<h2><a class='u'
 name="METHODS"
->METHODS</a></h1>
+>METHODS</a></h2>
+
+<h3><a class='u'
+name="set_doc_id"
+>set_doc_id</a></h3>
 
-<h2><a class='u' href='#___top' title='click to go to top of document'
-name="set_doc_id(doc_id)"
->set_doc_id(doc_id)</a></h2>
+<pre>    $doc-&#62;set_doc_id($doc_id);</pre>
 
 <p>Set internal Lucy document id.</p>
 
-<h2><a class='u' href='#___top' title='click to go to top of document'
-name="get_doc_id()"
->get_doc_id()</a></h2>
+<h3><a class='u'
+name="get_doc_id"
+>get_doc_id</a></h3>
+
+<pre>    my $retval = $doc-&#62;get_doc_id();</pre>
 
 <p>Retrieve internal Lucy document id.</p>
 
-<h2><a class='u' href='#___top' title='click to go to top of document'
-name="get_fields()"
->get_fields()</a></h2>
+<h3><a class='u'
+name="store"
+>store</a></h3>
+
+<pre>    $doc-&#62;store($field, $value);</pre>
+
+<p>Store a field value in the Doc.</p>
+
+<ul>
+<li><b>field</b> - The field name.</li>
+
+<li><b>value</b> - The value.</li>
+</ul>
+
+<h3><a class='u'
+name="get_fields"
+>get_fields</a></h3>
+
+<pre>    my $retval = $doc-&#62;get_fields();</pre>
+
+<p>Return the Doc&#8217;s backing fields hash.</p>
+
+<h3><a class='u'
+name="get_size"
+>get_size</a></h3>
+
+<pre>    my $retval = $doc-&#62;get_size();</pre>
+
+<p>Return the number of fields in the Doc.</p>
+
+<h3><a class='u'
+name="extract"
+>extract</a></h3>
+
+<pre>    my $retval = $doc-&#62;extract($field);</pre>
+
+<p>Retrieve the field&#8217;s value, or NULL (<code>undef</code> in Perl) if the field is not present.</p>
+
+<h3><a class='u'
+name="field_names"
+>field_names</a></h3>
+
+<pre>    my $retval = $doc-&#62;field_names();</pre>
 
-<p>Return the Doc&#39;s backing fields hash.</p>
+<p>Return a list of names of all fields present.</p>
 
-<h1><a class='u' href='#___top' title='click to go to top of document'
+<h2><a class='u'
 name="INHERITANCE"
->INHERITANCE</a></h1>
+>INHERITANCE</a></h2>
 
 <p>Lucy::Document::Doc isa Clownfish::Obj.</p>
 

Modified: lucy/site/trunk/content/docs/test/Lucy/Document/HitDoc.mdtext
URL: 
http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/test/Lucy/Document/HitDoc.mdtext?rev=1732471&r1=1732470&r2=1732471&view=diff
==============================================================================
--- lucy/site/trunk/content/docs/test/Lucy/Document/HitDoc.mdtext (original)
+++ lucy/site/trunk/content/docs/test/Lucy/Document/HitDoc.mdtext Fri Feb 26 
12:52:25 2016
@@ -3,15 +3,15 @@ Title: Lucy::Document::HitDoc - Apache L
 <div>
 <a name='___top' class='dummyTopAnchor' ></a>
 
-<h1><a class='u' href='#___top' title='click to go to top of document'
+<h2><a class='u'
 name="NAME"
->NAME</a></h1>
+>NAME</a></h2>
 
 <p>Lucy::Document::HitDoc - A document read from an index.</p>
 
-<h1><a class='u' href='#___top' title='click to go to top of document'
+<h2><a class='u'
 name="SYNOPSIS"
->SYNOPSIS</a></h1>
+>SYNOPSIS</a></h2>
 
 <pre>    while ( my $hit_doc = $hits-&#62;next ) {
         print &#34;$hit_doc-&#62;{title}\n&#34;;
@@ -19,31 +19,35 @@ name="SYNOPSIS"
         ...
     }</pre>
 
-<h1><a class='u' href='#___top' title='click to go to top of document'
+<h2><a class='u'
 name="DESCRIPTION"
->DESCRIPTION</a></h1>
+>DESCRIPTION</a></h2>
 
-<p>HitDoc is the search-time relative of the index-time class Doc; it is 
augmented by a numeric score attribute that Doc doesn&#39;t have.</p>
+<p>HitDoc is the search-time relative of the index-time class Doc; it is 
augmented by a numeric score attribute that Doc doesn&#8217;t have.</p>
 
-<h1><a class='u' href='#___top' title='click to go to top of document'
+<h2><a class='u'
 name="METHODS"
->METHODS</a></h1>
+>METHODS</a></h2>
 
-<h2><a class='u' href='#___top' title='click to go to top of document'
-name="set_score(score)"
->set_score(score)</a></h2>
+<h3><a class='u'
+name="set_score"
+>set_score</a></h3>
+
+<pre>    $hit_doc-&#62;set_score($score);</pre>
 
 <p>Set score attribute.</p>
 
-<h2><a class='u' href='#___top' title='click to go to top of document'
-name="get_score()"
->get_score()</a></h2>
+<h3><a class='u'
+name="get_score"
+>get_score</a></h3>
+
+<pre>    my $retval = $hit_doc-&#62;get_score();</pre>
 
 <p>Get score attribute.</p>
 
-<h1><a class='u' href='#___top' title='click to go to top of document'
+<h2><a class='u'
 name="INHERITANCE"
->INHERITANCE</a></h1>
+>INHERITANCE</a></h2>
 
 <p>Lucy::Document::HitDoc isa <a href="../../Lucy/Document/Doc.html" 
class="podlinkpod"
 >Lucy::Document::Doc</a> isa Clownfish::Obj.</p>


