svn commit: r984655 [2/8] - in /websites/staging/lucy/trunk/content: ./ docs/0.4.0/ docs/0.4.0/perl/ docs/0.4.0/perl/Lucy/ docs/0.4.0/perl/Lucy/Analysis/ docs/0.4.0/perl/Lucy/Docs/ docs/0.4.0/perl/Lucy/Docs/Cookbook/ docs/0.4.0/perl/Lucy/Docs/Tutorial/...

buildbot Mon, 04 Apr 2016 02:23:48 -0700

Added: 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Docs/FileFormat.html
==============================================================================
--- 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Docs/FileFormat.html 
(added)
+++ 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Docs/FileFormat.html 
Mon Apr  4 09:23:00 2016
@@ -0,0 +1,153 @@
+
+<html>
+<head>
+<title>Lucy::Docs::FileFormat - Apache Lucy Perl Documentation</title>
+<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
+</head>
+<body>
+
+
+<h1 id="NAME">NAME</h1>
+
+<p>Lucy::Docs::FileFormat - Overview of index file format.</p>
+
+<h1 id="OVERVIEW">OVERVIEW</h1>
+
+<p>It is not necessary to understand the current implementation details of the 
index file format in order to use Apache Lucy effectively, but it may be 
helpful if you are interested in tweaking for high performance, exotic usage, 
or debugging and development.</p>
+
+<p>On a file system, an index is a directory. The files inside have a 
hierarchical relationship: an index is made up of &quot;segments&quot;, each of 
which is an independent inverted index with its own subdirectory; each segment 
is made up of several component parts.</p>
+
+<pre><code>    [index]--|
+             |--snapshot_XXX.json
+             |--schema_XXX.json
+             |--write.lock
+             |
+             |--seg_1--|
+             |         |--segmeta.json
+             |         |--cfmeta.json
+             |         |--cf.dat-------|
+             |                         |--[lexicon]
+             |                         |--[postings]
+             |                         |--[documents]
+             |                         |--[highlight]
+             |                         |--[deletions]
+             |
+             |--seg_2--|
+             |         |--segmeta.json
+             |         |--cfmeta.json
+             |         |--cf.dat-------|
+             |                         |--[lexicon]
+             |                         |--[postings]
+             |                         |--[documents]
+             |                         |--[highlight]
+             |                         |--[deletions]
+             |
+             |--[...]--| </code></pre>
+
+<h1 id="Write-once-philosophy">Write-once philosophy</h1>
+
+<p>All segment directory names consist of the string &quot;seg_&quot; followed 
by a number in base 36: seg_1, seg_5m, seg_p9s2 and so on, with higher numbers 
indicating more recent segments. Once a segment is finished and committed, its 
name is never re-used and its files are never modified.</p>
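+
+<p>As a rough illustration of the naming convention, the following sketch decodes the base-36 number in a segment name so that segments can be ordered from oldest to most recent. The helper <code>seg_num</code> is hypothetical and written for this example only; it is not part of the Lucy API.</p>
+
+<pre><code>    use strict;
+    use warnings;
+
+    # Hypothetical helper: decode the base-36 number in a segment name.
+    sub seg_num {
+        my ($name) = @_;
+        my ($digits) = $name =~ /^seg_([0-9a-z]+)$/
+            or die &quot;Not a segment name: $name&quot;;
+        my $num = 0;
+        for my $char ( split //, $digits ) {
+            $num = $num * 36
+                + index( &#39;0123456789abcdefghijklmnopqrstuvwxyz&#39;, $char );
+        }
+        return $num;
+    }
+
+    # Order segment names from oldest to most recent.
+    my @sorted = sort { seg_num($a) &lt;=&gt; seg_num($b) }
+        qw( seg_p9s2 seg_1 seg_5m );
+    print &quot;@sorted\n&quot;;    # seg_1 seg_5m seg_p9s2</code></pre>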
+
+<p>Old segments become obsolete and can be removed when their data has been 
consolidated into new segments during the process of segment merging and 
optimization. A fully-optimized index has only one segment.</p>
+
+<h1 id="Top-level-entries">Top-level entries</h1>
+
+<p>There are a handful of &quot;top-level&quot; files and directories which 
belong to the entire index rather than to a particular segment.</p>
+
+<h2 id="snapshot_XXX.json">snapshot_XXX.json</h2>
+
+<p>A &quot;snapshot&quot; file, e.g. <code>snapshot_m7p.json</code>, is a list 
of index files and directories. Because index files, once written, are never 
modified, the list of entries in a snapshot defines a point-in-time view of the 
data in an index.</p>
+
+<p>Like segment directories, snapshot files also utilize the 
unique-base-36-number naming convention; the higher the number, the more recent 
the file. The appearance of a new snapshot file within the index directory 
constitutes an index update. While a new segment is being written new files may 
be added to the index directory, but until a new snapshot file gets written, a 
Searcher opening the index for reading won&#39;t know about them.</p>
+
+<h2 id="schema_XXX.json">schema_XXX.json</h2>
+
+<p>The schema file is a Schema object describing the index&#39;s format, 
serialized as JSON. It, too, is versioned, and a given snapshot file will 
reference one and only one schema file.</p>
+
+<h2 id="locks">locks</h2>
+
+<p>By default, only one indexing process may safely modify the index at any 
given time. Processes reserve an index by laying claim to the 
<code>write.lock</code> file within the <code>locks/</code> directory. A 
smattering of other lock files may be used from time to time, as well.</p>
+
+<h1 id="A-segments-component-parts">A segment&#39;s component parts</h1>
+
+<p>By default, each segment has up to five logical components: lexicon, 
postings, document storage, highlight data, and deletions. Binary data from 
these components gets stored in virtual files within the &quot;cf.dat&quot; 
compound file; metadata is stored in a shared &quot;segmeta.json&quot; file.</p>
+
+<h2 id="segmeta.json">segmeta.json</h2>
+
+<p>The segmeta.json file is a central repository for segment metadata. In 
addition to information such as document counts and field numbers, it also 
warehouses arbitrary metadata on behalf of individual index components.</p>
+
+<h2 id="Lexicon">Lexicon</h2>
+
+<p>Each indexed field gets its own lexicon in each segment. The exact files 
involved depend on the field&#39;s type, but generally speaking there will be 
two parts. First, there&#39;s a primary <code>lexicon-XXX.dat</code> file which 
houses a complete term list associating terms with corpus frequency statistics, 
postings file locations, etc. Second, one or more &quot;lexicon index&quot; 
files may be present which contain periodic samples from the primary lexicon 
file to facilitate fast lookups.</p>
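+
+<p>The sampling idea can be sketched in a few lines of Perl. The term list, the sampling interval of 3, and the <code>lookup</code> helper are all invented for illustration and do not reflect Lucy&#39;s on-disk format; a real implementation would presumably binary-search the samples rather than walk them.</p>
+
+<pre><code>    use strict;
+    use warnings;
+
+    # Made-up, sorted term list standing in for the primary lexicon.
+    my @lexicon = sort qw(
+        apple banana cherry freedom grape liberty peace union vote
+    );
+    my $interval = 3;    # hypothetical sampling interval
+
+    # &quot;Lexicon index&quot;: every third term plus its position in @lexicon.
+    my @samples = map { [ $lexicon[$_], $_ ] }
+        grep { $_ % $interval == 0 } 0 .. $#lexicon;
+
+    sub lookup {
+        my ($term) = @_;
+
+        # Find the last sample that sorts at or before the target.
+        my $start = 0;
+        for my $sample (@samples) {
+            last if $sample-&gt;[0] gt $term;
+            $start = $sample-&gt;[1];
+        }
+
+        # Scan only the short stretch of the lexicon after that sample.
+        my $end = $start + $interval - 1;
+        $end = $#lexicon if $end &gt; $#lexicon;
+        for my $pos ( $start .. $end ) {
+            return $pos if $lexicon[$pos] eq $term;
+        }
+        return;    # term is not in the lexicon
+    }
+
+    print lookup(&#39;freedom&#39;), &quot;\n&quot;;    # 3</code></pre>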
+
+<h2 id="Postings">Postings</h2>
+
+<p>&quot;Posting&quot; is a technical term from the field of <a 
href="../../Lucy/Docs/IRTheory.html">information retrieval</a>, defined as a 
single instance of one term indexing one document. If you are looking at the 
index in the back of a book, and you see that &quot;freedom&quot; is referenced 
on pages 8, 86, and 240, that would be three postings, which taken together 
form a &quot;posting list&quot;. The same terminology applies to an index in 
electronic form.</p>
+
+<p>Each segment has one postings file per indexed field. When a search is 
performed for a single term, first that term is looked up in the lexicon. If 
the term exists in the segment, the record in the lexicon will contain 
information about which postings file to look at and where to look.</p>
+
+<p>The first thing any posting record tells you is a document id. By iterating 
over all the postings associated with a term, you can find all the documents 
that match that term, a process which is analogous to looking up page numbers 
in a book&#39;s index. However, each posting record typically contains other 
information in addition to document id, e.g. the positions at which the term 
occurs within the field.</p>
+
+<h2 id="Documents">Documents</h2>
+
+<p>The document storage section is a simple database, organized into two 
files:</p>
+
+<ul>
+
+<li><p><b>documents.dat</b> - Serialized documents.</p>
+
+</li>
+<li><p><b>documents.ix</b> - Document storage index, a solid array of 64-bit 
integers where each integer location corresponds to a document id, and the 
value at that location points at a file position in the documents.dat file.</p>
+
+</li>
+</ul>
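+
+<p>The lookup through <code>documents.ix</code> amounts to simple arithmetic on the document id. The sketch below works on an in-memory stand-in rather than a real index file. It assumes big-endian unsigned 64-bit integers and a slot at byte offset <code>doc_id * 8</code>; neither detail is specified above, and the file positions are made up.</p>
+
+<pre><code>    use strict;
+    use warnings;
+
+    # Assumptions: big-endian 64-bit pointers, slot at doc_id * 8.
+    # The &#39;Q&#39; template requires a perl built with 64-bit integers.
+    my @positions = ( 0, 120, 305, 790 );    # made-up file positions
+    my $ix = pack( &#39;Q&gt;*&#39;, @positions );      # stand-in for documents.ix
+
+    # Hypothetical helper: file position of a document in documents.dat.
+    sub doc_position {
+        my ( $ix_data, $doc_id ) = @_;
+        return unpack( &#39;Q&gt;&#39;, substr( $ix_data, $doc_id * 8, 8 ) );
+    }
+
+    print doc_position( $ix, 2 ), &quot;\n&quot;;    # 305</code></pre>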
+
+<h2 id="Highlight-data">Highlight data</h2>
+
+<p>The files which store data used for excerpting and highlighting are 
organized similarly to the files used to store documents.</p>
+
+<ul>
+
+<li><p><b>highlight.dat</b> - Chunks of serialized highlight data, one per doc 
id.</p>
+
+</li>
+<li><p><b>highlight.ix</b> - Highlight data index -- as with the 
<code>documents.ix</code> file, a solid array of 64-bit file pointers.</p>
+
+</li>
+</ul>
+
+<h2 id="Deletions">Deletions</h2>
+
+<p>When a document is &quot;deleted&quot; from a segment, it is not actually 
purged right away; it is merely marked as &quot;deleted&quot; via a deletions 
file. Deletions files contain bit vectors with one bit for each document in 
the segment; if bit #254 is set then document 254 is deleted, and if that 
document turns up in a search it will be masked out.</p>
+
+<p>It is only when a segment&#39;s contents are rewritten to a new segment 
during the segment-merging process that deleted documents truly go away.</p>
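+
+<p>The masking step can be pictured with Perl&#39;s built-in <code>vec</code> function. This is an in-memory sketch only: the <code>is_deleted</code> helper and the doc ids are invented for the example, and the bit layout of Lucy&#39;s actual deletions files is not assumed.</p>
+
+<pre><code>    use strict;
+    use warnings;
+
+    my $deletions = &#39;&#39;;               # in-memory bit vector
+    vec( $deletions, 254, 1 ) = 1;    # mark document 254 as deleted
+
+    # Hypothetical helper: true if the bit for $doc_id is set.
+    sub is_deleted {
+        my ( $bits, $doc_id ) = @_;
+        return vec( $bits, $doc_id, 1 );
+    }
+
+    my @hits    = ( 40, 254, 300 );   # made-up doc ids from a search
+    my @visible = grep { !is_deleted( $deletions, $_ ) } @hits;
+    print &quot;@visible\n&quot;;               # 40 300</code></pre>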
+
+<h1 id="Compound-Files">Compound Files</h1>
+
+<p>If you peer inside an index directory, you won&#39;t actually find any 
files named &quot;documents.dat&quot;, &quot;highlight.ix&quot;, etc. unless 
there is an indexing process underway. What you will find instead is one 
&quot;cf.dat&quot; and one &quot;cfmeta.json&quot; file per segment.</p>
+
+<p>To minimize the need for file descriptors at search-time, all per-segment 
binary data files are concatenated together in &quot;cf.dat&quot; at the close 
of each indexing session. Information about where each file begins and ends is 
stored in <code>cfmeta.json</code>. When the segment is opened for reading, a 
single file descriptor per &quot;cf.dat&quot; file can be shared among several 
readers.</p>
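+
+<p>Conceptually, reading a virtual file means slicing a byte range out of the compound file. In the sketch below, the <code>offset</code> and <code>length</code> keys, the sample bytes, and the <code>virtual_file</code> helper are assumptions made for illustration; the real layout of <code>cfmeta.json</code> is not described here.</p>
+
+<pre><code>    use strict;
+    use warnings;
+
+    # Stand-in for cf.dat: three &quot;files&quot; concatenated together.
+    my $cf_dat = &#39;AAAAABBBCCCCCCC&#39;;
+
+    # Stand-in for cfmeta.json (hypothetical keys).
+    my %cfmeta = (
+        &#39;documents.dat&#39; =&gt; { offset =&gt; 0, length =&gt; 5 },
+        &#39;documents.ix&#39;  =&gt; { offset =&gt; 5, length =&gt; 3 },
+        &#39;highlight.dat&#39; =&gt; { offset =&gt; 8, length =&gt; 7 },
+    );
+
+    # Hypothetical helper: return the bytes of one virtual file.
+    sub virtual_file {
+        my ($name) = @_;
+        my $entry = $cfmeta{$name} or die &quot;No such virtual file: $name&quot;;
+        return substr( $cf_dat, $entry-&gt;{offset}, $entry-&gt;{length} );
+    }
+
+    print virtual_file(&#39;documents.ix&#39;), &quot;\n&quot;;    # BBB</code></pre>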
+
+<h1 id="A-Typical-Search">A Typical Search</h1>
+
+<p>Here&#39;s a simplified narrative, dramatizing how a search for 
&quot;freedom&quot; against a given segment plays out:</p>
+
+<ol>
+
+<li><p>The searcher asks the relevant Lexicon Index, &quot;Do you know 
anything about &#39;freedom&#39;?&quot; Lexicon Index replies, &quot;Can&#39;t 
say for sure, but if the main Lexicon file does, &#39;freedom&#39; is probably 
somewhere around byte 21008&quot;.</p>
+
+</li>
+<li><p>The main Lexicon tells the searcher &quot;One moment, let me scan our 
records... Yes, we have 2 documents which contain &#39;freedom&#39;. You&#39;ll 
find them in seg_6/postings-4.dat starting at byte 66991.&quot;</p>
+
+</li>
+<li><p>The Postings file says &quot;Yep, we have &#39;freedom&#39;, all right! 
Document id 40 has 1 &#39;freedom&#39;, and document 44 has 8. If you need to 
know more, like if any &#39;freedom&#39; is part of the phrase &#39;freedom of 
speech&#39;, ask me about positions!&quot;</p>
+
+</li>
+<li><p>If the searcher is only looking for &#39;freedom&#39; in isolation, 
that&#39;s where it stops. It now knows enough to assign the documents scores 
against &quot;freedom&quot;, with the 8-freedom document likely ranking higher 
than the single-freedom document.</p>
+
+</li>
+</ol>
+
+</body>
+</html>
+


Added: 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Docs/FileLocking.html
==============================================================================
--- 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Docs/FileLocking.html 
(added)
+++ 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Docs/FileLocking.html 
Mon Apr  4 09:23:00 2016
@@ -0,0 +1,55 @@
+
+<html>
+<head>
+<title>Lucy::Docs::FileLocking - Apache Lucy Perl Documentation</title>
+<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
+</head>
+<body>
+
+
+<h1 id="NAME">NAME</h1>
+
+<p>Lucy::Docs::FileLocking - Manage indexes on shared volumes.</p>
+
+<h1 id="SYNOPSIS">SYNOPSIS</h1>
+
+<pre><code>    use Sys::Hostname qw( hostname );
+    my $hostname = hostname() or die &quot;Can&#39;t get unique hostname&quot;;
+    my $manager = Lucy::Index::IndexManager-&gt;new( host =&gt; $hostname );
+
+    # Index time:
+    my $indexer = Lucy::Index::Indexer-&gt;new(
+        index   =&gt; &#39;/path/to/index&#39;,
+        manager =&gt; $manager,
+    );
+
+    # Search time:
+    my $reader = Lucy::Index::IndexReader-&gt;open(
+        index   =&gt; &#39;/path/to/index&#39;,
+        manager =&gt; $manager,
+    );
+    my $searcher = Lucy::Search::IndexSearcher-&gt;new( index =&gt; $reader 
);</code></pre>
+
+<h1 id="DESCRIPTION">DESCRIPTION</h1>
+
+<p>Normally, index locking is an invisible process. Exclusive write access is 
controlled via lockfiles within the index directory and problems only arise if 
multiple processes attempt to acquire the write lock simultaneously; 
search-time processes do not ordinarily require locking at all.</p>
+
+<p>On shared volumes, however, the default locking mechanism fails, and manual 
intervention becomes necessary.</p>
+
+<p>Both read and write applications accessing an index on a shared volume need 
to identify themselves with a unique <code>host</code> id, e.g. hostname or IP 
address. Knowing the host id makes it possible to tell which lockfiles belong 
to other machines and therefore must not be removed when the lockfile&#39;s pid 
number appears not to correspond to an active process.</p>
+
+<p>At index-time, the danger is that multiple indexing processes from 
different machines which fail to specify a unique <code>host</code> id can 
delete each other&#39;s lockfiles and then attempt to modify the index at the 
same time, causing index corruption. The search-time problem is more 
complex.</p>
+
+<p>Once an index file is no longer listed in the most recent snapshot, Indexer 
attempts to delete it as part of a post-commit() cleanup routine. It is 
possible that at the moment an Indexer is deleting files which it believes are no 
longer needed, a Searcher referencing an earlier snapshot is in fact using 
them. The more often that an index is either updated or searched, the more 
likely it is that this conflict will arise from time to time.</p>
+
+<p>Ordinarily, the deletion attempts are not a problem. On a typical Unix 
volume, the files will be deleted in name only: any process which holds an open 
filehandle against a given file will continue to have access, and the file 
won&#39;t actually get vaporized until the last filehandle is cleared. Thanks 
to &quot;delete on last close semantics&quot;, an Indexer can&#39;t truly 
delete the file out from underneath an active Searcher. On Windows, where file 
deletion fails whenever any process holds an open handle, the situation is 
different but still workable: Indexer just keeps retrying after each commit 
until deletion finally succeeds.</p>
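+
+<p>The Unix behavior is easy to observe without Lucy. The following sketch, which assumes a local Unix filesystem and uses a throwaway temp file in place of an index file, unlinks a file while a filehandle is still open and then reads from that handle.</p>
+
+<pre><code>    use strict;
+    use warnings;
+    use File::Temp qw( tempfile );
+
+    # Create a throwaway file standing in for an index file.
+    my ( $write_fh, $path ) = tempfile();
+    print {$write_fh} &quot;segment data\n&quot;;
+    close $write_fh or die &quot;close failed: $!&quot;;
+
+    # A &quot;Searcher&quot; opens the file...
+    open my $read_fh, &#39;&lt;&#39;, $path or die &quot;open failed: $!&quot;;
+
+    # ...and an &quot;Indexer&quot; deletes it out from underneath.
+    unlink $path or die &quot;unlink failed: $!&quot;;
+    my $status = -e $path ? &#39;still listed&#39; : &#39;unlinked&#39;;
+    print &quot;$status\n&quot;;    # unlinked
+
+    # The open handle still has access to the data.
+    my $line = &lt;$read_fh&gt;;
+    print $line;          # segment data
+    close $read_fh;</code></pre>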
+
+<p>On NFS, however, the system breaks, because NFS allows files to be deleted 
out from underneath active processes. Should this happen, the unlucky read 
process will crash with a &quot;Stale NFS filehandle&quot; exception.</p>
+
+<p>Under normal circumstances, it is neither necessary nor desirable for 
IndexReaders to secure read locks against an index, but for NFS we have to make 
an exception. LockFactory&#39;s make_shared_lock() method exists for this 
reason; supplying an IndexManager instance to IndexReader&#39;s constructor 
activates an internal locking mechanism using make_shared_lock() which prevents 
concurrent indexing processes from deleting files that are needed by active 
readers.</p>
+
+<p>Since shared locks are implemented using lockfiles located in the index 
directory (as are exclusive locks), reader applications must have write access 
for read locking to work. Stale lock files from crashed processes are 
ordinarily cleared away the next time the same machine -- as identified by the 
<code>host</code> parameter -- opens another IndexReader. (The classic 
technique of timing out lock files is not feasible because search processes may 
lie dormant indefinitely.) However, please be aware that if the last thing a 
given machine does is crash, lock files belonging to it may persist, preventing 
deletion of obsolete index data.</p>
+
+</body>
+</html>
+

Added: 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Docs/IRTheory.html
==============================================================================
--- websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Docs/IRTheory.html 
(added)
+++ websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Docs/IRTheory.html 
Mon Apr  4 09:23:00 2016
@@ -0,0 +1,64 @@
+
+<html>
+<head>
+<title>Lucy::Docs::IRTheory - Apache Lucy Perl Documentation</title>
+<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
+</head>
+<body>
+
+
+<h1 id="NAME">NAME</h1>
+
+<p>Lucy::Docs::IRTheory - Crash course in information retrieval.</p>
+
+<h1 id="ABSTRACT">ABSTRACT</h1>
+
+<p>Just enough Information Retrieval theory to find your way around Apache 
Lucy.</p>
+
+<h1 id="Terminology">Terminology</h1>
+
+<p>Lucy uses some terminology from the field of information retrieval which 
may be unfamiliar to many users. &quot;Document&quot; and &quot;term&quot; mean 
pretty much what you&#39;d expect them to, but others such as 
&quot;posting&quot; and &quot;inverted index&quot; need a formal 
introduction:</p>
+
+<ul>
+
+<li><p><i>document</i> - An atomic unit of retrieval.</p>
+
+</li>
+<li><p><i>term</i> - An attribute which describes a document.</p>
+
+</li>
+<li><p><i>posting</i> - One term indexing one document.</p>
+
+</li>
+<li><p><i>term list</i> - The complete list of terms which describe a 
document.</p>
+
+</li>
+<li><p><i>posting list</i> - The complete list of documents which a term 
indexes.</p>
+
+</li>
+<li><p><i>inverted index</i> - A data structure which maps from terms to 
documents.</p>
+
+</li>
+</ul>
+
+<p>Since Lucy is a practical implementation of IR theory, it loads these 
abstract, distilled definitions down with useful traits. For instance, a 
&quot;posting&quot; in its most rarefied form is simply a term-document 
pairing; in Lucy, the class <a 
href="../../Lucy/Index/Posting/MatchPosting.html">Lucy::Index::Posting::MatchPosting</a>
 fills this role. However, by associating additional information with a posting 
like the number of times the term occurs in the document, we can turn it into a 
<a href="../../Lucy/Index/Posting/ScorePosting.html">ScorePosting</a>, making 
it possible to rank documents by relevance rather than just list documents 
which happen to match in no particular order.</p>
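+
+<p>To make the terminology concrete, here is a toy inverted index built from a plain Perl hash. It is a sketch for illustration only and does not use or mirror Lucy&#39;s classes; the documents and doc ids are made up. Each entry under a term plays the role of a posting, and storing the term frequency alongside the doc id is what makes ranking possible.</p>
+
+<pre><code>    use strict;
+    use warnings;
+
+    my %docs = (
+        40 =&gt; &#39;freedom of speech&#39;,
+        44 =&gt; &#39;freedom freedom and more freedom&#39;,
+        51 =&gt; &#39;the right to vote&#39;,
+    );
+
+    # Inverted index: term =&gt; { doc_id =&gt; term frequency }.
+    my %index;
+    for my $doc_id ( keys %docs ) {
+        $index{$_}{$doc_id}++ for split &#39; &#39;, $docs{$doc_id};
+    }
+
+    # Walk the posting list for &quot;freedom&quot;.
+    my $postings = $index{freedom};
+    for my $doc_id ( sort { $a &lt;=&gt; $b } keys %$postings ) {
+        print &quot;doc $doc_id: $postings-&gt;{$doc_id} occurrence(s)\n&quot;;
+    }
+    # doc 40: 1 occurrence(s)
+    # doc 44: 3 occurrence(s)</code></pre>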
+
+<h1 id="TF-IDF-ranking-algorithm">TF/IDF ranking algorithm</h1>
+
+<p>Lucy uses a variant of the well-established &quot;Term Frequency / Inverse 
Document Frequency&quot; weighting scheme. A thorough treatment of TF/IDF is 
too ambitious for our present purposes, but in a nutshell, it means that...</p>
+
+<ul>
+
+<li><p>in a search for <code>skate park</code>, documents which score well for 
the comparatively rare term <code>skate</code> will rank higher than documents 
which score well for the more common term <code>park</code>.</p>
+
+</li>
+<li><p>a 10-word text which has one occurrence each of both <code>skate</code> 
and <code>park</code> will rank higher than a 1000-word text which also 
contains one occurrence of each.</p>
+
+</li>
+</ul>
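+
+<p>Both effects show up in a back-of-the-envelope calculation. The sketch below uses a common textbook weighting (logarithmic IDF, divided by the square root of document length), which is an assumption chosen for illustration and not Lucy&#39;s actual scoring formula; the corpus size and document frequencies are made up.</p>
+
+<pre><code>    use strict;
+    use warnings;
+
+    # Textbook weights, not Lucy&#39;s actual formula.
+    sub idf {
+        my ( $num_docs, $doc_freq ) = @_;
+        return log( $num_docs / $doc_freq );
+    }
+
+    sub score {
+        my ( $term_freq, $idf, $doc_length ) = @_;
+        return $term_freq * $idf / sqrt($doc_length);
+    }
+
+    my $num_docs  = 1000;                      # made-up corpus size
+    my $skate_idf = idf( $num_docs, 10 );      # rare term
+    my $park_idf  = idf( $num_docs, 200 );     # common term
+    printf &quot;idf: skate=%.2f park=%.2f\n&quot;, $skate_idf, $park_idf;
+    # idf: skate=4.61 park=1.61
+
+    # One occurrence of each term in a 10-word and a 1000-word text.
+    my $short = score( 1, $skate_idf, 10 ) + score( 1, $park_idf, 10 );
+    my $long  = score( 1, $skate_idf, 1000 ) + score( 1, $park_idf, 1000 );
+    print $short &gt; $long ? &quot;short text wins\n&quot; : &quot;long text wins\n&quot;;
+    # short text wins</code></pre>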
+
+<p>A web search for &quot;tf idf&quot; will turn up many excellent 
explanations of the algorithm.</p>
+
+</body>
+</html>
+

Added: 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Docs/Tutorial.html
==============================================================================
--- websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Docs/Tutorial.html 
(added)
+++ websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Docs/Tutorial.html 
Mon Apr  4 09:23:00 2016
@@ -0,0 +1,64 @@
+
+<html>
+<head>
+<title>Lucy::Docs::Tutorial - Apache Lucy Perl Documentation</title>
+<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
+</head>
+<body>
+
+
+<h1 id="NAME">NAME</h1>
+
+<p>Lucy::Docs::Tutorial - Step-by-step introduction to Apache Lucy.</p>
+
+<h1 id="ABSTRACT">ABSTRACT</h1>
+
+<p>Explore Apache Lucy&#39;s basic functionality by starting with a minimalist 
CGI search app based on <a href="../../Lucy/Simple.html">Lucy::Simple</a> and 
transforming it, step by step, into an &quot;advanced search&quot; interface 
utilizing more flexible core modules like <a 
href="../../Lucy/Index/Indexer.html">Lucy::Index::Indexer</a> and <a 
href="../../Lucy/Search/IndexSearcher.html">Lucy::Search::IndexSearcher</a>.</p>
+
+<h1 id="DESCRIPTION">DESCRIPTION</h1>
+
+<h2 id="Chapters">Chapters</h2>
+
+<ul>
+
+<li><p><a 
href="../../Lucy/Docs/Tutorial/Simple.html">Lucy::Docs::Tutorial::Simple</a> - 
Build a bare-bones search app using <a 
href="../../Lucy/Simple.html">Lucy::Simple</a>.</p>
+
+</li>
+<li><p><a 
href="../../Lucy/Docs/Tutorial/BeyondSimple.html">Lucy::Docs::Tutorial::BeyondSimple</a>
 - Rebuild the app using core classes like <a 
href="../../Lucy/Index/Indexer.html">Indexer</a> and <a 
href="../../Lucy/Search/IndexSearcher.html">IndexSearcher</a> in place of 
Lucy::Simple.</p>
+
+</li>
+<li><p><a 
href="../../Lucy/Docs/Tutorial/FieldType.html">Lucy::Docs::Tutorial::FieldType</a>
 - Experiment with different field characteristics using subclasses of <a 
href="../../Lucy/Plan/FieldType.html">Lucy::Plan::FieldType</a>.</p>
+
+</li>
+<li><p><a 
href="../../Lucy/Docs/Tutorial/Analysis.html">Lucy::Docs::Tutorial::Analysis</a>
 - Examine how the choice of <a 
href="../../Lucy/Analysis/Analyzer.html">Lucy::Analysis::Analyzer</a> subclass 
affects search results.</p>
+
+</li>
+<li><p><a 
href="../../Lucy/Docs/Tutorial/Highlighter.html">Lucy::Docs::Tutorial::Highlighter</a>
 - Augment search results with highlighted excerpts.</p>
+
+</li>
+<li><p><a 
href="../../Lucy/Docs/Tutorial/QueryObjects.html">Lucy::Docs::Tutorial::QueryObjects</a>
 - Unlock advanced search features by using Query objects instead of query 
strings.</p>
+
+</li>
+</ul>
+
+<h2 id="Source-materials">Source materials</h2>
+
+<p>The source material used by the tutorial app -- a multi-text-file 
presentation of the United States constitution -- can be found in the 
<code>sample</code> directory at the root of the Lucy distribution, along with 
finished indexing and search apps.</p>
+
+<pre><code>    sample/indexer.pl        # indexing app
+    sample/search.cgi        # search app
+    sample/us_constitution   # corpus</code></pre>
+
+<h2 id="Conventions">Conventions</h2>
+
+<p>The user is expected to be familiar with OO Perl and basic CGI 
programming.</p>
+
+<p>The code in this tutorial assumes a Unix-flavored operating system and the 
Apache webserver, but will work with minor modifications on other setups.</p>
+
+<h1 id="SEE-ALSO">SEE ALSO</h1>
+
+<p>More advanced and esoteric subjects are covered in <a 
href="../../Lucy/Docs/Cookbook.html">Lucy::Docs::Cookbook</a>.</p>
+
+</body>
+</html>
+

Added: 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Docs/Tutorial/Analysis.html
==============================================================================
--- 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Docs/Tutorial/Analysis.html
 (added)
+++ 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Docs/Tutorial/Analysis.html
 Mon Apr  4 09:23:00 2016
@@ -0,0 +1,72 @@
+
+<html>
+<head>
+<title>Lucy::Docs::Tutorial::Analysis - Apache Lucy Perl Documentation</title>
+<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
+</head>
+<body>
+
+
+<h1 id="NAME">NAME</h1>
+
+<p>Lucy::Docs::Tutorial::Analysis - How to choose and use Analyzers.</p>
+
+<h1 id="DESCRIPTION">DESCRIPTION</h1>
+
+<p>Try swapping out the EasyAnalyzer in our Schema for a StandardTokenizer:</p>
+
+<pre><code>    my $tokenizer = Lucy::Analysis::StandardTokenizer-&gt;new;
+    my $type = Lucy::Plan::FullTextType-&gt;new(
+        analyzer =&gt; $tokenizer,
+    );</code></pre>
+
+<p>Search for <code>senate</code>, <code>Senate</code>, and 
<code>Senator</code> before and after making the change and re-indexing.</p>
+
+<p>Under EasyAnalyzer, the results are identical for all three searches, but 
under StandardTokenizer, searches are case-sensitive, and the result sets for 
<code>Senate</code> and <code>Senator</code> are distinct.</p>
+
+<h2 id="EasyAnalyzer">EasyAnalyzer</h2>
+
+<p>What&#39;s happening is that EasyAnalyzer is performing more aggressive 
processing than StandardTokenizer. In addition to tokenizing, it&#39;s also 
converting all text to lower case so that searches are case-insensitive, and 
using a &quot;stemming&quot; algorithm to reduce related words to a common stem 
(<code>senat</code>, in this case).</p>
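+
+<p>One way to see the difference is to print the tokens each Analyzer produces. The sketch below assumes that Analyzer&#39;s <code>split</code> method is available in your Lucy version and returns an array reference of token texts; check the <a href="../../../Lucy/Analysis/Analyzer.html">Lucy::Analysis::Analyzer</a> documentation before relying on it.</p>
+
+<pre><code>    use strict;
+    use warnings;
+    use Lucy::Analysis::EasyAnalyzer;
+    use Lucy::Analysis::StandardTokenizer;
+
+    my $easy = Lucy::Analysis::EasyAnalyzer-&gt;new( language =&gt; &#39;en&#39; );
+    my $tokenizer = Lucy::Analysis::StandardTokenizer-&gt;new;
+
+    # Assumption: split() returns an array ref of token texts.
+    for my $text ( &#39;Senate&#39;, &#39;Senator&#39; ) {
+        my $easy_tokens = $easy-&gt;split($text);
+        my $raw_tokens  = $tokenizer-&gt;split($text);
+        print &quot;$text: [@$easy_tokens] vs. [@$raw_tokens]\n&quot;;
+    }</code></pre>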
+
+<p>EasyAnalyzer is actually multiple Analyzers wrapped up in a single package. 
In this case, it&#39;s three-in-one, since specifying an EasyAnalyzer with 
<code>language =&gt; &#39;en&#39;</code> is equivalent to this snippet:</p>
+
+<pre><code>    my $tokenizer    = Lucy::Analysis::StandardTokenizer-&gt;new;
+    my $normalizer   = Lucy::Analysis::Normalizer-&gt;new;
+    my $stemmer      = Lucy::Analysis::SnowballStemmer-&gt;new( language =&gt; 
&#39;en&#39; );
+    my $polyanalyzer = Lucy::Analysis::PolyAnalyzer-&gt;new(
+        analyzers =&gt; [ $tokenizer, $normalizer, $stemmer ],
+    );</code></pre>
+
+<p>You can add or subtract Analyzers from there if you like. Try adding a 
fourth Analyzer, a SnowballStopFilter for suppressing &quot;stopwords&quot; 
like &quot;the&quot;, &quot;if&quot;, and &quot;maybe&quot;.</p>
+
+<pre><code>    my $stopfilter = Lucy::Analysis::SnowballStopFilter-&gt;new( 
+        language =&gt; &#39;en&#39;,
+    );
+    my $polyanalyzer = Lucy::Analysis::PolyAnalyzer-&gt;new(
+        analyzers =&gt; [ $tokenizer, $normalizer, $stopfilter, $stemmer ],
+    );</code></pre>
+
+<p>Also, try removing the SnowballStemmer.</p>
+
+<pre><code>    my $polyanalyzer = Lucy::Analysis::PolyAnalyzer-&gt;new(
+        analyzers =&gt; [ $tokenizer, $normalizer ],
+    );</code></pre>
+
+<p>The original choice of a stock English EasyAnalyzer probably still yields 
the best results for this document collection, but you get the idea: sometimes 
you want a different Analyzer.</p>
+
+<h2 id="When-the-best-Analyzer-is-no-Analyzer">When the best Analyzer is no 
Analyzer</h2>
+
+<p>Sometimes you don&#39;t want an Analyzer at all. That was true for our 
&quot;url&quot; field because we didn&#39;t need it to be searchable, but 
it&#39;s also true for certain types of searchable fields. For instance, 
&quot;category&quot; fields are often set up to match exactly or not at all, as 
are fields like &quot;last_name&quot; (because you may not want to conflate 
results for &quot;Humphrey&quot; and &quot;Humphries&quot;).</p>
+
+<p>To specify that there should be no analysis performed at all, use 
StringType:</p>
+
+<pre><code>    my $type = Lucy::Plan::StringType-&gt;new;
+    $schema-&gt;spec_field( name =&gt; &#39;category&#39;, type =&gt; $type 
);</code></pre>
+
+<h2 id="Highlighting-up-next">Highlighting up next</h2>
+
+<p>In our next tutorial chapter, <a 
href="../../../Lucy/Docs/Tutorial/Highlighter.html">Lucy::Docs::Tutorial::Highlighter</a>,
 we&#39;ll add highlighted excerpts from the &quot;content&quot; field to our 
search results.</p>
+
+</body>
+</html>
+

Added: 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Docs/Tutorial/BeyondSimple.html
==============================================================================
--- 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Docs/Tutorial/BeyondSimple.html
 (added)
+++ 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Docs/Tutorial/BeyondSimple.html
 Mon Apr  4 09:23:00 2016
@@ -0,0 +1,124 @@
+
+<html>
+<head>
+<title>Lucy::Docs::Tutorial::BeyondSimple - Apache Lucy Perl 
Documentation</title>
+<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
+</head>
+<body>
+
+
+<h1 id="NAME">NAME</h1>
+
+<p>Lucy::Docs::Tutorial::BeyondSimple - A more flexible app structure.</p>
+
+<h1 id="DESCRIPTION">DESCRIPTION</h1>
+
+<h2 id="Goal">Goal</h2>
+
+<p>In this tutorial chapter, we&#39;ll refactor the apps we built in <a 
href="../../../Lucy/Docs/Tutorial/Simple.html">Lucy::Docs::Tutorial::Simple</a> 
so that they look exactly the same from the end user&#39;s point of view, but 
offer the developer greater possibilities for expansion.</p>
+
+<p>To achieve this, we&#39;ll ditch Lucy::Simple and replace it with the 
classes that it uses internally:</p>
+
+<ul>
+
+<li><p><a href="../../../Lucy/Plan/Schema.html">Lucy::Plan::Schema</a> - Plan 
out your index.</p>
+
+</li>
+<li><p><a 
href="../../../Lucy/Plan/FullTextType.html">Lucy::Plan::FullTextType</a> - 
Field type for full text search.</p>
+
+</li>
+<li><p><a 
href="../../../Lucy/Analysis/EasyAnalyzer.html">Lucy::Analysis::EasyAnalyzer</a>
 - A one-size-fits-all parser/tokenizer.</p>
+
+</li>
+<li><p><a href="../../../Lucy/Index/Indexer.html">Lucy::Index::Indexer</a> - 
Manipulate index content.</p>
+
+</li>
+<li><p><a 
href="../../../Lucy/Search/IndexSearcher.html">Lucy::Search::IndexSearcher</a> 
- Search an index.</p>
+
+</li>
+<li><p><a href="../../../Lucy/Search/Hits.html">Lucy::Search::Hits</a> - 
Iterate over hits returned by a Searcher.</p>
+
+</li>
+</ul>
+
+<h2 id="Adaptations-to-indexer.pl">Adaptations to indexer.pl</h2>
+
+<p>After we load our modules...</p>
+
+<pre><code>    use Lucy::Plan::Schema;
+    use Lucy::Plan::FullTextType;
+    use Lucy::Analysis::EasyAnalyzer;
+    use Lucy::Index::Indexer;</code></pre>
+
+<p>... the first item we&#39;re going to need is a <a 
href="../../../Lucy/Plan/Schema.html">Schema</a>.</p>
+
+<p>The primary job of a Schema is to specify what fields are available and how 
they&#39;re defined. We&#39;ll start off with three fields: title, content and 
url.</p>
+
+<pre><code>    # Create Schema.
+    my $schema = Lucy::Plan::Schema-&gt;new;
+    my $easyanalyzer = Lucy::Analysis::EasyAnalyzer-&gt;new(
+        language =&gt; &#39;en&#39;,
+    );
+    my $type = Lucy::Plan::FullTextType-&gt;new(
+        analyzer =&gt; $easyanalyzer,
+    );
+    $schema-&gt;spec_field( name =&gt; &#39;title&#39;,   type =&gt; $type );
+    $schema-&gt;spec_field( name =&gt; &#39;content&#39;, type =&gt; $type );
+    $schema-&gt;spec_field( name =&gt; &#39;url&#39;,     type =&gt; $type 
);</code></pre>
+
+<p>All of the fields are spec&#39;d out using the &quot;FullTextType&quot; 
FieldType, indicating that they will be searchable as &quot;full text&quot; -- 
which means that they can be searched for individual words. The 
&quot;analyzer&quot;, which is unique to FullTextType fields, is what breaks up 
the text into searchable tokens.</p>
+
+<p>Next, we&#39;ll swap our Lucy::Simple object out for a 
Lucy::Index::Indexer. The substitution will be straightforward because Simple 
has merely been serving as a thin wrapper around an inner Indexer, and 
we&#39;ll just be peeling away the wrapper.</p>
+
+<p>First, replace the constructor:</p>
+
+<pre><code>    # Create Indexer.
+    my $indexer = Lucy::Index::Indexer-&gt;new(
+        index    =&gt; $path_to_index,
+        schema   =&gt; $schema,
+        create   =&gt; 1,
+        truncate =&gt; 1,
+    );</code></pre>
+
+<p>Next, have the <code>$indexer</code> object <code>add_doc</code> where we 
were having the <code>$lucy</code> object <code>add_doc</code> before:</p>
+
+<pre><code>    foreach my $filename (@filenames) {
+        my $doc = parse_file($filename);
+        $indexer-&gt;add_doc($doc);
+    }</code></pre>
+
+<p>There&#39;s only one extra step required: at the end of the app, you must 
call commit() explicitly to close the indexing session and commit your changes. 
(Lucy::Simple hides this detail, calling commit() implicitly when it needs 
to).</p>
+
+<pre><code>    $indexer-&gt;commit;</code></pre>
+
+<h2 id="Adaptations-to-search.cgi">Adaptations to search.cgi</h2>
+
+<p>In our search app as in our indexing app, Lucy::Simple has served as a thin 
wrapper -- this time around <a 
href="../../../Lucy/Search/IndexSearcher.html">Lucy::Search::IndexSearcher</a> 
and <a href="../../../Lucy/Search/Hits.html">Lucy::Search::Hits</a>. Swapping 
out Simple for these two classes is also straightforward:</p>
+
+<pre><code>    use Lucy::Search::IndexSearcher;
+    
+    my $searcher = Lucy::Search::IndexSearcher-&gt;new( 
+        index =&gt; $path_to_index,
+    );
+    my $hits = $searcher-&gt;hits(    # returns a Hits object, not a hit count
+        query      =&gt; $q,
+        offset     =&gt; $offset,
+        num_wanted =&gt; $page_size,
+    );
+    my $hit_count = $hits-&gt;total_hits;  # get the hit count here
+    
+    ...
+    
+    while ( my $hit = $hits-&gt;next ) {
+        ...
+    }</code></pre>
+
+<h2 id="Hooray">Hooray!</h2>
+
+<p>Congratulations! Your apps do the same thing as before... but now 
they&#39;ll be easier to customize.</p>
+
+<p>In our next chapter, <a 
href="../../../Lucy/Docs/Tutorial/FieldType.html">Lucy::Docs::Tutorial::FieldType</a>,
 we&#39;ll explore how to assign different behaviors to different fields.</p>
+
+</body>
+</html>
+

Added: 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Docs/Tutorial/FieldType.html
==============================================================================
--- 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Docs/Tutorial/FieldType.html
 (added)
+++ 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Docs/Tutorial/FieldType.html
 Mon Apr  4 09:23:00 2016
@@ -0,0 +1,57 @@
+
+<html>
+<head>
+<title>Lucy::Docs::Tutorial::FieldType - Apache Lucy Perl Documentation</title>
+<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
+</head>
+<body>
+
+
+<h1 id="NAME">NAME</h1>
+
+<p>Lucy::Docs::Tutorial::FieldType - Specify per-field properties and 
behaviors.</p>
+
+<h1 id="DESCRIPTION">DESCRIPTION</h1>
+
+<p>The Schema we used in the last chapter specifies three fields:</p>
+
+<pre><code>    my $type = Lucy::Plan::FullTextType-&gt;new(
+        analyzer =&gt; $polyanalyzer,
+    );
+    $schema-&gt;spec_field( name =&gt; &#39;title&#39;,   type =&gt; $type );
+    $schema-&gt;spec_field( name =&gt; &#39;content&#39;, type =&gt; $type );
+    $schema-&gt;spec_field( name =&gt; &#39;url&#39;,     type =&gt; $type 
);</code></pre>
+
+<p>Since they are all defined as &quot;full text&quot; fields, they are all 
searchable -- including the <code>url</code> field, a dubious choice. Some URLs 
contain meaningful information, but these don&#39;t, really:</p>
+
+<pre><code>    http://example.com/us_constitution/amend1.txt</code></pre>
+
+<p>We may as well not bother indexing the URL content. To achieve that we need 
to assign the <code>url</code> field to a different FieldType.</p>
+
+<h2 id="StringType">StringType</h2>
+
+<p>Instead of FullTextType, we&#39;ll use a <a 
href="../../../Lucy/Plan/StringType.html">StringType</a>, which doesn&#39;t use 
an Analyzer to break up text into individual tokens. Furthermore, we&#39;ll 
mark this StringType as unindexed, so that its content won&#39;t be searchable 
at all.</p>
+
+<pre><code>    my $url_type = Lucy::Plan::StringType-&gt;new( indexed =&gt; 0 
);
+    $schema-&gt;spec_field( name =&gt; &#39;url&#39;, type =&gt; $url_type 
);</code></pre>
+
+<p>To observe the change in behavior, try searching for 
<code>us_constitution</code> both before and after changing the Schema and 
re-indexing.</p>
+
+<h2 id="Toggling-stored">Toggling &#39;stored&#39;</h2>
+
+<p>For a taste of other FieldType possibilities, try turning off 
<code>stored</code> for one or more fields.</p>
+
+<pre><code>    my $content_type = Lucy::Plan::FullTextType-&gt;new(
+        analyzer =&gt; $polyanalyzer,
+        stored   =&gt; 0,
+    );</code></pre>
+
+<p>Turning off <code>stored</code> for either <code>title</code> or 
<code>url</code> mangles our results page, but since we&#39;re not displaying 
<code>content</code>, turning it off for <code>content</code> has no effect -- 
except on index size.</p>
+
+<h2 id="Analyzers-up-next">Analyzers up next</h2>
+
+<p>Analyzers play a crucial role in the behavior of FullTextType fields. In 
our next tutorial chapter, <a 
href="../../../Lucy/Docs/Tutorial/Analysis.html">Lucy::Docs::Tutorial::Analysis</a>,
 we&#39;ll see how changing up the Analyzer changes search results.</p>
+
+</body>
+</html>
+

Added: 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Docs/Tutorial/Highlighter.html
==============================================================================
--- 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Docs/Tutorial/Highlighter.html
 (added)
+++ 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Docs/Tutorial/Highlighter.html
 Mon Apr  4 09:23:00 2016
@@ -0,0 +1,63 @@
+
+<html>
+<head>
+<title>Lucy::Docs::Tutorial::Highlighter - Apache Lucy Perl 
Documentation</title>
+<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
+</head>
+<body>
+
+
+<h1 id="NAME">NAME</h1>
+
+<p>Lucy::Docs::Tutorial::Highlighter - Augment search results with highlighted 
excerpts.</p>
+
+<h1 id="DESCRIPTION">DESCRIPTION</h1>
+
+<p>Adding relevant excerpts with highlighted search terms to your search 
results display makes it much easier for end users to scan the page and assess 
which hits look promising, dramatically improving their search experience.</p>
+
+<h2 id="Adaptations-to-indexer.pl">Adaptations to indexer.pl</h2>
+
+<p><a 
href="../../../Lucy/Highlight/Highlighter.html">Lucy::Highlight::Highlighter</a>
 uses information generated at index time. To save resources, highlighting is 
disabled by default and must be turned on for individual fields.</p>
+
+<pre><code>    my $highlightable = Lucy::Plan::FullTextType-&gt;new(
+        analyzer      =&gt; $polyanalyzer,
+        highlightable =&gt; 1,
+    );
+    $schema-&gt;spec_field( name =&gt; &#39;content&#39;, type =&gt; 
$highlightable );</code></pre>
+
+<h2 id="Adaptations-to-search.cgi">Adaptations to search.cgi</h2>
+
+<p>To add highlighting and excerpting to the search.cgi sample app, create a 
<code>$highlighter</code> object outside the hits iterating loop...</p>
+
+<pre><code>    my $highlighter = Lucy::Highlight::Highlighter-&gt;new(
+        searcher =&gt; $searcher,
+        query    =&gt; $q,
+        field    =&gt; &#39;content&#39;
+    );</code></pre>
+
+<p>... then modify the loop and the per-hit display to generate and include 
the excerpt.</p>
+
+<pre><code>    # Create result list.
+    my $report = &#39;&#39;;
+    while ( my $hit = $hits-&gt;next ) {
+        my $score   = sprintf( &quot;%0.3f&quot;, $hit-&gt;get_score );
+        my $excerpt = $highlighter-&gt;create_excerpt($hit);
+        $report .= qq|
+            &lt;p&gt;
+              &lt;a 
href=&quot;$hit-&gt;{url}&quot;&gt;&lt;strong&gt;$hit-&gt;{title}&lt;/strong&gt;&lt;/a&gt;
+              &lt;em&gt;$score&lt;/em&gt;
+              &lt;br /&gt;
+              $excerpt
+              &lt;br /&gt;
+              &lt;span 
class=&quot;excerptURL&quot;&gt;$hit-&gt;{url}&lt;/span&gt;
+            &lt;/p&gt;
+        |;
+    }</code></pre>
+
+<h2 id="Next-chapter:-Query-objects">Next chapter: Query objects</h2>
+
+<p>Our next tutorial chapter, <a 
href="../../../Lucy/Docs/Tutorial/QueryObjects.html">Lucy::Docs::Tutorial::QueryObjects</a>,
 illustrates how to build an &quot;advanced search&quot; interface using <a 
href="../../../Lucy/Search/Query.html">Query</a> objects instead of query 
strings.</p>
+
+</body>
+</html>
+

Added: 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Docs/Tutorial/QueryObjects.html
==============================================================================
--- 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Docs/Tutorial/QueryObjects.html
 (added)
+++ 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Docs/Tutorial/QueryObjects.html
 Mon Apr  4 09:23:00 2016
@@ -0,0 +1,130 @@
+
+<html>
+<head>
+<title>Lucy::Docs::Tutorial::QueryObjects - Apache Lucy Perl 
Documentation</title>
+<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
+</head>
+<body>
+
+
+<h1 id="NAME">NAME</h1>
+
+<p>Lucy::Docs::Tutorial::QueryObjects - Use Query objects instead of query 
strings.</p>
+
+<h1 id="DESCRIPTION">DESCRIPTION</h1>
+
+<p>Until now, our search app has had only a single search box. In this 
tutorial chapter, we&#39;ll move towards an &quot;advanced search&quot; 
interface, by adding a &quot;category&quot; drop-down menu. Three new classes 
will be required:</p>
+
+<ul>
+
+<li><p><a href="../../../Lucy/Search/QueryParser.html">QueryParser</a> - Turn 
a query string into a <a href="../../../Lucy/Search/Query.html">Query</a> 
object.</p>
+
+</li>
+<li><p><a href="../../../Lucy/Search/TermQuery.html">TermQuery</a> - Query for 
a specific term within a specific field.</p>
+
+</li>
+<li><p><a href="../../../Lucy/Search/ANDQuery.html">ANDQuery</a> - 
&quot;AND&quot; together multiple Query objects to produce an intersected 
result set.</p>
+
+</li>
+</ul>
+
+<h2 id="Adaptations-to-indexer.pl">Adaptations to indexer.pl</h2>
+
+<p>Our new &quot;category&quot; field will be a StringType field rather than a 
FullTextType field, because we will only be looking for exact matches. It needs 
to be indexed, but since we won&#39;t display its value, it doesn&#39;t need to 
be stored.</p>
+
+<pre><code>    my $cat_type = Lucy::Plan::StringType-&gt;new( stored =&gt; 0 );
+    $schema-&gt;spec_field( name =&gt; &#39;category&#39;, type =&gt; 
$cat_type );</code></pre>
+
+<p>There will be three possible values: &quot;article&quot;, 
&quot;amendment&quot;, and &quot;preamble&quot;, which we&#39;ll hack out of 
the source file&#39;s name during our <code>parse_file</code> subroutine:</p>
+
+<pre><code>    my $category
+        = $filename =~ /art/      ? &#39;article&#39;
+        : $filename =~ /amend/    ? &#39;amendment&#39;
+        : $filename =~ /preamble/ ? &#39;preamble&#39;
+        :                           die &quot;Can&#39;t derive category for 
$filename&quot;;
+    return {
+        title    =&gt; $title,
+        content  =&gt; $bodytext,
+        url      =&gt; &quot;/us_constitution/$filename&quot;,
+        category =&gt; $category,
+    };</code></pre>
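+
+<p>As a rough, standalone sketch of the category logic above: the helper name 
<code>derive_category</code> and the file name <code>art1.txt</code> are 
assumptions made for illustration and are not part of the sample app.</p>
+
```perl
use strict;
use warnings;

# Hypothetical helper: map a sample file name to its category, testing the
# patterns in the same order as the parse_file snippet above.
sub derive_category {
    my $filename = shift;
    return
          $filename =~ /art/      ? 'article'
        : $filename =~ /amend/    ? 'amendment'
        : $filename =~ /preamble/ ? 'preamble'
        :   die "Can't derive category for $filename";
}

# Assumed file names, modelled on amend1.txt from the earlier chapter.
for my $name (qw( art1.txt amend10.txt preamble.txt )) {
    printf "%-14s => %s\n", $name, derive_category($name);
}
```
+
+<p>The patterns are unanchored and tried in order, so a file name containing 
both <code>art</code> and <code>amend</code> would be classified as an 
article.</p>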
+
+<h2 id="Adaptations-to-search.cgi">Adaptations to search.cgi</h2>
+
+<p>The &quot;category&quot; constraint will be added to our search interface 
using an HTML &quot;select&quot; element (this routine will need to be 
integrated into the HTML generation section of search.cgi):</p>
+
+<pre><code>    # Build up the HTML &quot;select&quot; object for the 
&quot;category&quot; field.
+    sub generate_category_select {
+        my $cat = shift;
+        my $select = qq|
+          &lt;select name=&quot;category&quot;&gt;
+            &lt;option value=&quot;&quot;&gt;All Sections&lt;/option&gt;
+            &lt;option value=&quot;article&quot;&gt;Articles&lt;/option&gt;
+            &lt;option value=&quot;amendment&quot;&gt;Amendments&lt;/option&gt;
+          &lt;/select&gt;|;
+        if ($cat) {
+            $select =~ s/&quot;$cat&quot;/&quot;$cat&quot; selected/;
+        }
+        return $select;
+    }</code></pre>
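+
+<p>Because <code>$cat</code> comes from a CGI parameter, one possible 
hardening is to accept only known values and to quote the value before 
interpolating it into the pattern. The following is a minimal sketch under 
those assumptions; the <code>%known_category</code> hash is hypothetical and 
not part of the sample app.</p>
+
```perl
use strict;
use warnings;

# Hypothetical whitelist of the values offered by the select element.
my %known_category = map { $_ => 1 } qw( article amendment );

# Variant of generate_category_select that ignores unknown values and
# uses \Q...\E so the value is matched literally, not as a regex.
sub generate_category_select {
    my $cat    = shift;
    my $select = qq|
      <select name="category">
        <option value="">All Sections</option>
        <option value="article">Articles</option>
        <option value="amendment">Amendments</option>
      </select>|;
    if ( defined $cat && $known_category{$cat} ) {
        $select =~ s/"\Q$cat\E"/"$cat" selected/;
    }
    return $select;
}

print generate_category_select('amendment'), "\n";
```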
+
+<p>We&#39;ll start off by loading our new modules and extracting our new CGI 
parameter.</p>
+
+<pre><code>    use Lucy::Search::QueryParser;
+    use Lucy::Search::TermQuery;
+    use Lucy::Search::ANDQuery;
+    
+    ... 
+    
+    my $category = decode( &quot;UTF-8&quot;, 
$cgi-&gt;param(&#39;category&#39;) || &#39;&#39; );</code></pre>
+
+<p>QueryParser&#39;s constructor requires a &quot;schema&quot; argument. We 
can get that from our IndexSearcher:</p>
+
+<pre><code>    # Create an IndexSearcher and a QueryParser.
+    my $searcher = Lucy::Search::IndexSearcher-&gt;new( 
+        index =&gt; $path_to_index, 
+    );
+    my $qparser  = Lucy::Search::QueryParser-&gt;new( 
+        schema =&gt; $searcher-&gt;get_schema,
+    );</code></pre>
+
+<p>Previously, we have been handing raw query strings to IndexSearcher. Behind 
the scenes, IndexSearcher has been using a QueryParser to turn those query 
strings into Query objects. Now, we will bring QueryParser into the foreground 
and parse the strings explicitly.</p>
+
+<pre><code>    my $query = $qparser-&gt;parse($q);</code></pre>
+
+<p>If the user has specified a category, we&#39;ll use an ANDQuery to join our 
parsed query together with a TermQuery representing the category.</p>
+
+<pre><code>    if ($category) {
+        my $category_query = Lucy::Search::TermQuery-&gt;new(
+            field =&gt; &#39;category&#39;, 
+            term  =&gt; $category,
+        );
+        $query = Lucy::Search::ANDQuery-&gt;new(
+            children =&gt; [ $query, $category_query ]
+        );
+    }</code></pre>
+
+<p>Now when we execute the query...</p>
+
+<pre><code>    # Execute the Query and get a Hits object.
+    my $hits = $searcher-&gt;hits(
+        query      =&gt; $query,
+        offset     =&gt; $offset,
+        num_wanted =&gt; $page_size,
+    );</code></pre>
+
+<p>... we&#39;ll get a result set which is the intersection of the parsed 
query and the category query.</p>
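+
+<p>To see the intersection in isolation, here is a small self-contained 
sketch. It assumes Apache Lucy is installed, and it uses an in-memory 
Lucy::Store::RAMFolder and two made-up documents rather than the tutorial&#39;s 
on-disk index.</p>
+
```perl
use strict;
use warnings;

use Lucy::Plan::Schema;
use Lucy::Plan::FullTextType;
use Lucy::Plan::StringType;
use Lucy::Analysis::PolyAnalyzer;
use Lucy::Store::RAMFolder;
use Lucy::Index::Indexer;
use Lucy::Search::IndexSearcher;
use Lucy::Search::QueryParser;
use Lucy::Search::TermQuery;
use Lucy::Search::ANDQuery;

# Schema: a full-text 'content' field and an exact-match 'category' field.
my $schema   = Lucy::Plan::Schema->new;
my $analyzer = Lucy::Analysis::PolyAnalyzer->new( language => 'en' );
$schema->spec_field(
    name => 'content',
    type => Lucy::Plan::FullTextType->new( analyzer => $analyzer ),
);
$schema->spec_field(
    name => 'category',
    type => Lucy::Plan::StringType->new( stored => 0 ),
);

# Index two illustrative documents into an in-memory folder (assumption:
# this stands in for the index built by indexer.pl).
my $folder  = Lucy::Store::RAMFolder->new;
my $indexer = Lucy::Index::Indexer->new( index => $folder, schema => $schema );
$indexer->add_doc(
    { content => 'Congress shall make no law', category => 'amendment' } );
$indexer->add_doc(
    { content => 'The Congress shall have Power', category => 'article' } );
$indexer->commit;

# Parse the query string, then narrow it with a category TermQuery.
my $searcher = Lucy::Search::IndexSearcher->new( index => $folder );
my $qparser
    = Lucy::Search::QueryParser->new( schema => $searcher->get_schema );
my $parsed         = $qparser->parse('congress');
my $category_query = Lucy::Search::TermQuery->new(
    field => 'category',
    term  => 'amendment',
);
my $narrowed_query = Lucy::Search::ANDQuery->new(
    children => [ $parsed, $category_query ],
);

my $all      = $searcher->hits( query => $parsed )->total_hits;
my $narrowed = $searcher->hits( query => $narrowed_query )->total_hits;
print "parsed query alone: $all hit(s); with category: $narrowed hit(s)\n";
```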
+
+<h1 id="Congratulations">Congratulations!</h1>
+
+<p>You&#39;ve made it to the end of the tutorial.</p>
+
+<h1 id="SEE-ALSO">SEE ALSO</h1>
+
+<p>For additional thematic documentation, see the Apache Lucy <a 
href="../../../Lucy/Docs/Cookbook.html">Cookbook</a>.</p>
+
+<p>ANDQuery has a companion class, <a 
href="../../../Lucy/Search/ORQuery.html">ORQuery</a>, and a close relative, <a 
href="../../../Lucy/Search/RequiredOptionalQuery.html">RequiredOptionalQuery</a>.</p>
+
+</body>
+</html>
+

Added: 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Docs/Tutorial/Simple.html
==============================================================================
--- 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Docs/Tutorial/Simple.html
 (added)
+++ 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Docs/Tutorial/Simple.html
 Mon Apr  4 09:23:00 2016
@@ -0,0 +1,275 @@
+
+<html>
+<head>
+<title>Lucy::Docs::Tutorial::Simple - Apache Lucy Perl Documentation</title>
+<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
+</head>
+<body>
+
+
+<h1 id="NAME">NAME</h1>
+
+<p>Lucy::Docs::Tutorial::Simple - Bare-bones search app.</p>
+
+<h2 id="Setup">Setup</h2>
+
+<p>Copy the text presentation of the US Constitution from the 
<code>sample</code> directory of the Apache Lucy distribution to the base level 
of your web server&#39;s <code>htdocs</code> directory.</p>
+
+<pre><code>    $ cp -R sample/us_constitution 
/usr/local/apache2/htdocs/</code></pre>
+
+<h2 id="Indexing:-indexer.pl">Indexing: indexer.pl</h2>
+
+<p>Our first task will be to create an application called 
<code>indexer.pl</code> which builds a searchable &quot;inverted index&quot; 
from a collection of documents.</p>
+
+<p>After we specify some configuration variables and load all necessary 
modules...</p>
+
+<pre><code>    #!/usr/local/bin/perl
+    use strict;
+    use warnings;
+    
+    # (Change configuration variables as needed.)
+    my $path_to_index = &#39;/path/to/index&#39;;
+    my $uscon_source  = &#39;/usr/local/apache2/htdocs/us_constitution&#39;;
+
+    use Lucy::Simple;
+    use File::Spec::Functions qw( catfile );</code></pre>
+
+<p>... we&#39;ll start by creating a Lucy::Simple object, telling it where 
we&#39;d like the index to be located and the language of the source 
material.</p>
+
+<pre><code>    my $lucy = Lucy::Simple-&gt;new(
+        path     =&gt; $path_to_index,
+        language =&gt; &#39;en&#39;,
+    );</code></pre>
+
+<p>Next, we&#39;ll add a subroutine which parses our sample documents.</p>
+
+<pre><code>    # Parse a file from our US Constitution collection and return a 
hashref with
+    # the fields title, content, and url.
+    sub parse_file {
+        my $filename = shift;
+        my $filepath = catfile( $uscon_source, $filename );
+        open( my $fh, &#39;&lt;&#39;, $filepath ) or die &quot;Can&#39;t open 
&#39;$filepath&#39;: $!&quot;;
+        my $text = do { local $/; &lt;$fh&gt; };    # slurp file content
+        $text =~ /\A(.+?)^\s+(.*)/ms
+            or die &quot;Can&#39;t extract title/bodytext from 
&#39;$filepath&#39;&quot;;
+        my $title    = $1;
+        my $bodytext = $2;
+        return {
+            title    =&gt; $title,
+            content  =&gt; $bodytext,
+            url      =&gt; &quot;/us_constitution/$filename&quot;,
+        };
+    }</code></pre>
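+
+<p>The title/body pattern can be tried on a string without touching the file 
system. In this sketch the helper name <code>split_title_body</code> and the 
sample text are assumptions; it presumes each source file starts with a title 
line followed by a blank line. Trimming the title is an addition for display 
purposes, not something parse_file does.</p>
+
```perl
use strict;
use warnings;

# Hypothetical helper applying the same pattern as parse_file to a string.
sub split_title_body {
    my $text = shift;
    $text =~ /\A(.+?)^\s+(.*)/ms
        or die "Can't extract title/bodytext";
    my ( $title, $bodytext ) = ( $1, $2 );

    # The lazy first capture stops at the start of the next line, so it
    # still carries the title's trailing newline; strip it here.
    $title =~ s/\s+\z//;
    return ( $title, $bodytext );
}

# Made-up sample text in the assumed layout: title, blank line, body.
my $sample = "Article I\n\nSection 1.\nAll legislative Powers...\n";
my ( $title, $body ) = split_title_body($sample);
print "title: [$title]\n";
print "body starts with: [", substr( $body, 0, 10 ), "]\n";
```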
+
+<p>Add some elementary directory reading code...</p>
+
+<pre><code>    # Collect names of source files.
+    opendir( my $dh, $uscon_source )
+        or die &quot;Couldn&#39;t opendir &#39;$uscon_source&#39;: $!&quot;;
+    my @filenames = grep { $_ =~ /\.txt\z/ } readdir $dh;</code></pre>
+
+<p>... and now we&#39;re ready for the meat of indexer.pl -- which occupies 
exactly one line of code.</p>
+
+<pre><code>    foreach my $filename (@filenames) {
+        my $doc = parse_file($filename);
+        $lucy-&gt;add_doc($doc);  # ta-da!
+    }</code></pre>
+
+<h2 id="Search:-search.cgi">Search: search.cgi</h2>
+
+<p>As with our indexing app, the bulk of the code in our search script 
won&#39;t be Lucy-specific.</p>
+
+<p>The beginning is dedicated to CGI processing and configuration.</p>
+
+<pre><code>    #!/usr/local/bin/perl -T
+    use strict;
+    use warnings;
+    
+    # (Change configuration variables as needed.)
+    my $path_to_index = &#39;/path/to/index&#39;;
+
+    use CGI;
+    use List::Util qw( max min );
+    use POSIX qw( ceil );
+    use Encode qw( decode );
+    use Lucy::Simple;
+    
+    my $cgi       = CGI-&gt;new;
+    my $q         = decode( &quot;UTF-8&quot;, $cgi-&gt;param(&#39;q&#39;) || 
&#39;&#39; );
+    my $offset    = decode( &quot;UTF-8&quot;, 
$cgi-&gt;param(&#39;offset&#39;) || 0 );
+    my $page_size = 10;</code></pre>
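+
+<p>For readers who want stricter handling of the incoming parameters, the 
following sketch decodes the query bytes and accepts the offset only if it is a 
plain non-negative integer. The helper name <code>normalize_params</code> is 
hypothetical, and the validation goes beyond what the sample script does.</p>
+
```perl
use strict;
use warnings;
use Encode qw( decode );

# Hypothetical helper: turn raw parameter values into a character string
# and a non-negative integer offset (falling back to 0).
sub normalize_params {
    my ( $raw_q, $raw_offset ) = @_;
    my $q = decode( 'UTF-8', defined $raw_q ? $raw_q : '' );
    my $offset
        = ( defined $raw_offset && $raw_offset =~ /\A(\d+)\z/ ) ? $1 : 0;
    return ( $q, $offset );
}

# "caf\xC3\xA9" is five UTF-8 bytes that decode to four characters.
my ( $q, $offset ) = normalize_params( "caf\xC3\xA9", '20' );
printf "query has %d characters, offset is %d\n", length($q), $offset;
```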
+
+<p>Once that&#39;s out of the way, we create our Lucy::Simple object and feed 
it a query string.</p>
+
+<pre><code>    my $lucy = Lucy::Simple-&gt;new(
+        path     =&gt; $path_to_index,
+        language =&gt; &#39;en&#39;,
+    );
+    my $hit_count = $lucy-&gt;search(
+        query      =&gt; $q,
+        offset     =&gt; $offset,
+        num_wanted =&gt; $page_size,
+    );</code></pre>
+
+<p>The value returned by search() is the total number of documents in the 
collection which matched the query. We&#39;ll show this hit count to the user, 
and also use it in conjunction with the parameters <code>offset</code> and 
<code>num_wanted</code> to break up results into &quot;pages&quot; of 
manageable size.</p>
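+
+<p>The paging arithmetic can be sketched on its own. The helper name 
<code>page_window</code> is hypothetical; the formulas follow the ones used in 
the HTML generation code at the end of this chapter, except that the current 
page is derived directly from the offset.</p>
+
```perl
use strict;
use warnings;
use List::Util qw( min );
use POSIX qw( ceil );

# Hypothetical helper: compute which results and pages to display, given
# the total hit count, the current offset, and the page size.
sub page_window {
    my %args = @_;
    my ( $total, $offset, $size ) = @args{qw( total_hits offset page_size )};
    my $last_result  = min( $offset + $size, $total );
    my $first_result = min( $offset + 1,     $last_result );
    return {
        first_result => $first_result,
        last_result  => $last_result,
        current_page => int( $offset / $size ) + 1,
        last_page    => ceil( $total / $size ),
    };
}

my $w = page_window( total_hits => 25, offset => 20, page_size => 10 );
printf "Results %d-%d of 25 (page %d of %d)\n",
    @{$w}{qw( first_result last_result current_page last_page )};
```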
+
+<p>Calling search() on our Simple object turns it into an iterator. Invoking 
next() now returns hits one at a time as <a 
href="../../../Lucy/Document/HitDoc.html">Lucy::Document::HitDoc</a> objects, 
starting with the most relevant.</p>
+
+<pre><code>    # Create result list.
+    my $report = &#39;&#39;;
+    while ( my $hit = $lucy-&gt;next ) {
+        my $score = sprintf( &quot;%0.3f&quot;, $hit-&gt;get_score );
+        $report .= qq|
+            &lt;p&gt;
+              &lt;a 
href=&quot;$hit-&gt;{url}&quot;&gt;&lt;strong&gt;$hit-&gt;{title}&lt;/strong&gt;&lt;/a&gt;
+              &lt;em&gt;$score&lt;/em&gt;
+              &lt;br&gt;
+              &lt;span 
class=&quot;excerptURL&quot;&gt;$hit-&gt;{url}&lt;/span&gt;
+            &lt;/p&gt;
+            |;
+    }</code></pre>
+
+<p>The rest of the script is just text wrangling.</p>
+
+<pre><code>    
#---------------------------------------------------------------#
+    # No tutorial material below this point - just html generation. #
+    #---------------------------------------------------------------#
+    
+    # Generate paging links and hit count, print and exit.
+    my $paging_links = generate_paging_info( $q, $hit_count );
+    blast_out_content( $q, $report, $paging_links );
+    
+    # Create html fragment with links for paging through results n-at-a-time.
+    sub generate_paging_info {
+        my ( $query_string, $total_hits ) = @_;
+        my $escaped_q = CGI::escapeHTML($query_string);
+        my $paging_info;
+        if ( !length $query_string ) {
+            # No query?  No display.
+            $paging_info = &#39;&#39;;
+        }
+        elsif ( $total_hits == 0 ) {
+            # Alert the user that their search failed.
+            $paging_info
+                = qq|&lt;p&gt;No matches for 
&lt;strong&gt;$escaped_q&lt;/strong&gt;&lt;/p&gt;|;
+        }
+        else {
+            # Calculate the nums for the first and last hit to display.
+            my $last_result = min( ( $offset + $page_size ), $total_hits );
+            my $first_result = min( ( $offset + 1 ), $last_result );
+
+            # Display the result nums, start paging info.
+            $paging_info = qq|
+                &lt;p&gt;
+                    Results 
&lt;strong&gt;$first_result-$last_result&lt;/strong&gt; 
+                    of &lt;strong&gt;$total_hits&lt;/strong&gt; 
+                    for &lt;strong&gt;$escaped_q&lt;/strong&gt;.
+                &lt;/p&gt;
+                &lt;p&gt;
+                    Results Page:
+                |;
+
+            # Calculate first and last hits pages to display / link to.
+            my $current_page = int( $first_result / $page_size ) + 1;
+            my $last_page    = ceil( $total_hits / $page_size );
+            my $first_page   = max( 1, ( $current_page - 9 ) );
+            $last_page = min( $last_page, ( $current_page + 10 ) );
+
+            # Create a url for use in paging links.
+            my $href = $cgi-&gt;url( -relative =&gt; 1 );
+            $href .= &quot;?q=&quot; . CGI::escape($query_string);
+            $href .= &quot;;offset=&quot; . CGI::escape($offset);
+
+            # Generate the &quot;Prev&quot; link.
+            if ( $current_page &gt; 1 ) {
+                my $new_offset = ( $current_page - 2 ) * $page_size;
+                $href =~ s/(?&lt;=offset=)\d+/$new_offset/;
+                $paging_info .= qq|&lt;a href=&quot;$href&quot;&gt;&amp;lt;= 
Prev&lt;/a&gt;\n|;
+            }
+
+            # Generate paging links.
+            for my $page_num ( $first_page .. $last_page ) {
+                if ( $page_num == $current_page ) {
+                    $paging_info .= qq|$page_num \n|;
+                }
+                else {
+                    my $new_offset = ( $page_num - 1 ) * $page_size;
+                    $href =~ s/(?&lt;=offset=)\d+/$new_offset/;
+                    $paging_info .= qq|&lt;a 
href=&quot;$href&quot;&gt;$page_num&lt;/a&gt;\n|;
+                }
+            }
+
+            # Generate the &quot;Next&quot; link.
+            if ( $current_page != $last_page ) {
+                my $new_offset = $current_page * $page_size;
+                $href =~ s/(?&lt;=offset=)\d+/$new_offset/;
+                $paging_info .= qq|&lt;a href=&quot;$href&quot;&gt;Next 
=&amp;gt;&lt;/a&gt;\n|;
+            }
+
+            # Close tag.
+            $paging_info .= &quot;&lt;/p&gt;\n&quot;;
+        }
+
+        return $paging_info;
+    }
+
+    # Print content to output.
+    sub blast_out_content {
+        my ( $query_string, $hit_list, $paging_info ) = @_;
+        my $escaped_q = CGI::escapeHTML($query_string);
+        binmode( STDOUT, &quot;:encoding(UTF-8)&quot; );
+        print qq|Content-type: text/html; charset=UTF-8\n\n|;
+        print qq|
+    &lt;!DOCTYPE html PUBLIC &quot;-//W3C//DTD HTML 4.01 Transitional//EN&quot;
+        &quot;http://www.w3.org/TR/html4/loose.dtd&quot;&gt;
+    &lt;html&gt;
+    &lt;head&gt;
+      &lt;meta http-equiv=&quot;Content-type&quot; 
+        content=&quot;text/html;charset=UTF-8&quot;&gt;
+      &lt;link rel=&quot;stylesheet&quot; type=&quot;text/css&quot; 
+        href=&quot;/us_constitution/uscon.css&quot;&gt;
+      &lt;title&gt;Lucy: $escaped_q&lt;/title&gt;
+    &lt;/head&gt;
+    
+    &lt;body&gt;
+    
+      &lt;div id=&quot;navigation&quot;&gt;
+        &lt;form id=&quot;usconSearch&quot; action=&quot;&quot;&gt;
+          &lt;strong&gt;
+            Search the 
+            &lt;a href=&quot;/us_constitution/index.html&quot;&gt;US 
Constitution&lt;/a&gt;:
+          &lt;/strong&gt;
+          &lt;input type=&quot;text&quot; name=&quot;q&quot; id=&quot;q&quot; 
value=&quot;$escaped_q&quot;&gt;
+          &lt;input type=&quot;submit&quot; value=&quot;=&amp;gt;&quot;&gt;
+        &lt;/form&gt;
+      &lt;/div&gt;&lt;!--navigation--&gt;
+    
+      &lt;div id=&quot;bodytext&quot;&gt;
+    
+      $hit_list
+    
+      $paging_info
+    
+        &lt;p style=&quot;font-size: smaller; color: #666&quot;&gt;
+          &lt;em&gt;
+            Powered by &lt;a href=&quot;http://lucy.apache.org/&quot;
+            &gt;Apache 
Lucy&lt;small&gt;&lt;sup&gt;TM&lt;/sup&gt;&lt;/small&gt;&lt;/a&gt;
+          &lt;/em&gt;
+        &lt;/p&gt;
+      &lt;/div&gt;&lt;!--bodytext--&gt;
+    
+    &lt;/body&gt;
+    
+    &lt;/html&gt;
+    |;
+    }</code></pre>
+
+<h2 id="OK...-now-what">OK... now what?</h2>
+
+<p>Lucy::Simple is perfectly adequate for some tasks, but it&#39;s not very 
flexible. Many people find that it doesn&#39;t do at least one or two things 
they can&#39;t live without.</p>
+
+<p>In our next tutorial chapter, <a 
href="../../../Lucy/Docs/Tutorial/BeyondSimple.html">BeyondSimple</a>, 
we&#39;ll rewrite our indexing and search scripts using the classes that 
Lucy::Simple hides from view, opening up the possibilities for expansion; then, 
we&#39;ll spend the rest of the tutorial chapters exploring these 
possibilities.</p>
+
+</body>
+</html>
+

Added: 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Document/Doc.html
==============================================================================
--- websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Document/Doc.html 
(added)
+++ websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Document/Doc.html 
Mon Apr  4 09:23:00 2016
@@ -0,0 +1,68 @@
+
+<html>
+<head>
+<title>Lucy::Document::Doc - Apache Lucy Perl Documentation</title>
+<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
+</head>
+<body>
+
+
+<h1 id="NAME">NAME</h1>
+
+<p>Lucy::Document::Doc - A document.</p>
+
+<h1 id="SYNOPSIS">SYNOPSIS</h1>
+
+<pre><code>    my $doc = Lucy::Document::Doc-&gt;new(
+        fields =&gt; { foo =&gt; &#39;foo foo&#39;, bar =&gt; &#39;bar 
bar&#39; },
+    );
+    $indexer-&gt;add_doc($doc);</code></pre>
+
+<p>Doc objects allow access to field values via hashref overloading:</p>
+
+<pre><code>    $doc-&gt;{foo} = &#39;new value for field &quot;foo&quot;&#39;;
+    print &quot;foo: $doc-&gt;{foo}\n&quot;;</code></pre>
+
+<h1 id="DESCRIPTION">DESCRIPTION</h1>
+
+<p>A Doc object is akin to a row in a database, in that it is made up of one 
or more fields, each of which has a value.</p>
+
+<h1 id="CONSTRUCTORS">CONSTRUCTORS</h1>
+
+<h2 id="new-labeled-params">new( <i>[labeled params]</i> )</h2>
+
+<pre><code>    my $doc = Lucy::Document::Doc-&gt;new(
+        fields =&gt; { foo =&gt; &#39;foo foo&#39;, bar =&gt; &#39;bar 
bar&#39; },
+    );</code></pre>
+
+<ul>
+
+<li><p><b>fields</b> - Field-value pairs.</p>
+
+</li>
+<li><p><b>doc_id</b> - Internal Lucy document id. Default of 0 (an invalid doc 
id).</p>
+
+</li>
+</ul>
+
+<h1 id="METHODS">METHODS</h1>
+
+<h2 id="set_doc_id-doc_id">set_doc_id(doc_id)</h2>
+
+<p>Set internal Lucy document id.</p>
+
+<h2 id="get_doc_id">get_doc_id()</h2>
+
+<p>Retrieve internal Lucy document id.</p>
+
+<h2 id="get_fields">get_fields()</h2>
+
+<p>Return the Doc&#39;s backing fields hash.</p>
+
+<h1 id="INHERITANCE">INHERITANCE</h1>
+
+<p>Lucy::Document::Doc isa Clownfish::Obj.</p>
+
+</body>
+</html>
+

Added: 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Document/HitDoc.html
==============================================================================
--- 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Document/HitDoc.html 
(added)
+++ 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Document/HitDoc.html 
Mon Apr  4 09:23:00 2016
@@ -0,0 +1,42 @@
+
+<html>
+<head>
+<title>Lucy::Document::HitDoc - Apache Lucy Perl Documentation</title>
+<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
+</head>
+<body>
+
+
+<h1 id="NAME">NAME</h1>
+
+<p>Lucy::Document::HitDoc - A document read from an index.</p>
+
+<h1 id="SYNOPSIS">SYNOPSIS</h1>
+
+<pre><code>    while ( my $hit_doc = $hits-&gt;next ) {
+        print &quot;$hit_doc-&gt;{title}\n&quot;;
+        print $hit_doc-&gt;get_score . &quot;\n&quot;;
+        ...
+    }</code></pre>
+
+<h1 id="DESCRIPTION">DESCRIPTION</h1>
+
+<p>HitDoc is the search-time relative of the index-time class Doc; it is 
augmented by a numeric score attribute that Doc doesn&#39;t have.</p>
+
+<h1 id="METHODS">METHODS</h1>
+
+<h2 id="set_score-score">set_score(score)</h2>
+
+<p>Set score attribute.</p>
+
+<h2 id="get_score">get_score()</h2>
+
+<p>Get score attribute.</p>
+
+<h1 id="INHERITANCE">INHERITANCE</h1>
+
+<p>Lucy::Document::HitDoc isa <a 
href="../../Lucy/Document/Doc.html">Lucy::Document::Doc</a> isa 
Clownfish::Obj.</p>
+
+</body>
+</html>
+

Added: 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Highlight/Highlighter.html
==============================================================================
--- 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Highlight/Highlighter.html
 (added)
+++ 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Highlight/Highlighter.html
 Mon Apr  4 09:23:00 2016
@@ -0,0 +1,114 @@
+
+<html>
+<head>
+<title>Lucy::Highlight::Highlighter - Apache Lucy Perl Documentation</title>
+<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
+</head>
+<body>
+
+
+<h1 id="NAME">NAME</h1>
+
+<p>Lucy::Highlight::Highlighter - Create and highlight excerpts.</p>
+
+<h1 id="SYNOPSIS">SYNOPSIS</h1>
+
+<pre><code>    my $highlighter = Lucy::Highlight::Highlighter-&gt;new(
+        searcher =&gt; $searcher,
+        query    =&gt; $query,
+        field    =&gt; &#39;body&#39;
+    );
+    my $hits = $searcher-&gt;hits( query =&gt; $query );
+    while ( my $hit = $hits-&gt;next ) {
+        my $excerpt = $highlighter-&gt;create_excerpt($hit);
+        ...
+    }</code></pre>
+
+<h1 id="DESCRIPTION">DESCRIPTION</h1>
+
+<p>The Highlighter can be used to select relevant snippets from a document, 
and to surround search terms with highlighting tags. It handles both stems and 
phrases correctly and efficiently, using special-purpose data generated at 
index-time.</p>
+
+<h1 id="CONSTRUCTORS">CONSTRUCTORS</h1>
+
+<h2 id="new-labeled-params">new( <i>[labeled params]</i> )</h2>
+
+<pre><code>    my $highlighter = Lucy::Highlight::Highlighter-&gt;new(
+        searcher       =&gt; $searcher,    # required
+        query          =&gt; $query,       # required
+        field          =&gt; &#39;content&#39;,    # required
+        excerpt_length =&gt; 150,          # default: 200
+    );</code></pre>
+
+<ul>
+
+<li><p><b>searcher</b> - An object which inherits from <a 
href="../../Lucy/Search/Searcher.html">Searcher</a>, such as an <a 
href="../../Lucy/Search/IndexSearcher.html">IndexSearcher</a>.</p>
+
+</li>
+<li><p><b>query</b> - Query object or a query string.</p>
+
+</li>
+<li><p><b>field</b> - The name of the field from which to draw the excerpt. 
The field must be marked as <code>highlightable</code> (see <a 
href="../../Lucy/Plan/FieldType.html">FieldType</a>).</p>
+
+</li>
+<li><p><b>excerpt_length</b> - Maximum length of the excerpt, in 
characters.</p>
+
+</li>
+</ul>
+
+<h1 id="METHODS">METHODS</h1>
+
+<h2 id="create_excerpt-hit_doc">create_excerpt(hit_doc)</h2>
+
+<p>Take a HitDoc object and return a highlighted excerpt as a string if the 
HitDoc has a value for the specified <code>field</code>.</p>
+
+<h2 id="highlight-text">highlight(text)</h2>
+
+<p>Highlight a small section of text. By default, prepends pre-tag and appends 
post-tag. This method is called internally by create_excerpt() when assembling 
an excerpt.</p>
+
+<h2 id="encode-text">encode(text)</h2>
+
+<p>Encode text with HTML entities. This method is called internally by 
create_excerpt() for each text fragment when assembling an excerpt. A subclass 
can override this if the text should be encoded differently or not at all.</p>
+
+<h2 id="set_pre_tag-pre_tag">set_pre_tag(pre_tag)</h2>
+
+<p>Setter. The default value is &quot;&lt;strong&gt;&quot;.</p>
+
+<h2 id="get_pre_tag">get_pre_tag()</h2>
+
+<p>Accessor.</p>
+
+<h2 id="set_post_tag-post_tag">set_post_tag(post_tag)</h2>
+
+<p>Setter. The default value is &quot;&lt;/strong&gt;&quot;.</p>
+
+<h2 id="get_post_tag">get_post_tag()</h2>
+
+<p>Accessor.</p>
+
+<h2 id="get_searcher">get_searcher()</h2>
+
+<p>Accessor.</p>
+
+<h2 id="get_query">get_query()</h2>
+
+<p>Accessor.</p>
+
+<h2 id="get_compiler">get_compiler()</h2>
+
+<p>Accessor for the Lucy::Search::Compiler object derived from 
<code>query</code> and <code>searcher</code>.</p>
+
+<h2 id="get_excerpt_length">get_excerpt_length()</h2>
+
+<p>Accessor.</p>
+
+<h2 id="get_field">get_field()</h2>
+
+<p>Accessor.</p>
+
+<h1 id="INHERITANCE">INHERITANCE</h1>
+
+<p>Lucy::Highlight::Highlighter isa Clownfish::Obj.</p>
+
+</body>
+</html>
+

Added: 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Index/BackgroundMerger.html
==============================================================================
--- 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Index/BackgroundMerger.html
 (added)
+++ 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Index/BackgroundMerger.html
 Mon Apr  4 09:23:00 2016
@@ -0,0 +1,72 @@
+
+<html>
+<head>
+<title>Lucy::Index::BackgroundMerger - Apache Lucy Perl Documentation</title>
+<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
+</head>
+<body>
+
+
+<h1 id="NAME">NAME</h1>
+
+<p>Lucy::Index::BackgroundMerger - Consolidate index segments in the 
background.</p>
+
+<h1 id="SYNOPSIS">SYNOPSIS</h1>
+
+<pre><code>    my $bg_merger = Lucy::Index::BackgroundMerger-&gt;new(
+        index  =&gt; &#39;/path/to/index&#39;,
+    );
+    $bg_merger-&gt;commit;</code></pre>
+
+<h1 id="DESCRIPTION">DESCRIPTION</h1>
+
+<p>Adding documents to an index is usually fast, but every once in a while the 
index must be compacted and an update takes substantially longer to complete. 
See <a 
href="../../Lucy/Docs/Cookbook/FastUpdates.html">Lucy::Docs::Cookbook::FastUpdates</a>
 for how to use this class to control worst-case index update performance.</p>
+
+<p>As with <a href="../../Lucy/Index/Indexer.html">Indexer</a>, see <a 
href="../../Lucy/Docs/FileLocking.html">Lucy::Docs::FileLocking</a> if your 
index is on a shared volume.</p>
+
+<h1 id="CONSTRUCTORS">CONSTRUCTORS</h1>
+
+<h2 id="new-labeled-params">new( <i>[labeled params]</i> )</h2>
+
+<pre><code>    my $bg_merger = Lucy::Index::BackgroundMerger-&gt;new(
+        index   =&gt; &#39;/path/to/index&#39;,    # required
+        manager =&gt; $manager             # default: created internally
+    );</code></pre>
+
+<p>Open a new BackgroundMerger.</p>
+
+<ul>
+
+<li><p><b>index</b> - Either a string filepath or a Folder.</p>
+
+</li>
+<li><p><b>manager</b> - An IndexManager. If not supplied, an IndexManager with 
a 10-second write lock timeout will be created.</p>
+
+</li>
+</ul>
+
+<h1 id="METHODS">METHODS</h1>
+
+<h2 id="commit">commit()</h2>
+
+<p>Commit any changes made to the index. Until this is called, none of the 
changes made during an indexing session are permanent.</p>
+
+<p>Calls prepare_commit() implicitly if it has not already been called.</p>
+
+<h2 id="prepare_commit">prepare_commit()</h2>
+
+<p>Perform the expensive setup for commit() in advance, so that commit() 
completes quickly.</p>
+
+<p>Towards the end of prepare_commit(), the BackgroundMerger attempts to 
re-acquire the write lock, which is then held until commit() finishes and 
releases it.</p>
+
+<h2 id="optimize">optimize()</h2>
+
+<p>Optimize the index for search-time performance. This may take a while, as 
it can involve rewriting large amounts of data.</p>
+
+<h1 id="INHERITANCE">INHERITANCE</h1>
+
+<p>Lucy::Index::BackgroundMerger isa Clownfish::Obj.</p>
+
+</body>
+</html>
+

Added: 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Index/DataReader.html
==============================================================================
--- 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Index/DataReader.html 
(added)
+++ 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Index/DataReader.html 
Mon Apr  4 09:23:00 2016
@@ -0,0 +1,101 @@
+
+<html>
+<head>
+<title>Lucy::Index::DataReader - Apache Lucy Perl Documentation</title>
+<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
+</head>
+<body>
+
+
+<h1 id="NAME">NAME</h1>
+
+<p>Lucy::Index::DataReader - Abstract base class for reading index data.</p>
+
+<h1 id="SYNOPSIS">SYNOPSIS</h1>
+
+<pre><code>    # Abstract base class.</code></pre>
+
+<h1 id="DESCRIPTION">DESCRIPTION</h1>
+
+<p>DataReader is the companion class to <a 
href="../../Lucy/Index/DataWriter.html">DataWriter</a>. Every index component 
will implement one of each.</p>
+
+<h1 id="CONSTRUCTORS">CONSTRUCTORS</h1>
+
+<h2 id="new-labeled-params">new( <i>[labeled params]</i> )</h2>
+
+<pre><code>    my $reader = MyDataReader-&gt;new(
+        schema   =&gt; $seg_reader-&gt;get_schema,      # default undef
+        folder   =&gt; $seg_reader-&gt;get_folder,      # default undef
+        snapshot =&gt; $seg_reader-&gt;get_snapshot,    # default undef
+        segments =&gt; $seg_reader-&gt;get_segments,    # default undef
+        seg_tick =&gt; $seg_reader-&gt;get_seg_tick,    # default -1
+    );</code></pre>
+
+<ul>
+
+<li><p><b>schema</b> - A Schema.</p>
+
+</li>
+<li><p><b>folder</b> - A Folder.</p>
+
+</li>
+<li><p><b>snapshot</b> - A Snapshot.</p>
+
+</li>
+<li><p><b>segments</b> - An array of Segments.</p>
+
+</li>
+<li><p><b>seg_tick</b> - The array index of the Segment object within the 
<code>segments</code> array that this particular DataReader is assigned to, if 
any. A value of -1 indicates that no Segment should be assigned.</p>
+
+</li>
+</ul>
+
+<h1 id="ABSTRACT-METHODS">ABSTRACT METHODS</h1>
+
+<h2 id="aggregator-labeled-params">aggregator( <i>[labeled params]</i> )</h2>
+
+<p>Create a reader which aggregates the output of several lower level readers. 
Return undef if such a reader is not valid.</p>
+
+<ul>
+
+<li><p><b>readers</b> - An array of DataReaders.</p>
+
+</li>
+<li><p><b>offsets</b> - Doc id start offsets for each reader.</p>
+
+</li>
+</ul>
+
+<h1 id="METHODS">METHODS</h1>
+
+<h2 id="get_schema">get_schema()</h2>
+
+<p>Accessor for &quot;schema&quot; member var.</p>
+
+<h2 id="get_folder">get_folder()</h2>
+
+<p>Accessor for &quot;folder&quot; member var.</p>
+
+<h2 id="get_snapshot">get_snapshot()</h2>
+
+<p>Accessor for &quot;snapshot&quot; member var.</p>
+
+<h2 id="get_segments">get_segments()</h2>
+
+<p>Accessor for &quot;segments&quot; member var.</p>
+
+<h2 id="get_segment">get_segment()</h2>
+
+<p>Accessor for &quot;segment&quot; member var.</p>
+
+<h2 id="get_seg_tick">get_seg_tick()</h2>
+
+<p>Accessor for &quot;seg_tick&quot; member var.</p>
+
+<h1 id="INHERITANCE">INHERITANCE</h1>
+
+<p>Lucy::Index::DataReader isa Clownfish::Obj.</p>
+
+</body>
+</html>
+

Added: 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Index/DataWriter.html
==============================================================================
--- 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Index/DataWriter.html 
(added)
+++ 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Index/DataWriter.html 
Mon Apr  4 09:23:00 2016
@@ -0,0 +1,144 @@
+
+<html>
+<head>
+<title>Lucy::Index::DataWriter - Apache Lucy Perl Documentation</title>
+<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
+</head>
+<body>
+
+
+<h1 id="NAME">NAME</h1>
+
+<p>Lucy::Index::DataWriter - Write data to an index.</p>
+
+<h1 id="SYNOPSIS">SYNOPSIS</h1>
+
+<pre><code>    # Abstract base class.</code></pre>
+
+<h1 id="DESCRIPTION">DESCRIPTION</h1>
+
+<p>DataWriter is an abstract base class for writing index data, generally in 
segment-sized chunks. Each component of an index -- e.g. stored fields, 
lexicon, postings, deletions -- is represented by a DataWriter/<a 
href="../../Lucy/Index/DataReader.html">DataReader</a> pair.</p>
+
+<p>Components may be specified per index by subclassing <a 
href="../../Lucy/Plan/Architecture.html">Architecture</a>.</p>
+
+<h1 id="CONSTRUCTORS">CONSTRUCTORS</h1>
+
+<h2 id="new-labeled-params">new( <i>[labeled params]</i> )</h2>
+
+<pre><code>    my $writer = MyDataWriter-&gt;new(
+        snapshot   =&gt; $snapshot,      # required
+        segment    =&gt; $segment,       # required
+        polyreader =&gt; $polyreader,    # required
+    );</code></pre>
+
+<ul>
+
+<li><p><b>snapshot</b> - The Snapshot that will be committed at the end of the 
indexing session.</p>
+
+</li>
+<li><p><b>segment</b> - The Segment in progress.</p>
+
+</li>
+<li><p><b>polyreader</b> - A PolyReader representing all existing data in the 
index. (If the index is brand new, the PolyReader will have no sub-readers).</p>
+
+</li>
+</ul>
+
+<h1 id="ABSTRACT-METHODS">ABSTRACT METHODS</h1>
+
+<h2 id="add_inverted_doc-labeled-params">add_inverted_doc( <i>[labeled 
params]</i> )</h2>
+
+<p>Process a document, previously inverted by <code>inverter</code>.</p>
+
+<ul>
+
+<li><p><b>inverter</b> - An Inverter wrapping an inverted document.</p>
+
+</li>
+<li><p><b>doc_id</b> - Internal number assigned to this document within the 
segment.</p>
+
+</li>
+</ul>
+
+<h2 id="add_segment-labeled-params">add_segment( <i>[labeled params]</i> )</h2>
+
+<p>Add content from an existing segment into the one currently being 
written.</p>
+
+<ul>
+
+<li><p><b>reader</b> - The SegReader containing content to add.</p>
+
+</li>
+<li><p><b>doc_map</b> - An array of integers mapping old document ids to new. 
Deleted documents are mapped to 0, indicating that they should be skipped.</p>
+
+</li>
+</ul>
+
+<h2 id="finish">finish()</h2>
+
+<p>Complete the segment: close all streams, store metadata, etc.</p>
+
+<h2 id="format">format()</h2>
+
+<p>Every writer must specify a file format revision number, which should 
increment each time the format changes. Responsibility for revision checking is 
left to the companion DataReader.</p>
+
+<h1 id="METHODS">METHODS</h1>
+
+<h2 id="delete_segment-reader">delete_segment(reader)</h2>
+
+<p>Remove a segment&#39;s data. The default implementation is a no-op, as all 
files within the segment directory will be automatically deleted. Subclasses 
which manage their own files outside of the segment system should override this 
method and use it as a trigger for cleaning up obsolete data.</p>
+
+<ul>
+
+<li><p><b>reader</b> - The SegReader whose content is to be removed, which must 
represent a segment which is part of the current snapshot.</p>
+
+</li>
+</ul>
+
+<h2 id="merge_segment-labeled-params">merge_segment( <i>[labeled params]</i> 
)</h2>
+
+<p>Move content from an existing segment into the one currently being 
written.</p>
+
+<p>The default implementation calls add_segment() then delete_segment().</p>
+
+<ul>
+
+<li><p><b>reader</b> - The SegReader containing content to merge, which must 
represent a segment which is part of the current snapshot.</p>
+
+</li>
+<li><p><b>doc_map</b> - An array of integers mapping old document ids to new. 
Deleted documents are mapped to 0, indicating that they should be skipped.</p>
+
+</li>
+</ul>
+
+<h2 id="metadata">metadata()</h2>
+
+<p>Arbitrary metadata to be serialized and stored by the Segment. The default 
implementation supplies a Hash with a single key-value pair for 
&quot;format&quot;.</p>
+
+<h2 id="get_snapshot">get_snapshot()</h2>
+
+<p>Accessor for &quot;snapshot&quot; member var.</p>
+
+<h2 id="get_segment">get_segment()</h2>
+
+<p>Accessor for &quot;segment&quot; member var.</p>
+
+<h2 id="get_polyreader">get_polyreader()</h2>
+
+<p>Accessor for &quot;polyreader&quot; member var.</p>
+
+<h2 id="get_schema">get_schema()</h2>
+
+<p>Accessor for &quot;schema&quot; member var.</p>
+
+<h2 id="get_folder">get_folder()</h2>
+
+<p>Accessor for &quot;folder&quot; member var.</p>
+
+<h1 id="INHERITANCE">INHERITANCE</h1>
+
+<p>Lucy::Index::DataWriter isa Clownfish::Obj.</p>
+
+</body>
+</html>
+

Added: 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Index/DeletionsWriter.html
==============================================================================
--- 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Index/DeletionsWriter.html
 (added)
+++ 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Index/DeletionsWriter.html
 Mon Apr  4 09:23:00 2016
@@ -0,0 +1,79 @@
+
+<html>
+<head>
+<title>Lucy::Index::DeletionsWriter - Apache Lucy Perl Documentation</title>
+<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
+</head>
+<body>
+
+
+<h1 id="NAME">NAME</h1>
+
+<p>Lucy::Index::DeletionsWriter - Abstract base class for marking documents as 
deleted.</p>
+
+<h1 id="SYNOPSIS">SYNOPSIS</h1>
+
+<pre><code>    my $polyreader  = $del_writer-&gt;get_polyreader;
+    my $seg_readers = $polyreader-&gt;seg_readers;
+    for my $seg_reader (@$seg_readers) {
+        my $count = $del_writer-&gt;seg_del_count( 
$seg_reader-&gt;get_seg_name );
+        ...
+    }</code></pre>
+
+<h1 id="DESCRIPTION">DESCRIPTION</h1>
+
+<p>Subclasses of DeletionsWriter provide a low-level mechanism for declaring a 
document deleted from an index.</p>
+
+<p>Because files in an index are never modified, and because it is not 
practical to delete entire segments, a DeletionsWriter does not actually remove 
documents from the index. Instead, it communicates to a search-time companion 
DeletionsReader which documents are deleted in such a way that it can create a 
Matcher iterator.</p>
+
+<p>Documents are truly deleted only when the segments which contain them are 
merged into new ones.</p>
+
+<h1 id="ABSTRACT-METHODS">ABSTRACT METHODS</h1>
+
+<h2 id="delete_by_term-labeled-params">delete_by_term( <i>[labeled params]</i> 
)</h2>
+
+<p>Delete all documents in the index that contain the supplied term.</p>
+
+<ul>
+
+<li><p><b>field</b> - The name of an indexed field. (If it is not specified 
as <code>indexed</code>, an error will occur.)</p>
+
+</li>
+<li><p><b>term</b> - The term which identifies docs to be marked as deleted. 
If <code>field</code> is associated with an Analyzer, <code>term</code> will be 
processed automatically (so don&#39;t pre-process it yourself).</p>
+
+</li>
+</ul>
+
+<h2 id="delete_by_query-query">delete_by_query(query)</h2>
+
+<p>Delete all documents in the index that match <code>query</code>.</p>
+
+<ul>
+
+<li><p><b>query</b> - A <a href="../../Lucy/Search/Query.html">Query</a>.</p>
+
+</li>
+</ul>
+
+<h2 id="updated">updated()</h2>
+
+<p>Returns true if there are updates that need to be written.</p>
+
+<h2 id="seg_del_count-seg_name">seg_del_count(seg_name)</h2>
+
+<p>Return the number of deletions for a given segment.</p>
+
+<ul>
+
+<li><p><b>seg_name</b> - The name of the segment.</p>
+
+</li>
+</ul>
+
+<h1 id="INHERITANCE">INHERITANCE</h1>
+
+<p>Lucy::Index::DeletionsWriter isa <a 
href="../../Lucy/Index/DataWriter.html">Lucy::Index::DataWriter</a> isa 
Clownfish::Obj.</p>
+
+</body>
+</html>
+

Added: 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Index/DocReader.html
==============================================================================
--- 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Index/DocReader.html 
(added)
+++ 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Index/DocReader.html 
Mon Apr  4 09:23:00 2016
@@ -0,0 +1,53 @@
+
+<html>
+<head>
+<title>Lucy::Index::DocReader - Apache Lucy Perl Documentation</title>
+<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
+</head>
+<body>
+
+
+<h1 id="NAME">NAME</h1>
+
+<p>Lucy::Index::DocReader - Retrieve stored documents.</p>
+
+<h1 id="SYNOPSIS">SYNOPSIS</h1>
+
+<pre><code>    my $doc_reader = 
$seg_reader-&gt;obtain(&quot;Lucy::Index::DocReader&quot;);
+    my $doc        = $doc_reader-&gt;fetch_doc($doc_id);</code></pre>
+
+<h1 id="DESCRIPTION">DESCRIPTION</h1>
+
+<p>DocReader defines the interface by which documents (with all stored fields) 
are retrieved from the index. The default implementation returns <a 
href="../../Lucy/Document/HitDoc.html">HitDoc</a> objects.</p>
+
+<h1 id="ABSTRACT-METHODS">ABSTRACT METHODS</h1>
+
+<h2 id="fetch_doc-doc_id">fetch_doc(doc_id)</h2>
+
+<p>Retrieve the document identified by <code>doc_id</code>.</p>
+
+<p>Returns: a HitDoc.</p>
+
+<h1 id="METHODS">METHODS</h1>
+
+<h2 id="aggregator-labeled-params">aggregator( <i>[labeled params]</i> )</h2>
+
+<p>Returns a DocReader which divvies up requests to its sub-readers according 
to the offset range.</p>
+
+<ul>
+
+<li><p><b>readers</b> - An array of DocReaders.</p>
+
+</li>
+<li><p><b>offsets</b> - Doc id start offsets for each reader.</p>
+
+</li>
+</ul>
+
+<h1 id="INHERITANCE">INHERITANCE</h1>
+
+<p>Lucy::Index::DocReader isa <a 
href="../../Lucy/Index/DataReader.html">Lucy::Index::DataReader</a> isa 
Clownfish::Obj.</p>
+
+</body>
+</html>
+

Added: 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Index/IndexManager.html
==============================================================================
--- 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Index/IndexManager.html
 (added)
+++ 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Index/IndexManager.html
 Mon Apr  4 09:23:00 2016
@@ -0,0 +1,119 @@
+
+<html>
+<head>
+<title>Lucy::Index::IndexManager - Apache Lucy Perl Documentation</title>
+<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
+</head>
+<body>
+
+
+<h1 id="NAME">NAME</h1>
+
+<p>Lucy::Index::IndexManager - Policies governing index updating, locking, and 
file deletion.</p>
+
+<h1 id="SYNOPSIS">SYNOPSIS</h1>
+
+<pre><code>    use Sys::Hostname qw( hostname );
+    my $hostname = hostname() or die &quot;Can&#39;t get unique hostname&quot;;
+    my $manager = Lucy::Index::IndexManager-&gt;new( 
+        host =&gt; $hostname,
+    );
+
+    # Index time:
+    my $indexer = Lucy::Index::Indexer-&gt;new(
+        index =&gt; &#39;/path/to/index&#39;,
+        manager =&gt; $manager,
+    );
+
+    # Search time:
+    my $reader = Lucy::Index::IndexReader-&gt;open(
+        index   =&gt; &#39;/path/to/index&#39;,
+        manager =&gt; $manager,
+    );
+    my $searcher = Lucy::Search::IndexSearcher-&gt;new( index =&gt; $reader 
);</code></pre>
+
+<h1 id="DESCRIPTION">DESCRIPTION</h1>
+
+<p>IndexManager is an advanced-use class for controlling index locking, 
updating, merging, and deletion behaviors.</p>
+
+<p>IndexManager and <a 
href="../../Lucy/Plan/Architecture.html">Architecture</a> are complementary 
classes: Architecture is used to define traits and behaviors which cannot 
change for the life of an index; IndexManager is used for defining rules which 
may change from process to process.</p>
+
+<h1 id="CONSTRUCTORS">CONSTRUCTORS</h1>
+
+<h2 id="new-labeled-params">new( <i>[labeled params]</i> )</h2>
+
+<pre><code>    my $manager = Lucy::Index::IndexManager-&gt;new(
+        host =&gt; $hostname,    # default: &quot;&quot;
+    );</code></pre>
+
+<ul>
+
+<li><p><b>host</b> - An identifier which should be unique per-machine.</p>
+
+</li>
+<li><p><b>lock_factory</b> - A LockFactory.</p>
+
+</li>
+</ul>
+
+<h1 id="METHODS">METHODS</h1>
+
+<h2 id="make_write_lock">make_write_lock()</h2>
+
+<p>Create the Lock which controls access to modifying the logical content of 
the index.</p>
+
+<h2 id="recycle-labeled-params">recycle( <i>[labeled params]</i> )</h2>
+
+<p>Return an array of SegReaders representing segments that should be 
consolidated. Implementations must balance index-time churn against search-time 
degradation due to segment proliferation. The default implementation prefers 
small segments or segments with a high proportion of deletions.</p>
+
+<ul>
+
+<li><p><b>reader</b> - A PolyReader.</p>
+
+</li>
+<li><p><b>del_writer</b> - A DeletionsWriter.</p>
+
+</li>
+<li><p><b>cutoff</b> - A segment number which all returned SegReaders must 
exceed.</p>
+
+</li>
+<li><p><b>optimize</b> - A boolean indicating whether to spend extra time 
optimizing the index for search-time performance.</p>
+
+</li>
+</ul>
+
+<h2 id="set_folder-folder">set_folder(folder)</h2>
+
+<p>Setter for <code>folder</code> member. Typical clients (Indexer, 
IndexReader) will use this method to install their own Folder instance.</p>
+
+<h2 id="get_folder">get_folder()</h2>
+
+<p>Getter for <code>folder</code> member.</p>
+
+<h2 id="get_host">get_host()</h2>
+
+<p>Getter for <code>host</code> member.</p>
+
+<h2 id="set_write_lock_timeout-timeout">set_write_lock_timeout(timeout)</h2>
+
+<p>Setter for write lock timeout. Default: 1000 milliseconds.</p>
+
+<h2 id="get_write_lock_timeout">get_write_lock_timeout()</h2>
+
+<p>Getter for write lock timeout.</p>
+
+<h2 id="set_write_lock_interval-timeout">set_write_lock_interval(timeout)</h2>
+
+<p>Setter for write lock retry interval. Default: 100 milliseconds.</p>
+
+<h2 id="get_write_lock_interval">get_write_lock_interval()</h2>
+
+<p>Getter for write lock retry interval.</p>
+
+<h1 id="INHERITANCE">INHERITANCE</h1>
+
+<p>Lucy::Index::IndexManager isa Clownfish::Obj.</p>
+
+</body>
+</html>
+

Added: 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Index/IndexReader.html
==============================================================================
--- 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Index/IndexReader.html 
(added)
+++ 
websites/staging/lucy/trunk/content/docs/0.4.0/perl/Lucy/Index/IndexReader.html 
Mon Apr  4 09:23:00 2016
@@ -0,0 +1,116 @@
+
+<html>
+<head>
+<title>Lucy::Index::IndexReader - Apache Lucy Perl Documentation</title>
+<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
+</head>
+<body>
+
+
+<h1 id="NAME">NAME</h1>
+
+<p>Lucy::Index::IndexReader - Read from an inverted index.</p>
+
+<h1 id="SYNOPSIS">SYNOPSIS</h1>
+
+<pre><code>    my $reader = Lucy::Index::IndexReader-&gt;open(
+        index =&gt; &#39;/path/to/index&#39;,
+    );
+    my $seg_readers = $reader-&gt;seg_readers;
+    for my $seg_reader (@$seg_readers) {
+        my $seg_name = $seg_reader-&gt;get_segment-&gt;get_name;
+        my $num_docs = $seg_reader-&gt;doc_max;
+        print &quot;Segment $seg_name ($num_docs documents):\n&quot;;
+        my $doc_reader = 
$seg_reader-&gt;obtain(&quot;Lucy::Index::DocReader&quot;);
+        for my $doc_id ( 1 .. $num_docs ) {
+            my $doc = $doc_reader-&gt;fetch_doc($doc_id);
+            print &quot;  $doc_id: $doc-&gt;{title}\n&quot;;
+        }
+    }</code></pre>
+
+<h1 id="DESCRIPTION">DESCRIPTION</h1>
+
+<p>IndexReader is the interface through which <a 
href="../../Lucy/Search/IndexSearcher.html">IndexSearcher</a> objects access 
the content of an index.</p>
+
+<p>IndexReader objects always represent a point-in-time view of an index as it 
existed at the moment the reader was created. If you want search results to 
reflect modifications to an index, you must create a new IndexReader after the 
update process completes.</p>
+
+<p>IndexReaders are composites; most of the work is done by individual <a 
href="../../Lucy/Index/DataReader.html">DataReader</a> sub-components, which 
may be accessed via fetch() and obtain(). The most efficient and powerful 
access to index data happens at the segment level via <a 
href="../../Lucy/Index/SegReader.html">SegReader</a>&#39;s sub-components.</p>
+
+<h1 id="CONSTRUCTORS">CONSTRUCTORS</h1>
+
+<h2 id="open-labeled-params">open( <i>[labeled params]</i> )</h2>
+
+<pre><code>    my $reader = Lucy::Index::IndexReader-&gt;open(
+        index    =&gt; &#39;/path/to/index&#39;, # required
+        snapshot =&gt; $snapshot,
+        manager  =&gt; $index_manager,
+    );</code></pre>
+
+<p>IndexReader is an abstract base class; open() returns the IndexReader 
subclass PolyReader, which channels the output of 0 or more SegReaders.</p>
+
+<ul>
+
+<li><p><b>index</b> - Either a string filepath or a Folder.</p>
+
+</li>
+<li><p><b>snapshot</b> - A Snapshot. If not supplied, the most recent snapshot 
file will be used.</p>
+
+</li>
+<li><p><b>manager</b> - An <a 
href="../../Lucy/Index/IndexManager.html">IndexManager</a>. Read-locking is off 
by default; supplying this argument turns it on.</p>
+
+</li>
+</ul>
+
+<h1 id="ABSTRACT-METHODS">ABSTRACT METHODS</h1>
+
+<h2 id="doc_max">doc_max()</h2>
+
+<p>Return the maximum number of documents available to the reader, which is 
also the highest possible internal document id. Documents which have been 
marked as deleted but not yet purged from the index are included in this 
count.</p>
+
+<h2 id="doc_count">doc_count()</h2>
+
+<p>Return the number of documents available to the reader, subtracting any 
that are marked as deleted.</p>
+
+<h2 id="del_count">del_count()</h2>
+
+<p>Return the number of documents which have been marked as deleted but not 
yet purged from the index.</p>
+
+<h2 id="seg_readers">seg_readers()</h2>
+
+<p>Return an array of all the SegReaders represented within the 
IndexReader.</p>
+
+<h2 id="offsets">offsets()</h2>
+
+<p>Return an array with one entry for each segment, corresponding to segment 
doc_id start offset.</p>
+
+<h1 id="METHODS">METHODS</h1>
+
+<h2 id="fetch-api">fetch(api)</h2>
+
+<p>Fetch a component, or return undef if the component can&#39;t be found.</p>
+
+<ul>
+
+<li><p><b>api</b> - The name of the DataReader subclass that the desired 
component must implement.</p>
+
+</li>
+</ul>
+
+<h2 id="obtain-api">obtain(api)</h2>
+
+<p>Fetch a component, or throw an error if the component can&#39;t be 
found.</p>
+
+<ul>
+
+<li><p><b>api</b> - The name of the DataReader subclass that the desired 
component must implement.</p>
+
+</li>
+</ul>
+
+<h1 id="INHERITANCE">INHERITANCE</h1>
+
+<p>Lucy::Index::IndexReader isa <a 
href="../../Lucy/Index/DataReader.html">Lucy::Index::DataReader</a> isa 
Clownfish::Obj.</p>
+
+</body>
+</html>
+
