Added: lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/BackgroundMerger.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/BackgroundMerger.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/BackgroundMerger.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/BackgroundMerger.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,108 @@ +Title: Lucy::Index::BackgroundMerger – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Index::BackgroundMerger - Consolidate index segments in the background.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>my $bg_merger = Lucy::Index::BackgroundMerger->new( + index => '/path/to/index', +); +$bg_merger->commit;</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>Adding documents to an index is usually fast, +but every once in a while the index must be compacted and an update takes substantially longer to complete. +See <a href="../../Lucy/Docs/Cookbook/FastUpdates.html" class="podlinkpod" +>FastUpdates</a> for how to use this class to control worst-case index update performance.</p> + +<p>As with <a href="../../Lucy/Index/Indexer.html" class="podlinkpod" +>Indexer</a>, +see <a href="../../Lucy/Docs/FileLocking.html" class="podlinkpod" +>FileLocking</a> if your index is on a shared volume.</p> + +<h2><a class='u' +name="CONSTRUCTORS" +>CONSTRUCTORS</a></h2> + +<h3><a class='u' +name="new" +>new</a></h3> + +<pre>my $bg_merger = Lucy::Index::BackgroundMerger->new( + index => '/path/to/index', # required + manager => $manager # default: created internally +);</pre> + +<p>Open a new BackgroundMerger.</p> + +<ul> +<li><b>index</b> - Either a string filepath or a Folder.</li> + +<li><b>manager</b> - An IndexManager. 
+If not supplied, +an IndexManager with a 10-second write lock timeout will be created.</li> +</ul> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="optimize" +>optimize</a></h3> + +<pre>$background_merger->optimize();</pre> + +<p>Optimize the index for search-time performance. +This may take a while, +as it can involve rewriting large amounts of data.</p> + +<h3><a class='u' +name="commit" +>commit</a></h3> + +<pre>$background_merger->commit();</pre> + +<p>Commit any changes made to the index. +Until this is called, +none of the changes made during an indexing session are permanent.</p> + +<p>Calls <a href="#prepare_commit" class="podlinkpod" +>prepare_commit()</a> implicitly if it has not already been called.</p> + +<h3><a class='u' +name="prepare_commit" +>prepare_commit</a></h3> + +<pre>$background_merger->prepare_commit();</pre> + +<p>Perform the expensive setup for <a href="#commit" class="podlinkpod" +>commit()</a> in advance, +so that <a href="#commit" class="podlinkpod" +>commit()</a> completes quickly.</p> + +<p>Towards the end of <a href="#prepare_commit" class="podlinkpod" +>prepare_commit()</a>, +the BackgroundMerger attempts to re-acquire the write lock, +which is then held until <a href="#commit" class="podlinkpod" +>commit()</a> finishes and releases it.</p> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Index::BackgroundMerger isa Clownfish::Obj.</p> + +</div>
Added: lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/DataReader.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/DataReader.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/DataReader.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/DataReader.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,138 @@ +Title: Lucy::Index::DataReader – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Index::DataReader - Abstract base class for reading index data.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre># Abstract base class.</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>DataReader is the companion class to <a href="../../Lucy/Index/DataWriter.html" class="podlinkpod" +>DataWriter</a>. +Every index component will implement one of each.</p> + +<h2><a class='u' +name="CONSTRUCTORS" +>CONSTRUCTORS</a></h2> + +<h3><a class='u' +name="new" +>new</a></h3> + +<pre>my $reader = MyDataReader->new( + schema => $seg_reader->get_schema, # default undef + folder => $seg_reader->get_folder, # default undef + snapshot => $seg_reader->get_snapshot, # default undef + segments => $seg_reader->get_segments, # default undef + seg_tick => $seg_reader->get_seg_tick, # default -1 +);</pre> + +<p>Abstract constructor.</p> + +<ul> +<li><b>schema</b> - A Schema.</li> + +<li><b>folder</b> - A Folder.</li> + +<li><b>snapshot</b> - A Snapshot.</li> + +<li><b>segments</b> - An array of Segments.</li> + +<li><b>seg_tick</b> - The array index of the Segment object within the <code>segments</code> array that this particular DataReader is assigned to, +if any. 
+A value of -1 indicates that no Segment should be assigned.</li> +</ul> + +<h2><a class='u' +name="ABSTRACT_METHODS" +>ABSTRACT METHODS</a></h2> + +<h3><a class='u' +name="aggregator" +>aggregator</a></h3> + +<pre>my $result = $data_reader->aggregator( + readers => $readers, # required + offsets => $offsets, # required +);</pre> + +<p>Create a reader which aggregates the output of several lower level readers. +Return undef if such a reader is not valid.</p> + +<ul> +<li><b>readers</b> - An array of DataReaders.</li> + +<li><b>offsets</b> - Doc id start offsets for each reader.</li> +</ul> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="get_schema" +>get_schema</a></h3> + +<pre>my $schema = $data_reader->get_schema();</pre> + +<p>Accessor for “schema” member var.</p> + +<h3><a class='u' +name="get_folder" +>get_folder</a></h3> + +<pre>my $folder = $data_reader->get_folder();</pre> + +<p>Accessor for “folder” member var.</p> + +<h3><a class='u' +name="get_snapshot" +>get_snapshot</a></h3> + +<pre>my $snapshot = $data_reader->get_snapshot();</pre> + +<p>Accessor for “snapshot” member var.</p> + +<h3><a class='u' +name="get_segments" +>get_segments</a></h3> + +<pre>my $arrayref = $data_reader->get_segments();</pre> + +<p>Accessor for “segments” member var.</p> + +<h3><a class='u' +name="get_segment" +>get_segment</a></h3> + +<pre>my $segment = $data_reader->get_segment();</pre> + +<p>Accessor for “segment” member var.</p> + +<h3><a class='u' +name="get_seg_tick" +>get_seg_tick</a></h3> + +<pre>my $int = $data_reader->get_seg_tick();</pre> + +<p>Accessor for “seg_tick” member var.</p> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Index::DataReader isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/DataWriter.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/DataWriter.mdtext?rev=1762636&view=auto 
============================================================================== --- lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/DataWriter.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/DataWriter.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,202 @@ +Title: Lucy::Index::DataWriter – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Index::DataWriter - Write data to an index.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre># Abstract base class.</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>DataWriter is an abstract base class for writing index data, +generally in segment-sized chunks. +Each component of an index – e.g. +stored fields, +lexicon, +postings, +deletions – is represented by a DataWriter/<a href="../../Lucy/Index/DataReader.html" class="podlinkpod" +>DataReader</a> pair.</p> + +<p>Components may be specified per index by subclassing <a href="../../Lucy/Plan/Architecture.html" class="podlinkpod" +>Architecture</a>.</p> + +<h2><a class='u' +name="CONSTRUCTORS" +>CONSTRUCTORS</a></h2> + +<h3><a class='u' +name="new" +>new</a></h3> + +<pre>my $writer = MyDataWriter->new( + snapshot => $snapshot, # required + segment => $segment, # required + polyreader => $polyreader, # required +);</pre> + +<p>Abstract constructor.</p> + +<ul> +<li><b>snapshot</b> - The Snapshot that will be committed at the end of the indexing session.</li> + +<li><b>segment</b> - The Segment in progress.</li> + +<li><b>polyreader</b> - A PolyReader representing all existing data in the index. 
+(If the index is brand new, +the PolyReader will have no sub-readers).</li> +</ul> + +<h2><a class='u' +name="ABSTRACT_METHODS" +>ABSTRACT METHODS</a></h2> + +<h3><a class='u' +name="add_segment" +>add_segment</a></h3> + +<pre>$data_writer->add_segment( + reader => $reader, # required + doc_map => $doc_map, # default: undef +);</pre> + +<p>Add content from an existing segment into the one currently being written.</p> + +<ul> +<li><b>reader</b> - The SegReader containing content to add.</li> + +<li><b>doc_map</b> - An array of integers mapping old document ids to new. +Deleted documents are mapped to 0, +indicating that they should be skipped.</li> +</ul> + +<h3><a class='u' +name="finish" +>finish</a></h3> + +<pre>$data_writer->finish();</pre> + +<p>Complete the segment: close all streams, +store metadata, +etc.</p> + +<h3><a class='u' +name="format" +>format</a></h3> + +<pre>my $int = $data_writer->format();</pre> + +<p>Every writer must specify a file format revision number, +which should increment each time the format changes. +Responsibility for revision checking is left to the companion DataReader.</p> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="delete_segment" +>delete_segment</a></h3> + +<pre>$data_writer->delete_segment($reader);</pre> + +<p>Remove a segment’s data. +The default implementation is a no-op, +as all files within the segment directory will be automatically deleted. 
+Subclasses which manage their own files outside of the segment system should override this method and use it as a trigger for cleaning up obsolete data.</p> + +<ul> +<li><b>reader</b> - The SegReader containing content to merge, +which must represent a segment which is part of the current snapshot.</li> +</ul> + +<h3><a class='u' +name="merge_segment" +>merge_segment</a></h3> + +<pre>$data_writer->merge_segment( + reader => $reader, # required + doc_map => $doc_map, # default: undef +);</pre> + +<p>Move content from an existing segment into the one currently being written.</p> + +<p>The default implementation calls <a href="#add_segment" class="podlinkpod" +>add_segment()</a> then <a href="#delete_segment" class="podlinkpod" +>delete_segment()</a>.</p> + +<ul> +<li><b>reader</b> - The SegReader containing content to merge, +which must represent a segment which is part of the current snapshot.</li> + +<li><b>doc_map</b> - An array of integers mapping old document ids to new. +Deleted documents are mapped to 0, +indicating that they should be skipped.</li> +</ul> + +<h3><a class='u' +name="metadata" +>metadata</a></h3> + +<pre>my $hashref = $data_writer->metadata();</pre> + +<p>Arbitrary metadata to be serialized and stored by the Segment. 
+The default implementation supplies a hash with a single key-value pair for “format”.</p> + +<h3><a class='u' +name="get_snapshot" +>get_snapshot</a></h3> + +<pre>my $snapshot = $data_writer->get_snapshot();</pre> + +<p>Accessor for “snapshot” member var.</p> + +<h3><a class='u' +name="get_segment" +>get_segment</a></h3> + +<pre>my $segment = $data_writer->get_segment();</pre> + +<p>Accessor for “segment” member var.</p> + +<h3><a class='u' +name="get_polyreader" +>get_polyreader</a></h3> + +<pre>my $poly_reader = $data_writer->get_polyreader();</pre> + +<p>Accessor for “polyreader” member var.</p> + +<h3><a class='u' +name="get_schema" +>get_schema</a></h3> + +<pre>my $schema = $data_writer->get_schema();</pre> + +<p>Accessor for “schema” member var.</p> + +<h3><a class='u' +name="get_folder" +>get_folder</a></h3> + +<pre>my $folder = $data_writer->get_folder();</pre> + +<p>Accessor for “folder” member var.</p> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Index::DataWriter isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/DeletionsWriter.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/DeletionsWriter.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/DeletionsWriter.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/DeletionsWriter.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,102 @@ +Title: Lucy::Index::DeletionsWriter – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Index::DeletionsWriter - Abstract base class for marking documents as deleted.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>my $polyreader = $del_writer->get_polyreader; +my $seg_readers = $polyreader->seg_readers; +for my $seg_reader 
(@$seg_readers) { + my $count = $del_writer->seg_del_count( $seg_reader->get_seg_name ); + ... +}</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>Subclasses of DeletionsWriter provide a low-level mechanism for declaring a document deleted from an index.</p> + +<p>Because files in an index are never modified, +and because it is not practical to delete entire segments, +a DeletionsWriter does not actually remove documents from the index. +Instead, +it communicates to a search-time companion DeletionsReader which documents are deleted in such a way that it can create a Matcher iterator.</p> + +<p>Documents are truly deleted only when the segments which contain them are merged into new ones.</p> + +<h2><a class='u' +name="ABSTRACT_METHODS" +>ABSTRACT METHODS</a></h2> + +<h3><a class='u' +name="delete_by_term" +>delete_by_term</a></h3> + +<pre>$deletions_writer->delete_by_term( + field => $field, # required + term => $term, # required +);</pre> + +<p>Delete all documents in the index that index the supplied term.</p> + +<ul> +<li><b>field</b> - The name of an indexed field. +(If it is not spec’d as <code>indexed</code>, +an error will occur.)</li> + +<li><b>term</b> - The term which identifies docs to be marked as deleted. 
+If <code>field</code> is associated with an Analyzer, +<code>term</code> will be processed automatically (so don’t pre-process it yourself).</li> +</ul> + +<h3><a class='u' +name="delete_by_query" +>delete_by_query</a></h3> + +<pre>$deletions_writer->delete_by_query($query);</pre> + +<p>Delete all documents in the index that match <code>query</code>.</p> + +<ul> +<li><b>query</b> - A <a href="../../Lucy/Search/Query.html" class="podlinkpod" +>Query</a>.</li> +</ul> + +<h3><a class='u' +name="updated" +>updated</a></h3> + +<pre>my $bool = $deletions_writer->updated();</pre> + +<p>Returns true if there are updates that need to be written.</p> + +<h3><a class='u' +name="seg_del_count" +>seg_del_count</a></h3> + +<pre>my $int = $deletions_writer->seg_del_count($seg_name);</pre> + +<p>Return the number of deletions for a given segment.</p> + +<ul> +<li><b>seg_name</b> - The name of the segment.</li> +</ul> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Index::DeletionsWriter isa <a href="../../Lucy/Index/DataWriter.html" class="podlinkpod" +>Lucy::Index::DataWriter</a> isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/DocReader.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/DocReader.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/DocReader.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/DocReader.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,69 @@ +Title: Lucy::Index::DocReader – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Index::DocReader - Retrieve stored documents.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>my $doc_reader = $seg_reader->obtain("Lucy::Index::DocReader"); +my $doc = 
$doc_reader->fetch_doc($doc_id);</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>DocReader defines the interface by which documents (with all stored fields) are retrieved from the index. +The default implementation returns <a href="../../Lucy/Document/HitDoc.html" class="podlinkpod" +>HitDoc</a> objects.</p> + +<h2><a class='u' +name="ABSTRACT_METHODS" +>ABSTRACT METHODS</a></h2> + +<h3><a class='u' +name="fetch_doc" +>fetch_doc</a></h3> + +<pre>my $hit_doc = $doc_reader->fetch_doc($doc_id);</pre> + +<p>Retrieve the document identified by <code>doc_id</code>.</p> + +<p>Returns: a HitDoc.</p> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="aggregator" +>aggregator</a></h3> + +<pre>my $result = $doc_reader->aggregator( + readers => $readers, # required + offsets => $offsets, # required +);</pre> + +<p>Returns a DocReader which divvies up requests to its sub-readers according to the offset range.</p> + +<ul> +<li><b>readers</b> - An array of DocReaders.</li> + +<li><b>offsets</b> - Doc id start offsets for each reader.</li> +</ul> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Index::DocReader isa <a href="../../Lucy/Index/DataReader.html" class="podlinkpod" +>Lucy::Index::DataReader</a> isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/IndexManager.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/IndexManager.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/IndexManager.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/IndexManager.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,173 @@ +Title: Lucy::Index::IndexManager – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + 
+<p>Lucy::Index::IndexManager - Policies governing index updating, +locking, +and file deletion.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>use Sys::Hostname qw( hostname ); +my $hostname = hostname() or die "Can't get unique hostname"; +my $manager = Lucy::Index::IndexManager->new( + host => $hostname, +); + +# Index time: +my $indexer = Lucy::Index::Indexer->new( + index => '/path/to/index', + manager => $manager, +); + +# Search time: +my $reader = Lucy::Index::IndexReader->open( + index => '/path/to/index', + manager => $manager, +); +my $searcher = Lucy::Search::IndexSearcher->new( index => $reader );</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>IndexManager is an advanced-use class for controlling index locking, +updating, +merging, +and deletion behaviors.</p> + +<p>IndexManager and <a href="../../Lucy/Plan/Architecture.html" class="podlinkpod" +>Architecture</a> are complementary classes: Architecture is used to define traits and behaviors which cannot change for the life of an index; IndexManager is used for defining rules which may change from process to process.</p> + +<h2><a class='u' +name="CONSTRUCTORS" +>CONSTRUCTORS</a></h2> + +<h3><a class='u' +name="new" +>new</a></h3> + +<pre>my $manager = Lucy::Index::IndexManager->new( + host => $hostname, # default: "" +);</pre> + +<p>Create a new IndexManager.</p> + +<ul> +<li><b>host</b> - An identifier which should be unique per-machine.</li> + +<li><b>lock_factory</b> - A LockFactory.</li> +</ul> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="set_folder" +>set_folder</a></h3> + +<pre>$index_manager->set_folder($folder); +$index_manager->set_folder(); # default: undef</pre> + +<p>Setter for <code>folder</code> member. 
+Typical clients (Indexer, +IndexReader) will use this method to install their own Folder instance.</p> + +<h3><a class='u' +name="get_folder" +>get_folder</a></h3> + +<pre>my $folder = $index_manager->get_folder();</pre> + +<p>Getter for <code>folder</code> member.</p> + +<h3><a class='u' +name="get_host" +>get_host</a></h3> + +<pre>my $string = $index_manager->get_host();</pre> + +<p>Getter for <code>host</code> member.</p> + +<h3><a class='u' +name="recycle" +>recycle</a></h3> + +<pre>my $arrayref = $index_manager->recycle( + reader => $reader, # required + del_writer => $del_writer, # required + cutoff => $cutoff, # required + optimize => $optimize, # default: false +);</pre> + +<p>Return an array of SegReaders representing segments that should be consolidated. +Implementations must balance index-time churn against search-time degradation due to segment proliferation. +The default implementation prefers small segments or segments with a high proportion of deletions.</p> + +<ul> +<li><b>reader</b> - A PolyReader.</li> + +<li><b>del_writer</b> - A DeletionsWriter.</li> + +<li><b>cutoff</b> - A segment number which all returned SegReaders must exceed.</li> + +<li><b>optimize</b> - A boolean indicating whether to spend extra time optimizing the index for search-time performance.</li> +</ul> + +<h3><a class='u' +name="make_write_lock" +>make_write_lock</a></h3> + +<pre>my $lock = $index_manager->make_write_lock();</pre> + +<p>Create the Lock which controls access to modifying the logical content of the index.</p> + +<h3><a class='u' +name="set_write_lock_timeout" +>set_write_lock_timeout</a></h3> + +<pre>$index_manager->set_write_lock_timeout($timeout);</pre> + +<p>Setter for write lock timeout. 
+Default: 1000 milliseconds.</p> + +<h3><a class='u' +name="get_write_lock_timeout" +>get_write_lock_timeout</a></h3> + +<pre>my $int = $index_manager->get_write_lock_timeout();</pre> + +<p>Getter for write lock timeout.</p> + +<h3><a class='u' +name="set_write_lock_interval" +>set_write_lock_interval</a></h3> + +<pre>$index_manager->set_write_lock_interval($timeout);</pre> + +<p>Setter for write lock retry interval. +Default: 100 milliseconds.</p> + +<h3><a class='u' +name="get_write_lock_interval" +>get_write_lock_interval</a></h3> + +<pre>my $int = $index_manager->get_write_lock_interval();</pre> + +<p>Getter for write lock retry interval.</p> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Index::IndexManager isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/IndexReader.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/IndexReader.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/IndexReader.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/IndexReader.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,164 @@ +Title: Lucy::Index::IndexReader – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Index::IndexReader - Read from an inverted index.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>my $reader = Lucy::Index::IndexReader->open( + index => '/path/to/index', +); +my $seg_readers = $reader->seg_readers; +for my $seg_reader (@$seg_readers) { + my $seg_name = $seg_reader->get_segment->get_name; + my $num_docs = $seg_reader->doc_max; + print "Segment $seg_name ($num_docs documents):\n"; + my $doc_reader = $seg_reader->obtain("Lucy::Index::DocReader"); + for my $doc_id ( 1 .. 
$num_docs ) { + my $doc = $doc_reader->fetch_doc($doc_id); + print " $doc_id: $doc->{title}\n"; + } +}</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>IndexReader is the interface through which <a href="../../Lucy/Search/IndexSearcher.html" class="podlinkpod" +>IndexSearcher</a> objects access the content of an index.</p> + +<p>IndexReader objects always represent a point-in-time view of an index as it existed at the moment the reader was created. +If you want search results to reflect modifications to an index, +you must create a new IndexReader after the update process completes.</p> + +<p>IndexReaders are composites; most of the work is done by individual <a href="../../Lucy/Index/DataReader.html" class="podlinkpod" +>DataReader</a> sub-components, +which may be accessed via <a href="#fetch" class="podlinkpod" +>fetch()</a> and <a href="#obtain" class="podlinkpod" +>obtain()</a>. +The most efficient and powerful access to index data happens at the segment level via <a href="../../Lucy/Index/SegReader.html" class="podlinkpod" +>SegReader</a>’s sub-components.</p> + +<h2><a class='u' +name="CONSTRUCTORS" +>CONSTRUCTORS</a></h2> + +<h3><a class='u' +name="open" +>open</a></h3> + +<pre>my $reader = Lucy::Index::IndexReader->open( + index => '/path/to/index', # required + snapshot => $snapshot, + manager => $index_manager, +);</pre> + +<p>IndexReader is an abstract base class; open() returns the IndexReader subclass PolyReader, +which channels the output of 0 or more SegReaders.</p> + +<ul> +<li><b>index</b> - Either a string filepath or a Folder.</li> + +<li><b>snapshot</b> - A Snapshot. +If not supplied, +the most recent snapshot file will be used.</li> + +<li><b>manager</b> - An <a href="../../Lucy/Index/IndexManager.html" class="podlinkpod" +>IndexManager</a>. 
+Read-locking is off by default; supplying this argument turns it on.</li> +</ul> + +<h2><a class='u' +name="ABSTRACT_METHODS" +>ABSTRACT METHODS</a></h2> + +<h3><a class='u' +name="doc_max" +>doc_max</a></h3> + +<pre>my $int = $index_reader->doc_max();</pre> + +<p>Return the maximum number of documents available to the reader, +which is also the highest possible internal document id. +Documents which have been marked as deleted but not yet purged from the index are included in this count.</p> + +<h3><a class='u' +name="doc_count" +>doc_count</a></h3> + +<pre>my $int = $index_reader->doc_count();</pre> + +<p>Return the number of documents available to the reader, +subtracting any that are marked as deleted.</p> + +<h3><a class='u' +name="del_count" +>del_count</a></h3> + +<pre>my $int = $index_reader->del_count();</pre> + +<p>Return the number of documents which have been marked as deleted but not yet purged from the index.</p> + +<h3><a class='u' +name="offsets" +>offsets</a></h3> + +<pre>my $i32_array = $index_reader->offsets();</pre> + +<p>Return an array with one entry for each segment, +corresponding to segment doc_id start offset.</p> + +<h3><a class='u' +name="seg_readers" +>seg_readers</a></h3> + +<pre>my $arrayref = $index_reader->seg_readers();</pre> + +<p>Return an array of all the SegReaders represented within the IndexReader.</p> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="obtain" +>obtain</a></h3> + +<pre>my $data_reader = $index_reader->obtain($api);</pre> + +<p>Fetch a component, +or throw an error if the component can’t be found.</p> + +<ul> +<li><b>api</b> - The name of the DataReader subclass that the desired component must implement.</li> +</ul> + +<h3><a class='u' +name="fetch" +>fetch</a></h3> + +<pre>my $data_reader = $index_reader->fetch($api);</pre> + +<p>Fetch a component, +or return undef if the component can’t be found.</p> + +<ul> +<li><b>api</b> - The name of the DataReader subclass that the 
desired component must implement.</li> +</ul> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Index::IndexReader isa <a href="../../Lucy/Index/DataReader.html" class="podlinkpod" +>Lucy::Index::DataReader</a> isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/Indexer.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/Indexer.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/Indexer.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/Indexer.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,253 @@ +Title: Lucy::Index::Indexer – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Index::Indexer - Build inverted indexes.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>my $indexer = Lucy::Index::Indexer->new( + schema => $schema, + index => '/path/to/index', + create => 1, +); +while ( my ( $title, $content ) = each %source_docs ) { + $indexer->add_doc({ + title => $title, + content => $content, + }); +} +$indexer->commit;</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>The Indexer class is Apache Lucy’s primary tool for managing the content of inverted indexes, +which may later be searched using <a href="../../Lucy/Search/IndexSearcher.html" class="podlinkpod" +>IndexSearcher</a>.</p> + +<p>In general, +only one Indexer at a time may write to an index safely. 
+If a write lock cannot be secured, +new() will throw an exception.</p> + +<p>If an index is located on a shared volume, +each writer application must identify itself by supplying an <a href="../../Lucy/Index/IndexManager.html" class="podlinkpod" +>IndexManager</a> with a unique <code>host</code> id to Indexer’s constructor or index corruption will occur. +See <a href="../../Lucy/Docs/FileLocking.html" class="podlinkpod" +>FileLocking</a> for a detailed discussion.</p> + +<p>Note: at present, +<a href="#delete_by_term" class="podlinkpod" +>delete_by_term()</a> and <a href="#delete_by_query" class="podlinkpod" +>delete_by_query()</a> only affect documents which had been previously committed to the index – and not any documents added this indexing session but not yet committed. +This may change in a future update.</p> + +<h2><a class='u' +name="CONSTRUCTORS" +>CONSTRUCTORS</a></h2> + +<h3><a class='u' +name="new" +>new</a></h3> + +<pre>my $indexer = Lucy::Index::Indexer->new( + schema => $schema, # required at index creation + index => '/path/to/index', # required + create => 1, # default: 0 + truncate => 1, # default: 0 + manager => $manager # default: created internally +);</pre> + +<ul> +<li><b>schema</b> - A Schema. +Required when index is being created; if not supplied, +will be extracted from the index folder.</li> + +<li><b>index</b> - Either a filepath to an index or a Folder.</li> + +<li><b>create</b> - If true and the index directory does not exist, +attempt to create it.</li> + +<li><b>truncate</b> - If true, +proceed with the intention of discarding all previous indexing data. 
+The old data will remain intact and visible until commit() succeeds.</li> + +<li><b>manager</b> - An IndexManager.</li> +</ul> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="add_doc" +>add_doc</a></h3> + +<pre>$indexer->add_doc($doc); +$indexer->add_doc( { field_name => $field_value } ); +$indexer->add_doc( + doc => { field_name => $field_value }, + boost => 2.5, # default: 1.0 +);</pre> + +<p>Add a document to the index. +Accepts either a single argument or labeled params.</p> + +<ul> +<li><b>doc</b> - Either a Lucy::Document::Doc object, +or a hashref (which will be attached to a Lucy::Document::Doc object internally).</li> + +<li><b>boost</b> - A floating point weight which affects how this document scores.</li> +</ul> + +<h3><a class='u' +name="add_index" +>add_index</a></h3> + +<pre>$indexer->add_index($index);</pre> + +<p>Absorb an existing index into this one. +The two indexes must have matching Schemas.</p> + +<ul> +<li><b>index</b> - Either an index path name or a Folder.</li> +</ul> + +<h3><a class='u' +name="delete_by_term" +>delete_by_term</a></h3> + +<pre>$indexer->delete_by_term( + field => $field, # required + term => $term, # required +);</pre> + +<p>Mark documents which contain the supplied term as deleted, +so that they will be excluded from search results and eventually removed altogether. +The change is not apparent to search apps until after <a href="#commit" class="podlinkpod" +>commit()</a> succeeds.</p> + +<ul> +<li><b>field</b> - The name of an indexed field. +(If it is not spec’d as <code>indexed</code>, +an error will occur.)</li> + +<li><b>term</b> - The term which identifies docs to be marked as deleted. 
+If <code>field</code> is associated with an Analyzer, +<code>term</code> will be processed automatically (so don’t pre-process it yourself).</li> +</ul> + +<h3><a class='u' +name="delete_by_query" +>delete_by_query</a></h3> + +<pre>$indexer->delete_by_query($query);</pre> + +<p>Mark documents which match the supplied Query as deleted.</p> + +<ul> +<li><b>query</b> - A <a href="../../Lucy/Search/Query.html" class="podlinkpod" +>Query</a>.</li> +</ul> + +<h3><a class='u' +name="delete_by_doc_id" +>delete_by_doc_id</a></h3> + +<pre>$indexer->delete_by_doc_id($doc_id);</pre> + +<p>Mark the document identified by the supplied document ID as deleted.</p> + +<ul> +<li><b>doc_id</b> - A <a href="../../Lucy/Docs/DocIDs.html" class="podlinkpod" +>document id</a>.</li> +</ul> + +<h3><a class='u' +name="optimize" +>optimize</a></h3> + +<pre>$indexer->optimize();</pre> + +<p>Optimize the index for search-time performance. +This may take a while, +as it can involve rewriting large amounts of data.</p> + +<p>Every Indexer session which changes index content and ends in a <a href="#commit" class="podlinkpod" +>commit()</a> creates a new segment. +Once written, +segments are never modified. +However, +they are periodically recycled by feeding their content into the segment currently being written.</p> + +<p>The <a href="#optimize" class="podlinkpod" +>optimize()</a> method causes all existing index content to be fed back into the Indexer. +When <a href="#commit" class="podlinkpod" +>commit()</a> completes after an <a href="#optimize" class="podlinkpod" +>optimize()</a>, +the index will consist of one segment. +So <a href="#optimize" class="podlinkpod" +>optimize()</a> must be called before <a href="#commit" class="podlinkpod" +>commit()</a>. +Also, +optimizing a fresh index created from scratch has no effect.</p> + +<p>Historically, +there was a significant search-time performance benefit to collapsing down to a single segment versus even two segments. 
+Now the effect of collapsing is much less significant, +and calling <a href="#optimize" class="podlinkpod" +>optimize()</a> is rarely justified.</p> + +<h3><a class='u' +name="commit" +>commit</a></h3> + +<pre>$indexer->commit();</pre> + +<p>Commit any changes made to the index. +Until this is called, +none of the changes made during an indexing session are permanent.</p> + +<p>Calling <a href="#commit" class="podlinkpod" +>commit()</a> invalidates the Indexer, +so if you want to make more changes you’ll need a new one.</p> + +<h3><a class='u' +name="prepare_commit" +>prepare_commit</a></h3> + +<pre>$indexer->prepare_commit();</pre> + +<p>Perform the expensive setup for <a href="#commit" class="podlinkpod" +>commit()</a> in advance, +so that <a href="#commit" class="podlinkpod" +>commit()</a> completes quickly. +(If <a href="#prepare_commit" class="podlinkpod" +>prepare_commit()</a> is not called explicitly by the user, +<a href="#commit" class="podlinkpod" +>commit()</a> will call it internally.)</p> + +<h3><a class='u' +name="get_schema" +>get_schema</a></h3> + +<pre>my $schema = $indexer->get_schema();</pre> + +<p>Accessor for schema.</p> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Index::Indexer isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/Lexicon.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/Lexicon.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/Lexicon.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/Lexicon.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,87 @@ +Title: Lucy::Index::Lexicon – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Index::Lexicon - Iterator for a field’s terms.</p> + +<h2><a class='u' 
+name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>my $lex_reader = $seg_reader->obtain('Lucy::Index::LexiconReader'); +my $lexicon = $lex_reader->lexicon( field => 'content' ); +while ( $lexicon->next ) { + print $lexicon->get_term . "\n"; +}</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>A Lexicon is an iterator which provides access to all the unique terms for a given field in sorted order.</p> + +<p>If an index consists of two documents with a ‘content’ field holding “three blind mice” and “three musketeers” respectively, +then iterating through the ‘content’ field’s lexicon would produce this list:</p> + +<pre>blind +mice +musketeers +three</pre> + +<h2><a class='u' +name="ABSTRACT_METHODS" +>ABSTRACT METHODS</a></h2> + +<h3><a class='u' +name="seek" +>seek</a></h3> + +<pre>$lexicon->seek($target); +$lexicon->seek(); # default: undef</pre> + +<p>Seek the Lexicon to the first iterator state which is greater than or equal to <code>target</code>. +If <code>target</code> is undef, +reset the iterator.</p> + +<h3><a class='u' +name="next" +>next</a></h3> + +<pre>my $bool = $lexicon->next();</pre> + +<p>Proceed to the next term.</p> + +<p>Returns: true until the iterator is exhausted, +then false.</p> + +<h3><a class='u' +name="reset" +>reset</a></h3> + +<pre>$lexicon->reset();</pre> + +<p>Reset the iterator. 
+<a href="#next" class="podlinkpod" +>next()</a> must be called to proceed to the first element.</p> + +<h3><a class='u' +name="get_term" +>get_term</a></h3> + +<pre>my $obj = $lexicon->get_term();</pre> + +<p>Return the current term, +or undef if the iterator is not in a valid state.</p> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Index::Lexicon isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/LexiconReader.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/LexiconReader.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/LexiconReader.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/LexiconReader.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,87 @@ +Title: Lucy::Index::LexiconReader – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Index::LexiconReader - Read Lexicon data.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>my $lex_reader = $seg_reader->obtain("Lucy::Index::LexiconReader"); +my $lexicon = $lex_reader->lexicon( field => 'title' );</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>LexiconReader reads term dictionary information.</p> + +<h2><a class='u' +name="ABSTRACT_METHODS" +>ABSTRACT METHODS</a></h2> + +<h3><a class='u' +name="lexicon" +>lexicon</a></h3> + +<pre>my $lexicon = $lexicon_reader->lexicon( + field => $field, # required + term => $term, # default: undef +);</pre> + +<p>Return a new Lexicon for the given <code>field</code>. 
+Returns undef if the field is not indexed, +or if no documents contain a value for the field.</p> + +<ul> +<li><b>field</b> - Field name.</li> + +<li><b>term</b> - Pre-locate the Lexicon to this term.</li> +</ul> + +<h3><a class='u' +name="doc_freq" +>doc_freq</a></h3> + +<pre>my $int = $lexicon_reader->doc_freq( + field => $field, # required + term => $term, # required +);</pre> + +<p>Return the number of documents where the specified term is present.</p> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="aggregator" +>aggregator</a></h3> + +<pre>my $result = $lexicon_reader->aggregator( + readers => $readers, # required + offsets => $offsets, # required +);</pre> + +<p>Return a LexiconReader which merges the output of other LexiconReaders.</p> + +<ul> +<li><b>readers</b> - An array of LexiconReaders.</li> + +<li><b>offsets</b> - Doc id start offsets for each reader.</li> +</ul> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Index::LexiconReader isa <a href="../../Lucy/Index/DataReader.html" class="podlinkpod" +>Lucy::Index::DataReader</a> isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/PolyReader.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/PolyReader.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/PolyReader.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/PolyReader.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,95 @@ +Title: Lucy::Index::PolyReader – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Index::PolyReader - Multi-segment implementation of IndexReader.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>my $polyreader = Lucy::Index::IndexReader->open( + 
index => '/path/to/index', +); +my $doc_reader = $polyreader->obtain("Lucy::Index::DocReader"); +for my $doc_id ( 1 .. $polyreader->doc_max ) { + my $doc = $doc_reader->fetch_doc($doc_id); + print " $doc_id: $doc->{title}\n"; +}</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>PolyReader conflates index data from multiple segments. +For instance, +if an index contains three segments with 10 documents each, +PolyReader’s <a href="../../Lucy/Index/IndexReader.html#doc_max" class="podlinkpod" +>doc_max()</a> method will return 30.</p> + +<p>Some of PolyReader’s <a href="../../Lucy/Index/DataReader.html" class="podlinkpod" +>DataReader</a> components may be less efficient or complete than the single-segment implementations accessed via <a href="../../Lucy/Index/SegReader.html" class="podlinkpod" +>SegReader</a>.</p> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="doc_max" +>doc_max</a></h3> + +<pre>my $int = $poly_reader->doc_max();</pre> + +<p>Return the maximum number of documents available to the reader, +which is also the highest possible internal document id. 
+Documents which have been marked as deleted but not yet purged from the index are included in this count.</p> + +<h3><a class='u' +name="doc_count" +>doc_count</a></h3> + +<pre>my $int = $poly_reader->doc_count();</pre> + +<p>Return the number of documents available to the reader, +subtracting any that are marked as deleted.</p> + +<h3><a class='u' +name="del_count" +>del_count</a></h3> + +<pre>my $int = $poly_reader->del_count();</pre> + +<p>Return the number of documents which have been marked as deleted but not yet purged from the index.</p> + +<h3><a class='u' +name="offsets" +>offsets</a></h3> + +<pre>my $i32_array = $poly_reader->offsets();</pre> + +<p>Return an array with one entry for each segment, +corresponding to segment doc_id start offset.</p> + +<h3><a class='u' +name="seg_readers" +>seg_readers</a></h3> + +<pre>my $arrayref = $poly_reader->seg_readers();</pre> + +<p>Return an array of all the SegReaders represented within the IndexReader.</p> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Index::PolyReader isa <a href="../../Lucy/Index/IndexReader.html" class="podlinkpod" +>Lucy::Index::IndexReader</a> isa <a href="../../Lucy/Index/DataReader.html" class="podlinkpod" +>Lucy::Index::DataReader</a> isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/PostingList.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/PostingList.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/PostingList.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/PostingList.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,70 @@ +Title: Lucy::Index::PostingList – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Index::PostingList - Term-Document 
pairings.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>my $posting_list_reader + = $seg_reader->obtain("Lucy::Index::PostingListReader"); +my $posting_list = $posting_list_reader->posting_list( + field => 'content', + term => 'foo', +); +while ( my $doc_id = $posting_list->next ) { + print "Matching doc id: $doc_id\n"; +}</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>PostingList is an iterator which supplies a list of document ids that match a given term.</p> + +<p>See <a href="../../Lucy/Docs/IRTheory.html" class="podlinkpod" +>IRTheory</a> for definitions of “posting” and “posting list”.</p> + +<h2><a class='u' +name="ABSTRACT_METHODS" +>ABSTRACT METHODS</a></h2> + +<h3><a class='u' +name="get_doc_freq" +>get_doc_freq</a></h3> + +<pre>my $int = $posting_list->get_doc_freq();</pre> + +<p>Return the number of documents that the PostingList contains. +(This number will include any documents which have been marked as deleted but not yet purged.)</p> + +<h3><a class='u' +name="seek" +>seek</a></h3> + +<pre>$posting_list->seek($target); +$posting_list->seek(); # default: undef</pre> + +<p>Prepare the PostingList object to iterate over the documents that match <code>target</code>.</p> + +<ul> +<li><b>target</b> - The term to match. 
+If undef, +the iterator will be empty.</li> +</ul> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Index::PostingList isa <a href="../../Lucy/Search/Matcher.html" class="podlinkpod" +>Lucy::Search::Matcher</a> isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/PostingListReader.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/PostingListReader.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/PostingListReader.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/PostingListReader.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,76 @@ +Title: Lucy::Index::PostingListReader – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Index::PostingListReader - Read postings data.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>my $posting_list_reader + = $seg_reader->obtain("Lucy::Index::PostingListReader"); +my $posting_list = $posting_list_reader->posting_list( + field => 'title', + term => 'foo', +);</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>PostingListReaders produce <a href="../../Lucy/Index/PostingList.html" class="podlinkpod" +>PostingList</a> objects which convey document matching information.</p> + +<h2><a class='u' +name="ABSTRACT_METHODS" +>ABSTRACT METHODS</a></h2> + +<h3><a class='u' +name="posting_list" +>posting_list</a></h3> + +<pre>my $posting_list = $posting_list_reader->posting_list( + field => $field, # default: undef + term => $term, # default: undef +);</pre> + +<p>Returns a PostingList, +or undef if either <code>field</code> is undef or <code>field</code> is not present in any documents.</p> + +<ul> +<li><b>field</b> - A field name.</li> + +<li><b>term</b> - If supplied, +the 
PostingList will be pre-located to this term using <a href="../../Lucy/Index/PostingList.html#seek" class="podlinkpod" +>seek()</a>.</li> +</ul> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="aggregator" +>aggregator</a></h3> + +<pre>my $result = $posting_list_reader->aggregator( + readers => $readers, # required + offsets => $offsets, # required +);</pre> + +<p>Returns undef since PostingLists may only be iterated at the segment level.</p> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Index::PostingListReader isa <a href="../../Lucy/Index/DataReader.html" class="podlinkpod" +>Lucy::Index::DataReader</a> isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/SegReader.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/SegReader.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/SegReader.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/SegReader.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,115 @@ +Title: Lucy::Index::SegReader – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Index::SegReader - Single-segment IndexReader.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>my $polyreader = Lucy::Index::IndexReader->open( + index => '/path/to/index', +); +my $seg_readers = $polyreader->seg_readers; +for my $seg_reader (@$seg_readers) { + my $seg_name = $seg_reader->get_seg_name; + my $num_docs = $seg_reader->doc_max; + print "Segment $seg_name ($num_docs documents):\n"; + my $doc_reader = $seg_reader->obtain("Lucy::Index::DocReader"); + for my $doc_id ( 1 .. 
$num_docs ) { + my $doc = $doc_reader->fetch_doc($doc_id); + print " $doc_id: $doc->{title}\n"; + } +}</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>SegReader interprets the data within a single segment of an index.</p> + +<p>Generally speaking, +only advanced users writing subclasses which manipulate data at the segment level need to deal with the SegReader API directly.</p> + +<p>Nearly all of SegReader’s functionality is implemented by pluggable components spawned by <a href="../../Lucy/Plan/Architecture.html" class="podlinkpod" +>Architecture</a>’s factory methods.</p> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="get_seg_name" +>get_seg_name</a></h3> + +<pre>my $string = $seg_reader->get_seg_name();</pre> + +<p>Return the name of the segment.</p> + +<h3><a class='u' +name="get_seg_num" +>get_seg_num</a></h3> + +<pre>my $int = $seg_reader->get_seg_num();</pre> + +<p>Return the number of the segment.</p> + +<h3><a class='u' +name="del_count" +>del_count</a></h3> + +<pre>my $int = $seg_reader->del_count();</pre> + +<p>Return the number of documents which have been marked as deleted but not yet purged from the index.</p> + +<h3><a class='u' +name="doc_max" +>doc_max</a></h3> + +<pre>my $int = $seg_reader->doc_max();</pre> + +<p>Return the maximum number of documents available to the reader, +which is also the highest possible internal document id. 
+Documents which have been marked as deleted but not yet purged from the index are included in this count.</p> + +<h3><a class='u' +name="doc_count" +>doc_count</a></h3> + +<pre>my $int = $seg_reader->doc_count();</pre> + +<p>Return the number of documents available to the reader, +subtracting any that are marked as deleted.</p> + +<h3><a class='u' +name="_offsets" +>_offsets</a></h3> + +<pre>my $i32_array = $seg_reader->_offsets();</pre> + +<p>Return an array with one entry for each segment, +corresponding to segment doc_id start offset.</p> + +<h3><a class='u' +name="seg_readers" +>seg_readers</a></h3> + +<pre>my $arrayref = $seg_reader->seg_readers();</pre> + +<p>Return an array of all the SegReaders represented within the IndexReader.</p> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Index::SegReader isa <a href="../../Lucy/Index/IndexReader.html" class="podlinkpod" +>Lucy::Index::IndexReader</a> isa <a href="../../Lucy/Index/DataReader.html" class="podlinkpod" +>Lucy::Index::DataReader</a> isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/SegWriter.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/SegWriter.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/SegWriter.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/SegWriter.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,126 @@ +Title: Lucy::Index::SegWriter – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Index::SegWriter - Write one segment of an index.</p> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>SegWriter is a conduit through which information fed to Indexer passes. 
+It manages <a href="../../Lucy/Index/Segment.html" class="podlinkpod" +>Segment</a> and Inverter, +invokes the <a href="../../Lucy/Analysis/Analyzer.html" class="podlinkpod" +>Analyzer</a> chain, +and feeds low level <a href="../../Lucy/Index/DataWriter.html" class="podlinkpod" +>DataWriters</a> such as PostingListWriter and DocWriter.</p> + +<p>The sub-components of a SegWriter are determined by <a href="../../Lucy/Plan/Architecture.html" class="podlinkpod" +>Architecture</a>. +DataWriter components which are added to the stack of writers via <a href="#add_writer" class="podlinkpod" +>add_writer()</a> have Add_Inverted_Doc() invoked for each document supplied to SegWriter’s <a href="#add_doc" class="podlinkpod" +>add_doc()</a>.</p> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="register" +>register</a></h3> + +<pre>$seg_writer->register( + api => $api, # required + component => $component, # required +);</pre> + +<p>Register a DataWriter component with the SegWriter. +(Note that registration simply makes the writer available via <a href="#fetch" class="podlinkpod" +>fetch()</a>, +so you may also want to call <a href="#add_writer" class="podlinkpod" +>add_writer()</a>.)</p> + +<ul> +<li><b>api</b> - The name of the DataWriter api which <code>component</code> implements.</li> + +<li><b>component</b> - A DataWriter.</li> +</ul> + +<h3><a class='u' +name="fetch" +>fetch</a></h3> + +<pre>my $obj = $seg_writer->fetch($api);</pre> + +<p>Retrieve a registered component.</p> + +<ul> +<li><b>api</b> - The name of the DataWriter api which the component implements.</li> +</ul> + +<h3><a class='u' +name="add_writer" +>add_writer</a></h3> + +<pre>$seg_writer->add_writer($writer);</pre> + +<p>Add a DataWriter to the SegWriter’s stack of writers.</p> + +<h3><a class='u' +name="add_doc" +>add_doc</a></h3> + +<pre>$seg_writer->add_doc( + doc => $doc, # required + boost => $boost, # default: 1.0 +);</pre> + +<p>Add a document to the segment. 
+Inverts <code>doc</code>, +increments the Segment’s internal document id, +then calls Add_Inverted_Doc(), +feeding all sub-writers.</p> + +<h3><a class='u' +name="add_segment" +>add_segment</a></h3> + +<pre>$seg_writer->add_segment( + reader => $reader, # required + doc_map => $doc_map, # default: undef +);</pre> + +<p>Add content from an existing segment into the one currently being written.</p> + +<ul> +<li><b>reader</b> - The SegReader containing content to add.</li> + +<li><b>doc_map</b> - An array of integers mapping old document ids to new. +Deleted documents are mapped to 0, +indicating that they should be skipped.</li> +</ul> + +<h3><a class='u' +name="finish" +>finish</a></h3> + +<pre>$seg_writer->finish();</pre> + +<p>Complete the segment: close all streams, +store metadata, +etc.</p> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Index::SegWriter isa <a href="../../Lucy/Index/DataWriter.html" class="podlinkpod" +>Lucy::Index::DataWriter</a> isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/Segment.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/Segment.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/Segment.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/Segment.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,182 @@ +Title: Lucy::Index::Segment – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Index::Segment - Warehouse for information about one segment of an inverted index.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre># Index-time. 
+package MyDataWriter; +use base qw( Lucy::Index::DataWriter ); + +sub finish { + my $self = shift; + my $segment = $self->get_segment; + my $metadata = $self->SUPER::metadata(); + $metadata->{foo} = $self->get_foo; + $segment->store_metadata( + key => 'my_component', + metadata => $metadata + ); +} + +# Search-time. +package MyDataReader; +use base qw( Lucy::Index::DataReader ); + +sub new { + my $self = shift->SUPER::new(@_); + my $segment = $self->get_segment; + my $metadata = $segment->fetch_metadata('my_component'); + if ($metadata) { + $self->set_foo( $metadata->{foo} ); + ... + } + return $self; +}</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>Apache Lucy’s indexes are made up of individual “segments”, +each of which is an independent inverted index. +On the file system, +each segment is a directory within the main index directory whose name starts with “seg_”: “seg_2”, +“seg_5a”, +etc.</p> + +<p>Each Segment object keeps track of information about an index segment: its fields, +document count, +and so on. +The Segment object itself writes one file, +<code>segmeta.json</code>; besides storing info needed by Segment itself, +the “segmeta” file serves as a central repository for metadata generated by other index components – relieving them of the burden of storing metadata themselves.</p> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="add_field" +>add_field</a></h3> + +<pre>my $int = $segment->add_field($field);</pre> + +<p>Register a new field and assign it a field number. 
+If the field was already known, +nothing happens.</p> + +<ul> +<li><b>field</b> - Field name.</li> +</ul> + +<p>Returns: the field’s field number, +which is a positive integer.</p> + +<h3><a class='u' +name="store_metadata" +>store_metadata</a></h3> + +<pre>$segment->store_metadata( + key => $key, # required + metadata => $metadata, # required +);</pre> + +<p>Store arbitrary information in the segment’s metadata hash, +to be serialized later. +Throws an error if <code>key</code> is used twice.</p> + +<ul> +<li><b>key</b> - String identifying an index component.</li> + +<li><b>metadata</b> - JSON-izable data structure.</li> +</ul> + +<h3><a class='u' +name="fetch_metadata" +>fetch_metadata</a></h3> + +<pre>my $obj = $segment->fetch_metadata($key);</pre> + +<p>Fetch a value from the Segment’s metadata hash.</p> + +<h3><a class='u' +name="field_num" +>field_num</a></h3> + +<pre>my $int = $segment->field_num($field);</pre> + +<p>Given a field name, +return its field number for this segment (which may differ from its number in other segments). 
+Return 0 (an invalid field number) if the field name can’t be found.</p> + +<ul> +<li><b>field</b> - Field name.</li> +</ul> + +<h3><a class='u' +name="field_name" +>field_name</a></h3> + +<pre>my $string = $segment->field_name($field_num);</pre> + +<p>Given a field number, +return the name of its field, +or undef if the field name can’t be found.</p> + +<h3><a class='u' +name="get_name" +>get_name</a></h3> + +<pre>my $string = $segment->get_name();</pre> + +<p>Getter for the object’s seg name.</p> + +<h3><a class='u' +name="get_number" +>get_number</a></h3> + +<pre>my $int = $segment->get_number();</pre> + +<p>Getter for the segment number.</p> + +<h3><a class='u' +name="set_count" +>set_count</a></h3> + +<pre>$segment->set_count($count);</pre> + +<p>Setter for the object’s document count.</p> + +<h3><a class='u' +name="get_count" +>get_count</a></h3> + +<pre>my $int = $segment->get_count();</pre> + +<p>Getter for the object’s document count.</p> + +<h3><a class='u' +name="compare_to" +>compare_to</a></h3> + +<pre>my $int = $segment->compare_to($other);</pre> + +<p>Compare by segment number.</p> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Index::Segment isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/Similarity.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/Similarity.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/Similarity.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/Similarity.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,87 @@ +Title: Lucy::Index::Similarity – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Index::Similarity - Judge how well a document matches a query.</p> + +<h2><a class='u' +name="SYNOPSIS" 
+>SYNOPSIS</a></h2> + +<pre>package MySimilarity; +use base qw( Lucy::Index::Similarity ); + +sub length_norm { return 1.0 } # disable length normalization + +package MyFullTextType; +use base qw( Lucy::Plan::FullTextType ); + +sub make_similarity { MySimilarity->new }</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>After determining whether a document matches a given query, +a score must be calculated which indicates how <i>well</i> the document matches the query. +The Similarity class is used to judge how “similar” the query and the document are to each other; the closer the resemblance, +the higher the document scores.</p> + +<p>The default implementation uses Lucene’s modified cosine similarity measure. +Subclasses might tweak the existing algorithms, +or might be used in conjunction with custom Query subclasses to implement arbitrary scoring schemes.</p> + +<p>Most of the methods operate on single fields, +but some are used to combine scores from multiple fields.</p> + +<h2><a class='u' +name="CONSTRUCTORS" +>CONSTRUCTORS</a></h2> + +<h3><a class='u' +name="new" +>new</a></h3> + +<pre>my $sim = Lucy::Index::Similarity->new;</pre> + +<p>Constructor. +Takes no arguments.</p> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="length_norm" +>length_norm</a></h3> + +<pre>my $float = $similarity->length_norm($num_tokens);</pre> + +<p>Dampen the scores of long documents.</p> + +<p>After a field is broken up into terms at index-time, +each term must be assigned a weight. +One of the factors in calculating this weight is the number of tokens that the original field was broken into.</p> + +<p>Typically, +we assume that the more tokens in a field, +the less important any one of them is – so that, +e.g. +5 mentions of “Kafka” in a short article are given more heft than 5 mentions of “Kafka” in an entire book. 
+The default implementation of length_norm expresses this using an inverted square root.</p> + +<p>However, +the inverted square root has a tendency to reward very short fields highly, +which isn’t always appropriate for fields you expect to have a lot of tokens on average.</p> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Index::Similarity isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/Snapshot.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/Snapshot.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/Snapshot.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Index/Snapshot.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,156 @@ +Title: Lucy::Index::Snapshot – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Index::Snapshot - Point-in-time index file list.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>my $snapshot = Lucy::Index::Snapshot->new; +$snapshot->read_file( folder => $folder ); # load most recent snapshot +my $files = $snapshot->list; +print "$_\n" for @$files;</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>A Snapshot is a list of index files and folders. +Because index files, +once written, +are never modified, +a Snapshot defines a point-in-time view of the data in an index.</p> + +<p><a href="../../Lucy/Index/IndexReader.html" class="podlinkpod" +>IndexReader</a> objects interpret the data associated with a single Snapshot.</p> + +<h2><a class='u' +name="CONSTRUCTORS" +>CONSTRUCTORS</a></h2> + +<h3><a class='u' +name="new" +>new</a></h3> + +<pre>my $snapshot = Lucy::Index::Snapshot->new;</pre> + +<p>Constructor. 
+Takes no arguments.</p> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="list" +>list</a></h3> + +<pre>my $arrayref = $snapshot->list();</pre> + +<p>Return an array of all entries.</p> + +<h3><a class='u' +name="num_entries" +>num_entries</a></h3> + +<pre>my $int = $snapshot->num_entries();</pre> + +<p>Return the number of entries (including directories).</p> + +<h3><a class='u' +name="add_entry" +>add_entry</a></h3> + +<pre>$snapshot->add_entry($entry);</pre> + +<p>Add a filepath to the snapshot.</p> + +<h3><a class='u' +name="delete_entry" +>delete_entry</a></h3> + +<pre>my $bool = $snapshot->delete_entry($entry);</pre> + +<p>Delete a filepath from the snapshot.</p> + +<p>Returns: true if the entry existed and was successfully deleted, +false otherwise.</p> + +<h3><a class='u' +name="read_file" +>read_file</a></h3> + +<pre>my $result = $snapshot->read_file( + folder => $folder, # required + path => $path # default: undef +);</pre> + +<p>Decode a snapshot file and initialize the object to reflect its contents.</p> + +<ul> +<li><b>folder</b> - A Folder.</li> + +<li><b>path</b> - The location of the snapshot file. +If not supplied, +the most recent snapshot file in the base directory will be chosen.</li> +</ul> + +<p>Returns: the Snapshot object itself.</p> + +<h3><a class='u' +name="write_file" +>write_file</a></h3> + +<pre>$snapshot->write_file( + folder => $folder, # required + path => $path # default: undef +);</pre> + +<p>Write a snapshot file. +The caller must lock the index while this operation takes place, +and the operation will fail if the snapshot file already exists.</p> + +<ul> +<li><b>folder</b> - A Folder.</li> + +<li><b>path</b> - The path of the file to write. 
+If undef, +a file name will be chosen which supersedes the latest snapshot file in the index folder.</li> +</ul> + +<h3><a class='u' +name="set_path" +>set_path</a></h3> + +<pre>$snapshot->set_path($path);</pre> + +<p>Set the path to the file that the Snapshot object serves as a proxy for.</p> + +<h3><a class='u' +name="get_path" +>get_path</a></h3> + +<pre>my $string = $snapshot->get_path();</pre> + +<p>Get the path to the snapshot file. +Initially undef; updated by <a href="#read_file" class="podlinkpod" +>read_file()</a>, +<a href="#write_file" class="podlinkpod" +>write_file()</a>, +and <a href="#set_path" class="podlinkpod" +>set_path()</a>.</p> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Index::Snapshot isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Object/BitVector.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Object/BitVector.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Object/BitVector.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Object/BitVector.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,232 @@ +Title: Lucy::Object::BitVector – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Object::BitVector - An array of bits.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>my $bit_vec = Lucy::Object::BitVector->new( capacity => 8 ); +my $other = Lucy::Object::BitVector->new( capacity => 8 ); +$bit_vec->set($_) for ( 0, 2, 4, 6 ); +$other->set($_) for ( 1, 3, 5, 7 ); +$bit_vec->or($other); +print "$_\n" for @{ $bit_vec->to_array }; # prints 0 through 7.</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>BitVector is a growable array of bits. 
+All bits are initially zero.</p> + +<h2><a class='u' +name="CONSTRUCTORS" +>CONSTRUCTORS</a></h2> + +<h3><a class='u' +name="new" +>new</a></h3> + +<pre>my $bit_vec = Lucy::Object::BitVector->new( + capacity => $doc_max + 1, # default 0, +);</pre> + +<p>Create a new BitVector.</p> + +<ul> +<li><b>capacity</b> - The number of bits that the initial array should be able to hold.</li> +</ul> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="get" +>get</a></h3> + +<pre>my $bool = $bit_vector->get($tick);</pre> + +<p>Return true if the bit at <code>tick</code> has been set, +false if it hasn’t (regardless of whether it lies within the bounds of the object’s capacity).</p> + +<ul> +<li><b>tick</b> - The requested bit.</li> +</ul> + +<h3><a class='u' +name="set" +>set</a></h3> + +<pre>$bit_vector->set($tick);</pre> + +<p>Set the bit at <code>tick</code> to 1.</p> + +<ul> +<li><b>tick</b> - The bit to be set.</li> +</ul> + +<h3><a class='u' +name="next_hit" +>next_hit</a></h3> + +<pre>my $int = $bit_vector->next_hit($tick);</pre> + +<p>Returns the next set bit equal to or greater than <code>tick</code>, +or -1 if no such bit exists.</p> + +<h3><a class='u' +name="clear" +>clear</a></h3> + +<pre>$bit_vector->clear($tick);</pre> + +<p>Clear the indicated bit +(i.e. 
+set it to 0).</p> + +<ul> +<li><b>tick</b> - The bit to be cleared.</li> +</ul> + +<h3><a class='u' +name="clear_all" +>clear_all</a></h3> + +<pre>$bit_vector->clear_all();</pre> + +<p>Clear all bits.</p> + +<h3><a class='u' +name="grow" +>grow</a></h3> + +<pre>$bit_vector->grow($capacity);</pre> + +<p>If the BitVector does not already have enough room to hold the indicated number of bits, +allocate more memory so that it can.</p> + +<ul> +<li><b>capacity</b> - Least number of bits the BitVector should accommodate.</li> +</ul> + +<h3><a class='u' +name="and" +>and</a></h3> + +<pre>$bit_vector->and($other);</pre> + +<p>Modify the BitVector so that the only bits which remain set are those which 1) were already set in this BitVector, +and 2) were also set in the other BitVector.</p> + +<ul> +<li><b>other</b> - Another BitVector.</li> +</ul> + +<h3><a class='u' +name="or" +>or</a></h3> + +<pre>$bit_vector->or($other);</pre> + +<p>Modify the BitVector, +setting all bits which are set in the other BitVector if they were not already set.</p> + +<ul> +<li><b>other</b> - Another BitVector.</li> +</ul> + +<h3><a class='u' +name="xor" +>xor</a></h3> + +<pre>$bit_vector->xor($other);</pre> + +<p>Modify the BitVector, +performing an XOR operation against the other.</p> + +<ul> +<li><b>other</b> - Another BitVector.</li> +</ul> + +<h3><a class='u' +name="and_not" +>and_not</a></h3> + +<pre>$bit_vector->and_not($other);</pre> + +<p>Modify the BitVector, +clearing all bits which are set in the other.</p> + +<ul> +<li><b>other</b> - Another BitVector.</li> +</ul> + +<h3><a class='u' +name="flip" +>flip</a></h3> + +<pre>$bit_vector->flip($tick);</pre> + +<p>Invert the value of a bit.</p> + +<ul> +<li><b>tick</b> - The bit to invert.</li> +</ul> + +<h3><a class='u' +name="flip_block" +>flip_block</a></h3> + +<pre>$bit_vector->flip_block( + offset => $offset, # required + length => $length # required +);</pre> + +<p>Invert each bit within a contiguous block.</p> + +<ul> +<li><b>offset</b> 
- Lower bound.</li> + +<li><b>length</b> - The number of bits to flip.</li> +</ul> + +<h3><a class='u' +name="count" +>count</a></h3> + +<pre>my $int = $bit_vector->count();</pre> + +<p>Return a count of the number of set bits.</p> + +<h3><a class='u' +name="to_array" +>to_array</a></h3> + +<pre>my $i32_array = $bit_vector->to_array();</pre> + +<p>Return an array where each element represents a set bit.</p> + +<h3><a class='u' +name="clone" +>clone</a></h3> + +<pre>my $result = $bit_vector->clone();</pre> + +<p>Return a clone of the object.</p> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Object::BitVector isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Object/Obj.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Object/Obj.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Object/Obj.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Object/Obj.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,18 @@ +Title: Lucy::Object::Obj – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Object::Obj - Moved.</p> + +<h2><a class='u' +name="MOVED" +>MOVED</a></h2> + +<p>Lucy::Object::Obj has been moved to Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Plan/Architecture.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Plan/Architecture.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Plan/Architecture.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Plan/Architecture.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,123 @@ +Title: Lucy::Plan::Architecture – Apache Lucy Documentation + +<div> 
+<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Plan::Architecture - Configure major components of an index.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>package MyArchitecture; +use base qw( Lucy::Plan::Architecture ); + +use LucyX::Index::ZlibDocWriter; +use LucyX::Index::ZlibDocReader; + +sub register_doc_writer { + my ( $self, $seg_writer ) = @_; + my $doc_writer = LucyX::Index::ZlibDocWriter->new( + snapshot => $seg_writer->get_snapshot, + segment => $seg_writer->get_segment, + polyreader => $seg_writer->get_polyreader, + ); + $seg_writer->register( + api => "Lucy::Index::DocWriter", + component => $doc_writer, + ); + $seg_writer->add_writer($doc_writer); +} + +sub register_doc_reader { + my ( $self, $seg_reader ) = @_; + my $doc_reader = LucyX::Index::ZlibDocReader->new( + schema => $seg_reader->get_schema, + folder => $seg_reader->get_folder, + segments => $seg_reader->get_segments, + seg_tick => $seg_reader->get_seg_tick, + snapshot => $seg_reader->get_snapshot, + ); + $seg_reader->register( + api => 'Lucy::Index::DocReader', + component => $doc_reader, + ); +} + +package MySchema; +use base qw( Lucy::Plan::Schema ); + +sub architecture { + shift; + return MyArchitecture->new(@_); +}</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>By default, +a Lucy index consists of several main parts: lexicon, +postings, +stored documents, +deletions, +and highlight data. +The readers and writers for that data are spawned by Architecture. 
+Each component operates at the segment level; Architecture’s factory methods are used to build up <a href="../../Lucy/Index/SegWriter.html" class="podlinkpod" +>SegWriter</a> and <a href="../../Lucy/Index/SegReader.html" class="podlinkpod" +>SegReader</a>.</p> + +<h2><a class='u' +name="CONSTRUCTORS" +>CONSTRUCTORS</a></h2> + +<h3><a class='u' +name="new" +>new</a></h3> + +<pre>my $arch = Lucy::Plan::Architecture->new;</pre> + +<p>Constructor. +Takes no arguments.</p> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="register_doc_writer" +>register_doc_writer</a></h3> + +<pre>$architecture->register_doc_writer($writer);</pre> + +<p>Spawn a DataWriter and <a href="../../Lucy/Index/SegWriter.html#register" class="podlinkpod" +>register()</a> it with the supplied SegWriter, +adding it to the SegWriter’s writer stack.</p> + +<ul> +<li><b>writer</b> - A SegWriter.</li> +</ul> + +<h3><a class='u' +name="register_doc_reader" +>register_doc_reader</a></h3> + +<pre>$architecture->register_doc_reader($reader);</pre> + +<p>Spawn a DocReader and register it with the supplied SegReader.</p> + +<ul> +<li><b>reader</b> - A SegReader.</li> +</ul> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Plan::Architecture isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Plan/BlobType.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Plan/BlobType.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Plan/BlobType.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Plan/BlobType.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,54 @@ +Title: Lucy::Plan::BlobType – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Plan::BlobType - Default behaviors for binary 
fields.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>my $string_type = Lucy::Plan::StringType->new; +my $blob_type = Lucy::Plan::BlobType->new( stored => 1 ); +my $schema = Lucy::Plan::Schema->new; +$schema->spec_field( name => 'id', type => $string_type ); +$schema->spec_field( name => 'jpeg', type => $blob_type );</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>BlobType is an implementation of FieldType tuned for use with fields containing binary data, +which cannot be indexed or searched – only stored.</p> + +<h2><a class='u' +name="CONSTRUCTORS" +>CONSTRUCTORS</a></h2> + +<h3><a class='u' +name="new" +>new</a></h3> + +<pre>my $blob_type = Lucy::Plan::BlobType->new( + stored => 1, # default: false +);</pre> + +<p>Create a new BlobType.</p> + +<ul> +<li><b>stored</b> - boolean indicating whether the field should be stored.</li> +</ul> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Plan::BlobType isa <a href="../../Lucy/Plan/FieldType.html" class="podlinkpod" +>Lucy::Plan::FieldType</a> isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Plan/FieldType.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Plan/FieldType.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Plan/FieldType.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Plan/FieldType.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,138 @@ +Title: Lucy::Plan::FieldType – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Plan::FieldType - Define a field’s behavior.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>my @sortable; +for my $field ( @{ $schema->all_fields } ) { + my $type = $schema->fetch_type($field); + next unless 
$type->sortable; + push @sortable, $field; +}</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>FieldType is an abstract class defining a set of traits and behaviors which may be associated with one or more field names.</p> + +<p>Properties which are common to all field types include <code>boost</code>, +<code>indexed</code>, +<code>stored</code>, +<code>sortable</code>, +<code>binary</code>, +and <code>similarity</code>.</p> + +<p>The <code>boost</code> property is a floating point scoring multiplier which defaults to 1.0. +Values greater than 1.0 cause the field to contribute more to a document’s score, +lower values, +less.</p> + +<p>The <code>indexed</code> property indicates whether the field should be indexed (so that it can be searched).</p> + +<p>The <code>stored</code> property indicates whether to store the raw field value, +so that it can be retrieved when a document turns up in a search.</p> + +<p>The <code>sortable</code> property indicates whether search results should be sortable based on the contents of the field.</p> + +<p>The <code>binary</code> property indicates whether the field contains binary or text data. +Unlike most other properties, +<code>binary</code> is not settable.</p> + +<p>The <code>similarity</code> property is a <a href="../../Lucy/Index/Similarity.html" class="podlinkpod" +>Similarity</a> object which defines matching and scoring behavior for the field. 
+It is required if the field is <code>indexed</code>.</p> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="set_boost" +>set_boost</a></h3> + +<pre>$field_type->set_boost($boost);</pre> + +<p>Setter for <code>boost</code>.</p> + +<h3><a class='u' +name="get_boost" +>get_boost</a></h3> + +<pre>my $float = $field_type->get_boost();</pre> + +<p>Accessor for <code>boost</code>.</p> + +<h3><a class='u' +name="set_indexed" +>set_indexed</a></h3> + +<pre>$field_type->set_indexed($indexed);</pre> + +<p>Setter for <code>indexed</code>.</p> + +<h3><a class='u' +name="indexed" +>indexed</a></h3> + +<pre>my $bool = $field_type->indexed();</pre> + +<p>Accessor for <code>indexed</code>.</p> + +<h3><a class='u' +name="set_stored" +>set_stored</a></h3> + +<pre>$field_type->set_stored($stored);</pre> + +<p>Setter for <code>stored</code>.</p> + +<h3><a class='u' +name="stored" +>stored</a></h3> + +<pre>my $bool = $field_type->stored();</pre> + +<p>Accessor for <code>stored</code>.</p> + +<h3><a class='u' +name="set_sortable" +>set_sortable</a></h3> + +<pre>$field_type->set_sortable($sortable);</pre> + +<p>Setter for <code>sortable</code>.</p> + +<h3><a class='u' +name="sortable" +>sortable</a></h3> + +<pre>my $bool = $field_type->sortable();</pre> + +<p>Accessor for <code>sortable</code>.</p> + +<h3><a class='u' +name="binary" +>binary</a></h3> + +<pre>my $bool = $field_type->binary();</pre> + +<p>Indicate whether the field contains binary data.</p> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Plan::FieldType isa Clownfish::Obj.</p> + +</div>