Copied: lucy/site/trunk/content/docs/0.5.0/perl/Clownfish/Docs/WritingClasses.mdtext (from r1762634, lucy/site/trunk/content/docs/perl/Clownfish/Docs/WritingClasses.mdtext) URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/perl/Clownfish/Docs/WritingClasses.mdtext?p2=lucy/site/trunk/content/docs/0.5.0/perl/Clownfish/Docs/WritingClasses.mdtext&p1=lucy/site/trunk/content/docs/perl/Clownfish/Docs/WritingClasses.mdtext&r1=1762634&r2=1762636&rev=1762636&view=diff ============================================================================== (empty)
Added: lucy/site/trunk/content/docs/0.5.0/perl/Clownfish/Err.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/perl/Clownfish/Err.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/perl/Clownfish/Err.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/perl/Clownfish/Err.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,73 @@ +Title: Clownfish::Err – Apache Clownfish Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Clownfish::Err - Exception.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>package MyErr; +use base qw( Clownfish::Err ); + +... + +package main; +use Scalar::Util qw( blessed ); +while (1) { + eval { + do_stuff() or MyErr->throw("retry"); + }; + if ( blessed($@) and $@->isa("MyErr") ) { + warn "Retrying...\n"; + } + else { + # Re-throw. + die "do_stuff() died: $@"; + } +}</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>Clownfish::Err is the base class for exceptions in the Clownfish object hierarchy.</p> + +<p>The Err module also provides access to a per-thread Err shared variable via set_error() and get_error(). 
+It may be used to store an Err object temporarily, +so that calling code may choose how to handle a particular error condition.</p> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="cat_mess" +>cat_mess</a></h3> + +<pre>$err->cat_mess($mess);</pre> + +<p>Concatenate the supplied argument onto the error message.</p> + +<h3><a class='u' +name="get_mess" +>get_mess</a></h3> + +<pre>my $string = $err->get_mess();</pre> + +<p>Return the error message.</p> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Clownfish::Err isa <a href="../Clownfish/Obj.html" class="podlinkpod" +>Clownfish::Obj</a>.</p> + +</div> Added: lucy/site/trunk/content/docs/0.5.0/perl/Clownfish/Float.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/perl/Clownfish/Float.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/perl/Clownfish/Float.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/perl/Clownfish/Float.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,91 @@ +Title: Clownfish::Float – Apache Clownfish Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Clownfish::Float - Immutable double precision floating point number.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>my $float = Clownfish::Float->new(2.5); +my $value = $float->get_value;</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<h2><a class='u' +name="CONSTRUCTORS" +>CONSTRUCTORS</a></h2> + +<h3><a class='u' +name="new" +>new</a></h3> + +<pre>my $float = Clownfish::Float->new($value);</pre> + +<p>Return a new Float.</p> + +<ul> +<li><b>value</b> - Initial value.</li> +</ul> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="get_value" +>get_value</a></h3> + +<pre>my $result = $float->get_value();</pre> + +<p>Return the 
value of the Float.</p> + +<h3><a class='u' +name="to_i64" +>to_i64</a></h3> + +<pre>my $int = $float->to_i64();</pre> + +<p>Convert the Float to an integer, +truncating toward zero. +Throw an exception if the value is out of the range of an <code>int64_t</code>.</p> + +<h3><a class='u' +name="compare_to" +>compare_to</a></h3> + +<pre>my $int = $float->compare_to($other);</pre> + +<p>Indicate whether one number is less than, +equal to, +or greater than another. +Throws an exception if <code>other</code> is neither a Float nor an Integer.</p> + +<p>Returns: 0 if the numbers are equal, +a negative number if <code>self</code> is less than <code>other</code>, +and a positive number if <code>self</code> is greater than <code>other</code>.</p> + +<h3><a class='u' +name="clone" +>clone</a></h3> + +<pre>my $result = $float->clone();</pre> + +<p>Return a clone of the object.</p> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Clownfish::Float isa <a href="../Clownfish/Obj.html" class="podlinkpod" +>Clownfish::Obj</a>.</p> + +</div> Added: lucy/site/trunk/content/docs/0.5.0/perl/Clownfish/Hash.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/perl/Clownfish/Hash.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/perl/Clownfish/Hash.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/perl/Clownfish/Hash.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,124 @@ +Title: Clownfish::Hash – Apache Clownfish Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Clownfish::Hash - Hashtable.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>my $hash = Clownfish::Hash->new; +$hash->store($key, $value); +my $value = $hash->fetch($key);</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>Values are stored by reference and may be any kind of 
Obj.</p> + +<h2><a class='u' +name="CONSTRUCTORS" +>CONSTRUCTORS</a></h2> + +<h3><a class='u' +name="new" +>new</a></h3> + +<pre>my $hash = Clownfish::Hash->new( + capacity => $capacity # default: 0 +);</pre> + +<p>Return a new Hash.</p> + +<ul> +<li><b>capacity</b> - The number of elements that the hash will be asked to hold initially.</li> +</ul> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="clear" +>clear</a></h3> + +<pre>$hash->clear();</pre> + +<p>Empty the hash of all key-value pairs.</p> + +<h3><a class='u' +name="store" +>store</a></h3> + +<pre>$hash->store($key, $value);</pre> + +<p>Store a key-value pair.</p> + +<h3><a class='u' +name="fetch" +>fetch</a></h3> + +<pre>my $obj = $hash->fetch($key);</pre> + +<p>Fetch the value associated with <code>key</code>.</p> + +<p>Returns: the value, +or undef if <code>key</code> is not present.</p> + +<h3><a class='u' +name="delete" +>delete</a></h3> + +<pre>my $obj = $hash->delete($key);</pre> + +<p>Attempt to delete a key-value pair from the hash.</p> + +<p>Returns: the value if <code>key</code> exists and thus deletion succeeds; otherwise undef.</p> + +<h3><a class='u' +name="has_key" +>has_key</a></h3> + +<pre>my $bool = $hash->has_key($key);</pre> + +<p>Indicate whether the supplied <code>key</code> is present.</p> + +<h3><a class='u' +name="keys" +>keys</a></h3> + +<pre>my $arrayref = $hash->keys();</pre> + +<p>Return the Hash’s keys.</p> + +<h3><a class='u' +name="values" +>values</a></h3> + +<pre>my $arrayref = $hash->values();</pre> + +<p>Return the Hash’s values.</p> + +<h3><a class='u' +name="get_size" +>get_size</a></h3> + +<pre>my $int = $hash->get_size();</pre> + +<p>Return the number of key-value pairs.</p> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Clownfish::Hash isa <a href="../Clownfish/Obj.html" class="podlinkpod" +>Clownfish::Obj</a>.</p> + +</div> Added: lucy/site/trunk/content/docs/0.5.0/perl/Clownfish/HashIterator.mdtext URL: 
http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/perl/Clownfish/HashIterator.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/perl/Clownfish/HashIterator.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/perl/Clownfish/HashIterator.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,80 @@ +Title: Clownfish::HashIterator – Apache Clownfish Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Clownfish::HashIterator - Hashtable Iterator.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>my $iter = Clownfish::HashIterator->new($hash); +while ($iter->next) { + my $key = $iter->get_key; + my $value = $iter->get_value; +}</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<h2><a class='u' +name="CONSTRUCTORS" +>CONSTRUCTORS</a></h2> + +<h3><a class='u' +name="new" +>new</a></h3> + +<pre>my $iter = Clownfish::HashIterator->new($hash);</pre> + +<p>Return a HashIterator for <code>hash</code>.</p> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="next" +>next</a></h3> + +<pre>my $bool = $hash_iterator->next();</pre> + +<p>Advance the iterator to the next key-value pair.</p> + +<p>Returns: true if there’s another key-value pair, +false if the iterator is exhausted.</p> + +<h3><a class='u' +name="get_key" +>get_key</a></h3> + +<pre>my $string = $hash_iterator->get_key();</pre> + +<p>Return the key of the current key-value pair. +It’s not allowed to call this method before <a href="#next" class="podlinkpod" +>next()</a> was called for the first time or after the iterator was exhausted.</p> + +<h3><a class='u' +name="get_value" +>get_value</a></h3> + +<pre>my $obj = $hash_iterator->get_value();</pre> + +<p>Return the value of the current key-value pair. 
+It’s not allowed to call this method before <a href="#next" class="podlinkpod" +>next()</a> was called for the first time or after the iterator was exhausted.</p> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Clownfish::HashIterator isa <a href="../Clownfish/Obj.html" class="podlinkpod" +>Clownfish::Obj</a>.</p> + +</div> Added: lucy/site/trunk/content/docs/0.5.0/perl/Clownfish/Integer.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/perl/Clownfish/Integer.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/perl/Clownfish/Integer.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/perl/Clownfish/Integer.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,89 @@ +Title: Clownfish::Integer – Apache Clownfish Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Clownfish::Integer - Immutable 64-bit signed integer.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>my $integer = Clownfish::Integer->new(7); +my $value = $integer->get_value;</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<h2><a class='u' +name="CONSTRUCTORS" +>CONSTRUCTORS</a></h2> + +<h3><a class='u' +name="new" +>new</a></h3> + +<pre>my $integer = Clownfish::Integer->new($value);</pre> + +<p>Return a new Integer.</p> + +<ul> +<li><b>value</b> - Initial value.</li> +</ul> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="get_value" +>get_value</a></h3> + +<pre>my $int = $integer->get_value();</pre> + +<p>Return the value of the Integer.</p> + +<h3><a class='u' +name="to_f64" +>to_f64</a></h3> + +<pre>my $float = $integer->to_f64();</pre> + +<p>Convert the Integer to floating point.</p> + +<h3><a class='u' +name="compare_to" +>compare_to</a></h3> + +<pre>my $int = $integer->compare_to($other);</pre> + +<p>Indicate whether 
one number is less than, +equal to, +or greater than another. +Throws an exception if <code>other</code> is neither an Integer nor a Float.</p> + +<p>Returns: 0 if the numbers are equal, +a negative number if <code>self</code> is less than <code>other</code>, +and a positive number if <code>self</code> is greater than <code>other</code>.</p> + +<h3><a class='u' +name="clone" +>clone</a></h3> + +<pre>my $result = $integer->clone();</pre> + +<p>Return a clone of the object.</p> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Clownfish::Integer isa <a href="../Clownfish/Obj.html" class="podlinkpod" +>Clownfish::Obj</a>.</p> + +</div> Added: lucy/site/trunk/content/docs/0.5.0/perl/Clownfish/Obj.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/perl/Clownfish/Obj.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/perl/Clownfish/Obj.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/perl/Clownfish/Obj.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,163 @@ +Title: Clownfish::Obj – Apache Clownfish Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Clownfish::Obj - Base class for all objects.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>package MyObj; +use base qw( Clownfish::Obj ); + +# Inside-out member var. 
+my %foo; + +sub new { + my ( $class, %args ) = @_; + my $foo = delete $args{foo}; + my $self = $class->SUPER::new(%args); + $foo{$$self} = $foo; + return $self; +} + +sub get_foo { + my $self = shift; + return $foo{$$self}; +} + +sub DESTROY { + my $self = shift; + delete $foo{$$self}; + $self->SUPER::DESTROY; +}</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>Clownfish::Obj is the base class of the Clownfish object hierarchy.</p> + +<p>From the standpoint of a Perl programmer, +all classes are implemented as blessed scalar references, +with the scalar storing a pointer to a C struct.</p> + +<h3><a class='u' +name="Subclassing" +>Subclassing</a></h3> + +<p>The recommended way to subclass Clownfish::Obj and its descendants is to use the inside-out design pattern. +(See <a href="../Class/InsideOut.html" class="podlinkpod" +>Class::InsideOut</a> for an introduction to inside-out techniques.)</p> + +<p>Since the blessed scalar stores a C pointer value which is unique per-object, +<code>$$self</code> can be used as an inside-out ID.</p> + +<pre># Accessor for 'foo' member variable. +sub get_foo { + my $self = shift; + return $foo{$$self}; +}</pre> + +<p>Caveats:</p> + +<ul> +<li>Inside-out aficionados will have noted that the "cached scalar id" stratagem recommended above isn't compatible with ithreads.</li> + +<li>Overridden methods must not return undef unless the API specifies that returning undef is permissible.</li> +</ul> + +<h2><a class='u' +name="CONSTRUCTOR" +>CONSTRUCTOR</a></h2> + +<h3><a class='u' +name="new" +>new</a></h3> + +<pre>my $self = $class->SUPER::new;</pre> + +<p>Abstract constructor -- must be invoked via a subclass. 
+Attempting to instantiate objects of class "Clownfish::Obj" directly causes an error.</p> + +<p>Takes no arguments; if any are supplied, +an error will be reported.</p> + +<h2><a class='u' +name="ABSTRACT_METHODS" +>ABSTRACT METHODS</a></h2> + +<h3><a class='u' +name="clone" +>clone</a></h3> + +<pre>my $result = $obj->clone();</pre> + +<p>Return a clone of the object.</p> + +<h3><a class='u' +name="compare_to" +>compare_to</a></h3> + +<pre>my $int = $obj->compare_to($other);</pre> + +<p>Indicate whether one object is less than, +equal to, +or greater than another.</p> + +<ul> +<li><b>other</b> - Another Obj.</li> +</ul> + +<p>Returns: 0 if the objects are equal, +a negative number if <code>self</code> is less than <code>other</code>, +and a positive number if <code>self</code> is greater than <code>other</code>.</p> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="to_perl" +>to_perl</a></h3> + +<pre>my $native = $obj->to_perl;</pre> + +<p>Tries to convert the object to its native Perl representation.</p> + +<h3><a class='u' +name="equals" +>equals</a></h3> + +<pre>my $bool = $obj->equals($other);</pre> + +<p>Indicate whether two objects are the same. 
+By default, +compares the memory address.</p> + +<ul> +<li><b>other</b> - Another Obj.</li> +</ul> + +<h3><a class='u' +name="DESTROY" +>DESTROY</a></h3> + +<p>All Clownfish classes implement a DESTROY method; if you override it in a subclass, +you must call <code>$self->SUPER::DESTROY</code> to avoid leaking memory.</p> + +<h3><a class='u' +name="to_string" +>to_string</a></h3> + +<pre>my $string = $obj->to_string();</pre> + +<p>Generic stringification: “ClassName@hex_mem_address”.</p> + +</div> Added: lucy/site/trunk/content/docs/0.5.0/perl/Clownfish/String.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/perl/Clownfish/String.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/perl/Clownfish/String.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/perl/Clownfish/String.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,251 @@ +Title: Clownfish::String – Apache Clownfish Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Clownfish::String - Immutable string holding Unicode characters.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>my $string = Clownfish::String->new('abc'); +print $string->to_perl, "\n";</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<h2><a class='u' +name="CONSTRUCTORS" +>CONSTRUCTORS</a></h2> + +<h3><a class='u' +name="new" +>new</a></h3> + +<pre>my $string = Clownfish::String->new($perl_string);</pre> + +<p>Return a String containing the passed-in Perl string.</p> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="cat" +>cat</a></h3> + +<pre>my $result = $string->cat($other);</pre> + +<p>Return the concatenation of the String and <code>other</code>.</p> + +<h3><a class='u' +name="to_i64" +>to_i64</a></h3> + +<pre>my $int = $string->to_i64();</pre> + +<p>Extract a 64-bit integer 
from a decimal string. +See <a href="#basex_to_i64" class="podlinkpod" +>basex_to_i64()</a> for details.</p> + +<h3><a class='u' +name="basex_to_i64" +>basex_to_i64</a></h3> + +<pre>my $int = $string->basex_to_i64($base);</pre> + +<p>Extract a 64-bit integer from a variable-base stringified version. +Expects an optional minus sign followed by base-x digits, +stopping at any non-digit character. +Returns zero if no digits are found. +If the value exceeds the range of an <code>int64_t</code>, +the result is undefined.</p> + +<ul> +<li><b>base</b> - A base between 2 and 36.</li> +</ul> + +<h3><a class='u' +name="to_f64" +>to_f64</a></h3> + +<pre>my $float = $string->to_f64();</pre> + +<p>Convert a string to a floating-point number using the C library function <code>strtod</code>.</p> + +<h3><a class='u' +name="starts_with" +>starts_with</a></h3> + +<pre>my $bool = $string->starts_with($prefix);</pre> + +<p>Test whether the String starts with <code>prefix</code>.</p> + +<h3><a class='u' +name="ends_with" +>ends_with</a></h3> + +<pre>my $bool = $string->ends_with($suffix);</pre> + +<p>Test whether the String ends with <code>suffix</code>.</p> + +<h3><a class='u' +name="contains" +>contains</a></h3> + +<pre>my $bool = $string->contains($substring);</pre> + +<p>Test whether the String contains <code>substring</code>.</p> + +<h3><a class='u' +name="find" +>find</a></h3> + +<pre>my $string_iterator = $string->find($substring);</pre> + +<p>Return a <a href="../Clownfish/StringIterator.html" class="podlinkpod" +>StringIterator</a> pointing to the first occurrence of <code>substring</code> within the String, +or undef if the substring does not match.</p> + +<h3><a class='u' +name="length" +>length</a></h3> + +<pre>my $int = $string->length();</pre> + +<p>Return the number of Unicode code points the String contains.</p> + +<h3><a class='u' +name="get_size" +>get_size</a></h3> + +<pre>my $int = $string->get_size();</pre> + +<p>Return the number of bytes occupied by the String’s 
internal content.</p> + +<h3><a class='u' +name="to_bytebuf" +>to_bytebuf</a></h3> + +<pre>my $byte_buf = $string->to_bytebuf();</pre> + +<p>Return a ByteBuf which holds a copy of the String.</p> + +<h3><a class='u' +name="clone" +>clone</a></h3> + +<pre>my $result = $string->clone();</pre> + +<p>Return a clone of the object.</p> + +<h3><a class='u' +name="compare_to" +>compare_to</a></h3> + +<pre>my $int = $string->compare_to($other);</pre> + +<p>Indicate whether one String is less than, +equal to, +or greater than another. +The Unicode code points of the Strings are compared lexicographically. +Throws an exception if <code>other</code> is not a String.</p> + +<p>Returns: 0 if the Strings are equal, +a negative number if <code>self</code> is less than <code>other</code>, +and a positive number if <code>self</code> is greater than <code>other</code>.</p> + +<h3><a class='u' +name="trim" +>trim</a></h3> + +<pre>my $result = $string->trim();</pre> + +<p>Return a copy of the String with Unicode whitespace characters removed from both top and tail. +Whitespace is any character that has the Unicode property <code>White_Space</code>.</p> + +<h3><a class='u' +name="trim_top" +>trim_top</a></h3> + +<pre>my $result = $string->trim_top();</pre> + +<p>Return a copy of the String with leading Unicode whitespace removed. +Whitespace is any character that has the Unicode property <code>White_Space</code>.</p> + +<h3><a class='u' +name="trim_tail" +>trim_tail</a></h3> + +<pre>my $result = $string->trim_tail();</pre> + +<p>Return a copy of the String with trailing Unicode whitespace removed. +Whitespace is any character that has the Unicode property <code>White_Space</code>.</p> + +<h3><a class='u' +name="code_point_at" +>code_point_at</a></h3> + +<pre>my $int = $string->code_point_at($tick);</pre> + +<p>Return the Unicode code point located <code>tick</code> code points in from the top. 
+Return <code>CFISH_STR_OOB</code> if out of bounds.</p> + +<h3><a class='u' +name="code_point_from" +>code_point_from</a></h3> + +<pre>my $int = $string->code_point_from($tick);</pre> + +<p>Return the Unicode code point located <code>tick</code> code points counting backwards from the end. +Return <code>CFISH_STR_OOB</code> if out of bounds.</p> + +<h3><a class='u' +name="substring" +>substring</a></h3> + +<pre>my $result = $string->substring( + offset => $offset, # required + length => $length, # required +);</pre> + +<p>Return a new substring containing a copy of the specified range.</p> + +<ul> +<li><b>offset</b> - Offset from the top, +in code points.</li> + +<li><b>length</b> - The desired length of the substring, +in code points.</li> +</ul> + +<h3><a class='u' +name="top" +>top</a></h3> + +<pre>my $string_iterator = $string->top();</pre> + +<p>Return an iterator initialized to the start of the string.</p> + +<h3><a class='u' +name="tail" +>tail</a></h3> + +<pre>my $string_iterator = $string->tail();</pre> + +<p>Return an iterator initialized to the end of the string.</p> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Clownfish::String isa <a href="../Clownfish/Obj.html" class="podlinkpod" +>Clownfish::Obj</a>.</p> + +</div> Added: lucy/site/trunk/content/docs/0.5.0/perl/Clownfish/StringIterator.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/perl/Clownfish/StringIterator.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/perl/Clownfish/StringIterator.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/perl/Clownfish/StringIterator.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,171 @@ +Title: Clownfish::StringIterator – Apache Clownfish Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Clownfish::StringIterator - Iterate Unicode code points in a 
String.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>my $iter = $string->top; +while (my $code_point = $iter->next) { + ... +}</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="clone" +>clone</a></h3> + +<pre>my $result = $string_iterator->clone();</pre> + +<p>Return a clone of the object.</p> + +<h3><a class='u' +name="assign" +>assign</a></h3> + +<pre>$string_iterator->assign($other);</pre> + +<p>Assign the source string and current position of <code>other</code> to <code>self</code>.</p> + +<h3><a class='u' +name="compare_to" +>compare_to</a></h3> + +<pre>my $int = $string_iterator->compare_to($other);</pre> + +<p>Indicate whether one StringIterator is less than, +equal to, +or greater than another by comparing their character positions. +Throws an exception if <code>other</code> is not a StringIterator pointing to the same source string as <code>self</code>.</p> + +<p>Returns: 0 if the StringIterators are equal, +a negative number if <code>self</code> is less than <code>other</code>, +and a positive number if <code>self</code> is greater than <code>other</code>.</p> + +<h3><a class='u' +name="has_next" +>has_next</a></h3> + +<pre>my $bool = $string_iterator->has_next();</pre> + +<p>Return true if the iterator is not at the end of the string.</p> + +<h3><a class='u' +name="has_prev" +>has_prev</a></h3> + +<pre>my $bool = $string_iterator->has_prev();</pre> + +<p>Return true if the iterator is not at the start of the string.</p> + +<h3><a class='u' +name="next" +>next</a></h3> + +<pre>my $code_point = $iter->next;</pre> + +<p>Return the code point after the current position and advance the iterator. +Returns undef at the end of the string. 
+Returns zero but true for U+0000.</p> + +<h3><a class='u' +name="prev" +>prev</a></h3> + +<pre>my $code_point = $iter->prev;</pre> + +<p>Return the code point before the current position and go one step back. +Returns undef at the start of the string. +Returns zero but true for U+0000.</p> + +<h3><a class='u' +name="advance" +>advance</a></h3> + +<pre>my $int = $string_iterator->advance($num);</pre> + +<p>Skip code points.</p> + +<ul> +<li><b>num</b> - The number of code points to skip.</li> +</ul> + +<p>Returns: the number of code points actually skipped. +This can be less than the requested number if the end of the string is reached.</p> + +<h3><a class='u' +name="recede" +>recede</a></h3> + +<pre>my $int = $string_iterator->recede($num);</pre> + +<p>Skip code points backward.</p> + +<ul> +<li><b>num</b> - The number of code points to skip.</li> +</ul> + +<p>Returns: the number of code points actually skipped. +This can be less than the requested number if the start of the string is reached.</p> + +<h3><a class='u' +name="skip_whitespace" +>skip_whitespace</a></h3> + +<pre>my $int = $string_iterator->skip_whitespace();</pre> + +<p>Skip whitespace. +Whitespace is any character that has the Unicode property <code>White_Space</code>.</p> + +<p>Returns: the number of code points skipped.</p> + +<h3><a class='u' +name="skip_whitespace_back" +>skip_whitespace_back</a></h3> + +<pre>my $int = $string_iterator->skip_whitespace_back();</pre> + +<p>Skip whitespace backward. 
+Whitespace is any character that has the Unicode property <code>White_Space</code>.</p> + +<p>Returns: the number of code points skipped.</p> + +<h3><a class='u' +name="starts_with" +>starts_with</a></h3> + +<pre>my $bool = $string_iterator->starts_with($prefix);</pre> + +<p>Test whether the content after the iterator starts with <code>prefix</code>.</p> + +<h3><a class='u' +name="ends_with" +>ends_with</a></h3> + +<pre>my $bool = $string_iterator->ends_with($suffix);</pre> + +<p>Test whether the content before the iterator ends with <code>suffix</code>.</p> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Clownfish::StringIterator isa <a href="../Clownfish/Obj.html" class="podlinkpod" +>Clownfish::Obj</a>.</p> + +</div> Added: lucy/site/trunk/content/docs/0.5.0/perl/Clownfish/Vector.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/perl/Clownfish/Vector.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/perl/Clownfish/Vector.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/perl/Clownfish/Vector.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,207 @@ +Title: Clownfish::Vector – Apache Clownfish Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Clownfish::Vector - Variable-sized array.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>my $vector = Clownfish::Vector->new; +$vector->store($tick, $value); +my $value = $vector->fetch($tick);</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<h2><a class='u' +name="CONSTRUCTORS" +>CONSTRUCTORS</a></h2> + +<h3><a class='u' +name="new" +>new</a></h3> + +<pre>my $vector = Clownfish::Vector->new( + capacity => $capacity # default: 0 +);</pre> + +<p>Return a new Vector.</p> + +<ul> +<li><b>capacity</b> - Initial number of elements that the object will be able to hold 
before reallocation.</li> +</ul> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="push" +>push</a></h3> + +<pre>$vector->push($element); +$vector->push(); # default: undef</pre> + +<p>Push an item onto the end of a Vector.</p> + +<h3><a class='u' +name="push_all" +>push_all</a></h3> + +<pre>$vector->push_all($other);</pre> + +<p>Push all the elements of another Vector onto the end of this one.</p> + +<h3><a class='u' +name="pop" +>pop</a></h3> + +<pre>my $obj = $vector->pop();</pre> + +<p>Pop an item off of the end of a Vector.</p> + +<p>Returns: the element or undef if the Vector is empty.</p> + +<h3><a class='u' +name="insert" +>insert</a></h3> + +<pre>$vector->insert( + tick => $tick, # required + element => $element, # default: undef +);</pre> + +<p>Insert an element at <code>tick</code>, moving the following elements.</p> + +<h3><a class='u' +name="insert_all" +>insert_all</a></h3> + +<pre>$vector->insert_all( + tick => $tick, # required + other => $other, # required +);</pre> + +<p>Insert the elements of the <code>other</code> Vector at <code>tick</code>, moving the following elements.</p> + +<h3><a class='u' +name="fetch" +>fetch</a></h3> + +<pre>my $obj = $vector->fetch($tick);</pre> + +<p>Fetch the element at <code>tick</code>.</p> + +<p>Returns: the element or undef if <code>tick</code> is out of bounds.</p> + +<h3><a class='u' +name="store" +>store</a></h3> + +<pre>$vector->store($tick, $elem);</pre> + +<p>Store an element at index <code>tick</code>, +possibly displacing an existing element.</p> + +<h3><a class='u' +name="delete" +>delete</a></h3> + +<pre>my $obj = $vector->delete($tick);</pre> + +<p>Replace an element in the Vector with undef and return it.</p> + +<p>Returns: the element stored at <code>tick</code> or undef if <code>tick</code> is out of bounds.</p> + +<h3><a class='u' +name="excise" +>excise</a></h3> + +<pre>$vector->excise( + offset => $offset, # required + length => $length, # required +);</pre> + +<p>Remove 
<code>length</code> elements from the Vector, +starting at <code>offset</code>. +Move elements over to fill in the gap.</p> + +<h3><a class='u' +name="clone" +>clone</a></h3> + +<pre>my $arrayref = $vector->clone();</pre> + +<p>Clone the Vector but merely increment the refcounts of its elements rather than clone them.</p> + +<h3><a class='u' +name="sort" +>sort</a></h3> + +<pre>$vector->sort();</pre> + +<p>Sort the Vector. +Sort order is guaranteed to be <i>stable</i>: the relative order of elements which compare as equal will not change.</p> + +<h3><a class='u' +name="resize" +>resize</a></h3> + +<pre>$vector->resize($size);</pre> + +<p>Set the size for the Vector. +If the new size is larger than the current size, +grow the object to accommodate undef elements; if smaller than the current size, +decrement and discard truncated elements.</p> + +<h3><a class='u' +name="clear" +>clear</a></h3> + +<pre>$vector->clear();</pre> + +<p>Empty the Vector.</p> + +<h3><a class='u' +name="get_size" +>get_size</a></h3> + +<pre>my $int = $vector->get_size();</pre> + +<p>Return the size of the Vector.</p> + +<h3><a class='u' +name="slice" +>slice</a></h3> + +<pre>my $arrayref = $vector->slice( + offset => $offset, # required + length => $length, # required +);</pre> + +<p>Return a slice of the Vector consisting of elements from a contiguous range. 
+If the specified range is out of bounds, +return a slice with fewer elements – potentially none.</p> + +<ul> +<li><b>offset</b> - The index of the element to start at.</li> + +<li><b>length</b> - The maximum number of elements to slice.</li> +</ul> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Clownfish::Vector isa <a href="../Clownfish/Obj.html" class="podlinkpod" +>Clownfish::Obj</a>.</p> + +</div> Copied: lucy/site/trunk/content/docs/0.5.0/perl/Lucy.mdtext (from r1762634, lucy/site/trunk/content/docs/perl/Lucy.mdtext) URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/perl/Lucy.mdtext?p2=lucy/site/trunk/content/docs/0.5.0/perl/Lucy.mdtext&p1=lucy/site/trunk/content/docs/perl/Lucy.mdtext&r1=1762634&r2=1762636&rev=1762636&view=diff ============================================================================== (empty) Added: lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Analysis/Analyzer.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Analysis/Analyzer.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Analysis/Analyzer.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Analysis/Analyzer.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,143 @@ +Title: Lucy::Analysis::Analyzer – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Analysis::Analyzer - Tokenize/modify/filter text.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre># Abstract base class.</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>An Analyzer is a filter which processes text, +transforming it from one form into another. 
+For instance, +an analyzer might break up a long text into smaller pieces (<a href="../../Lucy/Analysis/RegexTokenizer.html" class="podlinkpod" +>RegexTokenizer</a>), +or it might perform case folding to facilitate case-insensitive search (<a href="../../Lucy/Analysis/Normalizer.html" class="podlinkpod" +>Normalizer</a>).</p> + +<h2><a class='u' +name="CONSTRUCTORS" +>CONSTRUCTORS</a></h2> + +<h3><a class='u' +name="new" +>new</a></h3> + +<pre>package MyAnalyzer; +use base qw( Lucy::Analysis::Analyzer ); +our %foo; +sub new { + my $self = shift->SUPER::new; + my %args = @_; + $foo{$$self} = $args{foo}; + return $self; +}</pre> + +<p>Abstract constructor. +Takes no arguments.</p> + +<h2><a class='u' +name="ABSTRACT_METHODS" +>ABSTRACT METHODS</a></h2> + +<h3><a class='u' +name="transform" +>transform</a></h3> + +<pre>my $inversion = $analyzer->transform($inversion);</pre> + +<p>Take a single <a href="../../Lucy/Analysis/Inversion.html" class="podlinkpod" +>Inversion</a> as input and return an Inversion, +either the same one (presumably transformed in some way), +or a new one.</p> + +<ul> +<li><b>inversion</b> - An inversion.</li> +</ul> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="transform_text" +>transform_text</a></h3> + +<pre>my $inversion = $analyzer->transform_text($text);</pre> + +<p>Kick off an analysis chain, +creating an Inversion from string input. 
+The default implementation simply creates an initial Inversion with a single Token, +then calls <a href="#transform" class="podlinkpod" +>transform()</a>, +but occasionally subclasses will provide an optimized implementation which minimizes string copies.</p> + +<ul> +<li><b>text</b> - A string.</li> +</ul> + +<h3><a class='u' +name="split" +>split</a></h3> + +<pre>my $arrayref = $analyzer->split($text);</pre> + +<p>Analyze text and return an array of token texts.</p> + +<ul> +<li><b>text</b> - A string.</li> +</ul> + +<h3><a class='u' +name="dump" +>dump</a></h3> + +<pre>my $obj = $analyzer->dump();</pre> + +<p>Dump the analyzer as a hash.</p> + +<p>Subclasses should call <a href="#dump" class="podlinkpod" +>dump()</a> on the superclass. +The returned object is a hash which should be populated with parameters of the analyzer.</p> + +<p>Returns: A hash containing a description of the analyzer.</p> + +<h3><a class='u' +name="load" +>load</a></h3> + +<pre>my $obj = $analyzer->load($dump);</pre> + +<p>Reconstruct an analyzer from a dump.</p> + +<p>Subclasses should first call <a href="#load" class="podlinkpod" +>load()</a> on the superclass. 
+The returned object is an analyzer which should be reconstructed by setting the dumped parameters from the hash contained in <code>dump</code>.</p> + +<p>Note that the invocant analyzer is unused.</p> + +<ul> +<li><b>dump</b> - A hash.</li> +</ul> + +<p>Returns: An analyzer.</p> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Analysis::Analyzer isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Analysis/CaseFolder.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Analysis/CaseFolder.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Analysis/CaseFolder.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Analysis/CaseFolder.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,73 @@ +Title: Lucy::Analysis::CaseFolder – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Analysis::CaseFolder - Normalize case, +facilitating case-insensitive search.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>my $case_folder = Lucy::Analysis::CaseFolder->new; + +my $polyanalyzer = Lucy::Analysis::PolyAnalyzer->new( + analyzers => [ $tokenizer, $case_folder, $stemmer ], +);</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>CaseFolder is DEPRECATED. +Use <a href="../../Lucy/Analysis/Normalizer.html" class="podlinkpod" +>Normalizer</a> instead.</p> + +<p>CaseFolder normalizes text according to Unicode case-folding rules, +so that searches will be case-insensitive.</p> + +<h2><a class='u' +name="CONSTRUCTORS" +>CONSTRUCTORS</a></h2> + +<h3><a class='u' +name="new" +>new</a></h3> + +<pre>my $case_folder = Lucy::Analysis::CaseFolder->new;</pre> + +<p>Constructor. 
+Takes no arguments.</p> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="transform" +>transform</a></h3> + +<pre>my $inversion = $case_folder->transform($inversion);</pre> + +<p>Take a single <a href="../../Lucy/Analysis/Inversion.html" class="podlinkpod" +>Inversion</a> as input and return an Inversion, +either the same one (presumably transformed in some way), +or a new one.</p> + +<ul> +<li><b>inversion</b> - An inversion.</li> +</ul> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Analysis::CaseFolder isa <a href="../../Lucy/Analysis/Analyzer.html" class="podlinkpod" +>Lucy::Analysis::Analyzer</a> isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Analysis/EasyAnalyzer.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Analysis/EasyAnalyzer.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Analysis/EasyAnalyzer.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Analysis/EasyAnalyzer.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,99 @@ +Title: Lucy::Analysis::EasyAnalyzer – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Analysis::EasyAnalyzer - A simple analyzer chain.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>my $schema = Lucy::Plan::Schema->new; +my $analyzer = Lucy::Analysis::EasyAnalyzer->new( + language => 'en', +); +my $type = Lucy::Plan::FullTextType->new( + analyzer => $analyzer, +); +$schema->spec_field( name => 'title', type => $type ); +$schema->spec_field( name => 'content', type => $type );</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>EasyAnalyzer is an analyzer chain consisting of a <a href="../../Lucy/Analysis/StandardTokenizer.html" class="podlinkpod" 
+>StandardTokenizer</a>, +a <a href="../../Lucy/Analysis/Normalizer.html" class="podlinkpod" +>Normalizer</a>, +and a <a href="../../Lucy/Analysis/SnowballStemmer.html" class="podlinkpod" +>SnowballStemmer</a>.</p> + +<p>Supported languages:</p> + +<pre>en => English, +da => Danish, +de => German, +es => Spanish, +fi => Finnish, +fr => French, +hu => Hungarian, +it => Italian, +nl => Dutch, +no => Norwegian, +pt => Portuguese, +ro => Romanian, +ru => Russian, +sv => Swedish, +tr => Turkish,</pre> + +<h2><a class='u' +name="CONSTRUCTORS" +>CONSTRUCTORS</a></h2> + +<h3><a class='u' +name="new" +>new</a></h3> + +<pre>my $analyzer = Lucy::Analysis::EasyAnalyzer->new( + language => 'es', +);</pre> + +<p>Create a new EasyAnalyzer.</p> + +<ul> +<li><b>language</b> - An ISO code from the list of supported languages.</li> +</ul> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="transform" +>transform</a></h3> + +<pre>my $inversion = $easy_analyzer->transform($inversion);</pre> + +<p>Take a single <a href="../../Lucy/Analysis/Inversion.html" class="podlinkpod" +>Inversion</a> as input and return an Inversion, +either the same one (presumably transformed in some way), +or a new one.</p> + +<ul> +<li><b>inversion</b> - An inversion.</li> +</ul> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Analysis::EasyAnalyzer isa <a href="../../Lucy/Analysis/Analyzer.html" class="podlinkpod" +>Lucy::Analysis::Analyzer</a> isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Analysis/Inversion.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Analysis/Inversion.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Analysis/Inversion.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Analysis/Inversion.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,87 @@ 
+Title: Lucy::Analysis::Inversion – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Analysis::Inversion - A collection of Tokens.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>my $result = Lucy::Analysis::Inversion->new; + +while (my $token = $inversion->next) { + $result->append($token); +}</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>An Inversion is a collection of Token objects which you can add to, +then iterate over.</p> + +<h2><a class='u' +name="CONSTRUCTORS" +>CONSTRUCTORS</a></h2> + +<h3><a class='u' +name="new" +>new</a></h3> + +<pre>my $inversion = Lucy::Analysis::Inversion->new( + $seed, # optional +);</pre> + +<p>Create a new Inversion.</p> + +<ul> +<li><b>seed</b> - An initial Token to start things off, +which may be undef.</li> +</ul> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="append" +>append</a></h3> + +<pre>$inversion->append($token);</pre> + +<p>Tack a token onto the end of the Inversion.</p> + +<ul> +<li><b>token</b> - A Token.</li> +</ul> + +<h3><a class='u' +name="next" +>next</a></h3> + +<pre>my $token = $inversion->next();</pre> + +<p>Return the next token in the Inversion until out of tokens.</p> + +<h3><a class='u' +name="reset" +>reset</a></h3> + +<pre>$inversion->reset();</pre> + +<p>Reset the Inversion’s iterator, +so that the next call to next() returns the first Token in the inversion.</p> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Analysis::Inversion isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Analysis/Normalizer.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Analysis/Normalizer.mdtext?rev=1762636&view=auto ============================================================================== --- 
lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Analysis/Normalizer.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Analysis/Normalizer.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,92 @@ +Title: Lucy::Analysis::Normalizer – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Analysis::Normalizer - Unicode normalization, +case folding and accent stripping.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>my $normalizer = Lucy::Analysis::Normalizer->new; + +my $polyanalyzer = Lucy::Analysis::PolyAnalyzer->new( + analyzers => [ $tokenizer, $normalizer, $stemmer ], +);</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>Normalizer is an <a href="../../Lucy/Analysis/Analyzer.html" class="podlinkpod" +>Analyzer</a> which normalizes tokens to one of the Unicode normalization forms. +Optionally, +it performs Unicode case folding and converts accented characters to their base character.</p> + +<p>If you use highlighting, +Normalizer should be run after tokenization because it might add or remove characters.</p> + +<h2><a class='u' +name="CONSTRUCTORS" +>CONSTRUCTORS</a></h2> + +<h3><a class='u' +name="new" +>new</a></h3> + +<pre>my $normalizer = Lucy::Analysis::Normalizer->new( + normalization_form => 'NFKC', + case_fold => 1, + strip_accents => 0, +);</pre> + +<p>Create a new Normalizer.</p> + +<ul> +<li><b>normalization_form</b> - Unicode normalization form, +can be one of ‘NFC’, +‘NFKC’, +‘NFD’, +‘NFKD’. 
+Defaults to ‘NFKC’.</li> + +<li><b>case_fold</b> - Perform case folding, +default is true.</li> + +<li><b>strip_accents</b> - Strip accents, +default is false.</li> +</ul> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="transform" +>transform</a></h3> + +<pre>my $inversion = $normalizer->transform($inversion);</pre> + +<p>Take a single <a href="../../Lucy/Analysis/Inversion.html" class="podlinkpod" +>Inversion</a> as input and return an Inversion, +either the same one (presumably transformed in some way), +or a new one.</p> + +<ul> +<li><b>inversion</b> - An inversion.</li> +</ul> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Analysis::Normalizer isa <a href="../../Lucy/Analysis/Analyzer.html" class="podlinkpod" +>Lucy::Analysis::Analyzer</a> isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Analysis/PolyAnalyzer.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Analysis/PolyAnalyzer.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Analysis/PolyAnalyzer.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Analysis/PolyAnalyzer.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,134 @@ +Title: Lucy::Analysis::PolyAnalyzer – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Analysis::PolyAnalyzer - Multiple Analyzers in series.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>my $schema = Lucy::Plan::Schema->new; +my $polyanalyzer = Lucy::Analysis::PolyAnalyzer->new( + analyzers => \@analyzers, +); +my $type = Lucy::Plan::FullTextType->new( + analyzer => $polyanalyzer, +); +$schema->spec_field( name => 'title', type => $type ); +$schema->spec_field( name => 'content', type => $type );</pre> + +<h2><a class='u' 
+name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>A PolyAnalyzer is a series of <a href="../../Lucy/Analysis/Analyzer.html" class="podlinkpod" +>Analyzers</a>, +each of which will be called upon to “analyze” text in turn. +You can either provide the Analyzers yourself, +or you can specify a supported language, +in which case a PolyAnalyzer consisting of a <a href="../../Lucy/Analysis/CaseFolder.html" class="podlinkpod" +>CaseFolder</a>, +a <a href="../../Lucy/Analysis/RegexTokenizer.html" class="podlinkpod" +>RegexTokenizer</a>, +and a <a href="../../Lucy/Analysis/SnowballStemmer.html" class="podlinkpod" +>SnowballStemmer</a> will be generated for you.</p> + +<p>The language parameter is DEPRECATED. +Use <a href="../../Lucy/Analysis/EasyAnalyzer.html" class="podlinkpod" +>EasyAnalyzer</a> instead.</p> + +<p>Supported languages:</p> + +<pre>en => English, +da => Danish, +de => German, +es => Spanish, +fi => Finnish, +fr => French, +hu => Hungarian, +it => Italian, +nl => Dutch, +no => Norwegian, +pt => Portuguese, +ro => Romanian, +ru => Russian, +sv => Swedish, +tr => Turkish,</pre> + +<h2><a class='u' +name="CONSTRUCTORS" +>CONSTRUCTORS</a></h2> + +<h3><a class='u' +name="new" +>new</a></h3> + +<pre>my $tokenizer = Lucy::Analysis::StandardTokenizer->new; +my $normalizer = Lucy::Analysis::Normalizer->new; +my $stemmer = Lucy::Analysis::SnowballStemmer->new( language => 'en' ); +my $polyanalyzer = Lucy::Analysis::PolyAnalyzer->new( + analyzers => [ $tokenizer, $normalizer, $stemmer, ], );</pre> + +<p>Create a new PolyAnalyzer.</p> + +<ul> +<li><b>language</b> - An ISO code from the list of supported languages. +DEPRECATED, +use <a href="../../Lucy/Analysis/EasyAnalyzer.html" class="podlinkpod" +>EasyAnalyzer</a> instead.</li> + +<li><b>analyzers</b> - An array of Analyzers. +The order of the analyzers matters. 
+Don’t put a SnowballStemmer before a RegexTokenizer (can’t stem whole documents or paragraphs – just individual words), +or a SnowballStopFilter after a SnowballStemmer (stemmed words, +e.g. +“themselv”, +will not appear in a stoplist). +In general, +the sequence should be: tokenize, +normalize, +stopalize, +stem.</li> +</ul> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="get_analyzers" +>get_analyzers</a></h3> + +<pre>my $arrayref = $poly_analyzer->get_analyzers();</pre> + +<p>Getter for “analyzers” member.</p> + +<h3><a class='u' +name="transform" +>transform</a></h3> + +<pre>my $inversion = $poly_analyzer->transform($inversion);</pre> + +<p>Take a single <a href="../../Lucy/Analysis/Inversion.html" class="podlinkpod" +>Inversion</a> as input and return an Inversion, +either the same one (presumably transformed in some way), +or a new one.</p> + +<ul> +<li><b>inversion</b> - An inversion.</li> +</ul> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Analysis::PolyAnalyzer isa <a href="../../Lucy/Analysis/Analyzer.html" class="podlinkpod" +>Lucy::Analysis::Analyzer</a> isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Analysis/RegexTokenizer.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Analysis/RegexTokenizer.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Analysis/RegexTokenizer.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Analysis/RegexTokenizer.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,108 @@ +Title: Lucy::Analysis::RegexTokenizer – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Analysis::RegexTokenizer - Split a string into tokens.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>my 
$whitespace_tokenizer + = Lucy::Analysis::RegexTokenizer->new( pattern => '\S+' ); + +# or... +my $word_char_tokenizer + = Lucy::Analysis::RegexTokenizer->new( pattern => '\w+' ); + +# or... +my $apostrophising_tokenizer = Lucy::Analysis::RegexTokenizer->new; + +# Then... once you have a tokenizer, put it into a PolyAnalyzer: +my $polyanalyzer = Lucy::Analysis::PolyAnalyzer->new( + analyzers => [ $word_char_tokenizer, $normalizer, $stemmer ], );</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>Generically, +“tokenizing” is a process of breaking up a string into an array of “tokens”. +For instance, +the string “three blind mice” might be tokenized into “three”, +“blind”, +“mice”.</p> + +<p>Lucy::Analysis::RegexTokenizer decides where it should break up the text based on a regular expression compiled from a supplied <code>pattern</code> matching one token. +If our source string is…</p> + +<pre>"Eats, Shoots and Leaves."</pre> + +<p>… then a “whitespace tokenizer” with a <code>pattern</code> of <code>"\\S+"</code> produces…</p> + +<pre>Eats, +Shoots +and +Leaves.</pre> + +<p>… while a “word character tokenizer” with a <code>pattern</code> of <code>"\\w+"</code> produces…</p> + +<pre>Eats +Shoots +and +Leaves</pre> + +<p>… the difference being that the word character tokenizer skips over punctuation as well as whitespace when determining token boundaries.</p> + +<h2><a class='u' +name="CONSTRUCTORS" +>CONSTRUCTORS</a></h2> + +<h3><a class='u' +name="new" +>new</a></h3> + +<pre>my $word_char_tokenizer = Lucy::Analysis::RegexTokenizer->new( + pattern => '\w+', # required +);</pre> + +<p>Create a new RegexTokenizer.</p> + +<ul> +<li><b>pattern</b> - A string specifying a Perl-syntax regular expression which should match one token. 
+The default value is <code>\w+(?:[\x{2019}']\w+)*</code>, +which matches “it’s” as well as “it” and “O’Henry’s” as well as “Henry”.</li> +</ul> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="transform" +>transform</a></h3> + +<pre>my $inversion = $regex_tokenizer->transform($inversion);</pre> + +<p>Take a single <a href="../../Lucy/Analysis/Inversion.html" class="podlinkpod" +>Inversion</a> as input and return an Inversion, +either the same one (presumably transformed in some way), +or a new one.</p> + +<ul> +<li><b>inversion</b> - An inversion.</li> +</ul> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Analysis::RegexTokenizer isa <a href="../../Lucy/Analysis/Analyzer.html" class="podlinkpod" +>Lucy::Analysis::Analyzer</a> isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Analysis/SnowballStemmer.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Analysis/SnowballStemmer.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Analysis/SnowballStemmer.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Analysis/SnowballStemmer.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,78 @@ +Title: Lucy::Analysis::SnowballStemmer – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Analysis::SnowballStemmer - Reduce related words to a shared root.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>my $stemmer = Lucy::Analysis::SnowballStemmer->new( language => 'es' ); + +my $polyanalyzer = Lucy::Analysis::PolyAnalyzer->new( + analyzers => [ $tokenizer, $normalizer, $stemmer ], +);</pre> + +<p>This class is a wrapper around the Snowball stemming library, +so it supports the same languages.</p> + +<h2><a class='u' 
+name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>SnowballStemmer is an <a href="../../Lucy/Analysis/Analyzer.html" class="podlinkpod" +>Analyzer</a> which reduces related words to a root form (using the “Snowball” stemming library). +For instance, +“horse”, +“horses”, +and “horsing” all become “hors” – so that a search for ‘horse’ will also match documents containing ‘horses’ and ‘horsing’.</p> + +<h2><a class='u' +name="CONSTRUCTORS" +>CONSTRUCTORS</a></h2> + +<h3><a class='u' +name="new" +>new</a></h3> + +<pre>my $stemmer = Lucy::Analysis::SnowballStemmer->new( language => 'es' );</pre> + +<p>Create a new SnowballStemmer.</p> + +<ul> +<li><b>language</b> - A two-letter ISO code identifying a language supported by Snowball.</li> +</ul> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="transform" +>transform</a></h3> + +<pre>my $inversion = $snowball_stemmer->transform($inversion);</pre> + +<p>Take a single <a href="../../Lucy/Analysis/Inversion.html" class="podlinkpod" +>Inversion</a> as input and return an Inversion, +either the same one (presumably transformed in some way), +or a new one.</p> + +<ul> +<li><b>inversion</b> - An inversion.</li> +</ul> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Analysis::SnowballStemmer isa <a href="../../Lucy/Analysis/Analyzer.html" class="podlinkpod" +>Lucy::Analysis::Analyzer</a> isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Analysis/SnowballStopFilter.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Analysis/SnowballStopFilter.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Analysis/SnowballStopFilter.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Analysis/SnowballStopFilter.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,115 @@ +Title: 
Lucy::Analysis::SnowballStopFilter – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Analysis::SnowballStopFilter - Suppress a “stoplist” of common words.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>my $stopfilter = Lucy::Analysis::SnowballStopFilter->new( + language => 'fr', +); +my $polyanalyzer = Lucy::Analysis::PolyAnalyzer->new( + analyzers => [ $tokenizer, $normalizer, $stopfilter, $stemmer ], +);</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>A “stoplist” is a collection of “stopwords”: words which are common enough to be of little value when determining search results. +For example, +so many documents in English contain “the”, +“if”, +and “maybe” that it may improve both performance and relevance to block them.</p> + +<p>Before filtering stopwords:</p> + +<pre>("i", "am", "the", "walrus")</pre> + +<p>After filtering stopwords:</p> + +<pre>("walrus")</pre> + +<p>SnowballStopFilter provides default stoplists for several languages, +courtesy of the <a href="http://snowball.tartarus.org" class="podlinkurl" +>Snowball project</a>, +or you may supply your own.</p> + +<pre>|-----------------------| +| ISO CODE | LANGUAGE | +|-----------------------| +| da | Danish | +| de | German | +| en | English | +| es | Spanish | +| fi | Finnish | +| fr | French | +| hu | Hungarian | +| it | Italian | +| nl | Dutch | +| no | Norwegian | +| pt | Portuguese | +| sv | Swedish | +| ru | Russian | +|-----------------------|</pre> + +<h2><a class='u' +name="CONSTRUCTORS" +>CONSTRUCTORS</a></h2> + +<h3><a class='u' +name="new" +>new</a></h3> + +<pre>my $stopfilter = Lucy::Analysis::SnowballStopFilter->new( + language => 'de', +); + +# or... 
+my $stopfilter = Lucy::Analysis::SnowballStopFilter->new( + stoplist => \%stoplist, +);</pre> + +<p>Create a new SnowballStopFilter.</p> + +<ul> +<li><b>stoplist</b> - A hash with stopwords as the keys.</li> + +<li><b>language</b> - The ISO code for a supported language.</li> +</ul> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="transform" +>transform</a></h3> + +<pre>my $inversion = $snowball_stop_filter->transform($inversion);</pre> + +<p>Take a single <a href="../../Lucy/Analysis/Inversion.html" class="podlinkpod" +>Inversion</a> as input and return an Inversion, +either the same one (presumably transformed in some way), +or a new one.</p> + +<ul> +<li><b>inversion</b> - An inversion.</li> +</ul> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Analysis::SnowballStopFilter isa <a href="../../Lucy/Analysis/Analyzer.html" class="podlinkpod" +>Lucy::Analysis::Analyzer</a> isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Analysis/StandardTokenizer.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Analysis/StandardTokenizer.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Analysis/StandardTokenizer.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Analysis/StandardTokenizer.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,75 @@ +Title: Lucy::Analysis::StandardTokenizer – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Analysis::StandardTokenizer - Split a string into tokens.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre>my $tokenizer = Lucy::Analysis::StandardTokenizer->new; + +# Then... 
once you have a tokenizer, put it into a PolyAnalyzer: +my $polyanalyzer = Lucy::Analysis::PolyAnalyzer->new( + analyzers => [ $tokenizer, $normalizer, $stemmer ], );</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>Generically, +“tokenizing” is a process of breaking up a string into an array of “tokens”. +For instance, +the string “three blind mice” might be tokenized into “three”, +“blind”, +“mice”.</p> + +<p>Lucy::Analysis::StandardTokenizer breaks up the text at the word boundaries defined in Unicode Standard Annex #29. +It then returns those words that contain alphabetic or numeric characters.</p> + +<h2><a class='u' +name="CONSTRUCTORS" +>CONSTRUCTORS</a></h2> + +<h3><a class='u' +name="new" +>new</a></h3> + +<pre>my $tokenizer = Lucy::Analysis::StandardTokenizer->new;</pre> + +<p>Constructor. +Takes no arguments.</p> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="transform" +>transform</a></h3> + +<pre>my $inversion = $standard_tokenizer->transform($inversion);</pre> + +<p>Take a single <a href="../../Lucy/Analysis/Inversion.html" class="podlinkpod" +>Inversion</a> as input and return an Inversion, +either the same one (presumably transformed in some way), +or a new one.</p> + +<ul> +<li><b>inversion</b> - An inversion.</li> +</ul> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + +<p>Lucy::Analysis::StandardTokenizer isa <a href="../../Lucy/Analysis/Analyzer.html" class="podlinkpod" +>Lucy::Analysis::Analyzer</a> isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Analysis/Token.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Analysis/Token.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Analysis/Token.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Analysis/Token.mdtext Wed Sep 28 
12:06:24 2016 @@ -0,0 +1,154 @@ +Title: Lucy::Analysis::Token – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Analysis::Token - Unit of text.</p> + +<h2><a class='u' +name="SYNOPSIS" +>SYNOPSIS</a></h2> + +<pre> my $token = Lucy::Analysis::Token->new( + text => 'blind', + start_offset => 8, + end_offset => 13, + ); + + $token->set_text('mice');</pre> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>Token is the fundamental unit used by Apache Lucy’s Analyzer subclasses. +Each Token has 5 attributes: <code>text</code>, +<code>start_offset</code>, +<code>end_offset</code>, +<code>boost</code>, +and <code>pos_inc</code>.</p> + +<p>The <code>text</code> attribute is a Unicode string encoded as UTF-8.</p> + +<p><code>start_offset</code> is the start point of the token text, +measured in Unicode code points from the top of the stored field; <code>end_offset</code> delimits the corresponding closing boundary. +<code>start_offset</code> and <code>end_offset</code> locate the Token within a larger context, +even if the Token’s text attribute gets modified – by stemming, +for instance. +The Token for “beating” in the text “beating a dead horse” begins life with a start_offset of 0 and an end_offset of 7; after stemming, +the text is “beat”, +but the start_offset is still 0 and the end_offset is still 7. +This allows “beating” to be highlighted correctly after a search matches “beat”.</p> + +<p><code>boost</code> is a per-token weight. +Use this when you want to assign more or less importance to a particular token, +as you might for emboldened text within an HTML document, +for example. +(Note: The field this token belongs to must be spec’d to use a posting of type RichPosting.)</p> + +<p><code>pos_inc</code> is the POSition INCrement, +measured in Tokens. +This attribute, +which defaults to 1, +is an advanced tool for manipulating phrase matching. 
+Ordinarily, +Tokens are assigned consecutive position numbers: 0, +1, +and 2 for <code>"three blind mice"</code>. +However, +if you set the position increment for “blind” to, +say, +1000, +then the three tokens will end up assigned to positions 0, +1, +and 1001 – and will no longer produce a phrase match for the query <code>"three blind mice"</code>.</p> + +<h2><a class='u' +name="CONSTRUCTORS" +>CONSTRUCTORS</a></h2> + +<h3><a class='u' +name="new" +>new</a></h3> + +<pre>my $token = Lucy::Analysis::Token->new( + text => $text, # required + start_offset => $start_offset, # required + end_offset => $end_offset, # required + boost => 1.0, # optional + pos_inc => 1, # optional +);</pre> + +<ul> +<li><b>text</b> - A string.</li> + +<li><b>start_offset</b> - Start offset into the original document in Unicode code points.</li> + +<li><b>end_offset</b> - End offset into the original document in Unicode code points.</li> + +<li><b>boost</b> - Per-token weight.</li> + +<li><b>pos_inc</b> - Position increment for phrase matching.</li> +</ul> + +<h2><a class='u' +name="METHODS" +>METHODS</a></h2> + +<h3><a class='u' +name="get_text" +>get_text</a></h3> + +<pre>my $text = $token->get_text;</pre> + +<p>Get the token's text.</p> + +<h3><a class='u' +name="set_text" +>set_text</a></h3> + +<pre>$token->set_text($text);</pre> + +<p>Set the token's text.</p> + +<h3><a class='u' +name="get_start_offset" +>get_start_offset</a></h3> + +<pre>my $int = $token->get_start_offset();</pre> + +<h3><a class='u' +name="get_end_offset" +>get_end_offset</a></h3> + +<pre>my $int = $token->get_end_offset();</pre> + +<h3><a class='u' +name="get_boost" +>get_boost</a></h3> + +<pre>my $float = $token->get_boost();</pre> + +<h3><a class='u' +name="get_pos_inc" +>get_pos_inc</a></h3> + +<pre>my $int = $token->get_pos_inc();</pre> + +<h3><a class='u' +name="get_len" +>get_len</a></h3> + +<pre>my $int = $token->get_len();</pre> + +<h2><a class='u' +name="INHERITANCE" +>INHERITANCE</a></h2> + 
+<p>Lucy::Analysis::Token isa Clownfish::Obj.</p> + +</div> Added: lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Docs/Cookbook.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Docs/Cookbook.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Docs/Cookbook.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Docs/Cookbook.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,52 @@ +Title: Lucy::Docs::Cookbook – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Docs::Cookbook - Apache Lucy recipes</p> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>The Cookbook provides thematic documentation covering some of Apache Lucy’s more sophisticated features. +For a step-by-step introduction to Lucy, +see <a href="../../Lucy/Docs/Tutorial.html" class="podlinkpod" +>Tutorial</a>.</p> + +<h3><a class='u' +name="Chapters" +>Chapters</a></h3> + +<ul> +<li><a href="../../Lucy/Docs/Cookbook/FastUpdates.html" class="podlinkpod" +>FastUpdates</a> - While index updates are fast on average, +worst-case update performance may be significantly slower. 
+To make index updates consistently quick, +we must manually intervene to control the process of index segment consolidation.</li> + +<li><a href="../../Lucy/Docs/Cookbook/CustomQuery.html" class="podlinkpod" +>CustomQuery</a> - Explore Lucy’s support for custom query types by creating a “PrefixQuery” class to handle trailing wildcards.</li> + +<li><a href="../../Lucy/Docs/Cookbook/CustomQueryParser.html" class="podlinkpod" +>CustomQueryParser</a> - Define your own custom search query syntax using <a href="../../Lucy/Search/QueryParser.html" class="podlinkpod" +>QueryParser</a> and Parse::RecDescent.</li> +</ul> + +<h3><a class='u' +name="Materials" +>Materials</a></h3> + +<p>Some of the recipes in the Cookbook reference the completed <a href="../../Lucy/Docs/Tutorial.html" class="podlinkpod" +>Tutorial</a> application. +These materials can be found in the <code>sample</code> directory at the root of the Lucy distribution:</p> + +<pre>sample/indexer.pl # indexing app +sample/search.cgi # search app +sample/us_constitution # corpus</pre> + +</div> Added: lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Docs/Cookbook/CustomQuery.mdtext URL: http://svn.apache.org/viewvc/lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Docs/Cookbook/CustomQuery.mdtext?rev=1762636&view=auto ============================================================================== --- lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Docs/Cookbook/CustomQuery.mdtext (added) +++ lucy/site/trunk/content/docs/0.5.0/perl/Lucy/Docs/Cookbook/CustomQuery.mdtext Wed Sep 28 12:06:24 2016 @@ -0,0 +1,321 @@ +Title: Lucy::Docs::Cookbook::CustomQuery – Apache Lucy Documentation + +<div> +<a name='___top' class='dummyTopAnchor' ></a> + +<h2><a class='u' +name="NAME" +>NAME</a></h2> + +<p>Lucy::Docs::Cookbook::CustomQuery - Sample subclass of Query</p> + +<h2><a class='u' +name="DESCRIPTION" +>DESCRIPTION</a></h2> + +<p>Explore Apache Lucy’s support for custom query types by creating a “PrefixQuery” class to handle trailing 
wildcards.</p> + +<pre>my $prefix_query = PrefixQuery->new( + field => 'content', + query_string => 'foo*', +); +my $hits = $searcher->hits( query => $prefix_query ); +...</pre> + +<h3><a class='u' +name="Query,_Compiler,_and_Matcher" +>Query, +Compiler, +and Matcher</a></h3> + +<p>To add support for a new query type, +we need three classes: a Query, +a Compiler, +and a Matcher.</p> + +<ul> +<li>PrefixQuery - a subclass of <a href="../../../Lucy/Search/Query.html" class="podlinkpod" +>Query</a>, +and the only class that client code will deal with directly.</li> + +<li>PrefixCompiler - a subclass of <a href="../../../Lucy/Search/Compiler.html" class="podlinkpod" +>Compiler</a>, +whose primary role is to compile a PrefixQuery to a PrefixMatcher.</li> + +<li>PrefixMatcher - a subclass of <a href="../../../Lucy/Search/Matcher.html" class="podlinkpod" +>Matcher</a>, +which does the heavy lifting: it applies the query to individual documents and assigns a score to each match.</li> +</ul> + +<p>The PrefixQuery class on its own isn’t enough because a Query object’s role is limited to expressing an abstract specification for the search. +A Query is basically nothing but metadata; execution is left to the Query’s companion Compiler and Matcher.</p> + +<p>Here’s a simplified sketch illustrating how a Searcher’s hits() method ties together the three classes.</p> + +<pre>sub hits { + my ( $self, $query ) = @_; + my $compiler = $query->make_compiler( + searcher => $self, + boost => $query->get_boost, + ); + my $matcher = $compiler->make_matcher( + reader => $self->get_reader, + need_score => 1, + ); + my @hits = $matcher->capture_hits; + return \@hits; +}</pre> + +<h4><a class='u' +name="PrefixQuery" +>PrefixQuery</a></h4> + +<p>Our PrefixQuery class will have two attributes: a query string and a field name.</p> + +<pre>package PrefixQuery; +use base qw( Lucy::Search::Query ); +use Carp; +use Scalar::Util qw( blessed ); + +# Inside-out member vars and hand-rolled accessors. 
+my %query_string; +my %field; +sub get_query_string { my $self = shift; return $query_string{$$self} } +sub get_field { my $self = shift; return $field{$$self} }</pre> + +<p>PrefixQuery’s constructor collects and validates the attributes.</p> + +<pre>sub new { + my ( $class, %args ) = @_; + my $query_string = delete $args{query_string}; + my $field = delete $args{field}; + my $self = $class->SUPER::new(%args); + confess("'query_string' param is required") + unless defined $query_string; + confess("Invalid query_string: '$query_string'") + unless $query_string =~ /\*\s*$/; + confess("'field' param is required") + unless defined $field; + $query_string{$$self} = $query_string; + $field{$$self} = $field; + return $self; +}</pre> + +<p>Since this is an inside-out class, +we’ll need a destructor:</p> + +<pre>sub DESTROY { + my $self = shift; + delete $query_string{$$self}; + delete $field{$$self}; + $self->SUPER::DESTROY; +}</pre> + +<p>The equals() method determines whether two Queries are logically equivalent:</p> + +<pre>sub equals { + my ( $self, $other ) = @_; + return 0 unless blessed($other); + return 0 unless $other->isa("PrefixQuery"); + return 0 unless $field{$$self} eq $field{$$other}; + return 0 unless $query_string{$$self} eq $query_string{$$other}; + return 1; +}</pre> + +<p>The last thing we’ll need is a make_compiler() factory method which kicks out a subclass of <a href="../../../Lucy/Search/Compiler.html" class="podlinkpod" +>Compiler</a>.</p> + +<pre>sub make_compiler { + my ( $self, %args ) = @_; + my $subordinate = delete $args{subordinate}; + my $compiler = PrefixCompiler->new( %args, parent => $self ); + $compiler->normalize unless $subordinate; + return $compiler; +}</pre> + +<h4><a class='u' +name="PrefixCompiler" +>PrefixCompiler</a></h4> + +<p>PrefixQuery’s make_compiler() method will be called internally at search-time by objects which subclass <a href="../../../Lucy/Search/Searcher.html" class="podlinkpod" +>Searcher</a> – such as <a 
href="../../../Lucy/Search/IndexSearcher.html" class="podlinkpod" +>IndexSearchers</a>.</p> + +<p>A Searcher is associated with a particular collection of documents. +These documents may all reside in one index, +as with IndexSearcher, +or they may be spread out across multiple indexes on one or more machines, +as with LucyX::Remote::ClusterSearcher.</p> + +<p>Searcher objects have access to certain statistical information about the collections they represent; for instance, +a Searcher can tell you how many documents are in the collection…</p> + +<pre>my $maximum_number_of_docs_in_collection = $searcher->doc_max;</pre> + +<p>… or how many documents a specific term appears in:</p> + +<pre>my $term_appears_in_this_many_docs = $searcher->doc_freq( + field => 'content', + term => 'foo', +);</pre> + +<p>Such information can be used by sophisticated Compiler implementations to assign more or less heft to individual queries or sub-queries. +However, +we’re not going to bother with weighting for this demo; we’ll just assign a fixed score of 1.0 to each matching document.</p> + +<p>We don’t need to write a constructor, +as it will suffice to inherit new() from Lucy::Search::Compiler. +The only method we need to implement for PrefixCompiler is make_matcher().</p> + +<pre>package PrefixCompiler; +use base qw( Lucy::Search::Compiler ); + +sub make_matcher { + my ( $self, %args ) = @_; + my $seg_reader = $args{reader}; + + # Retrieve low-level components LexiconReader and PostingListReader. + my $lex_reader + = $seg_reader->obtain("Lucy::Index::LexiconReader"); + my $plist_reader + = $seg_reader->obtain("Lucy::Index::PostingListReader"); + + # Acquire a Lexicon and seek it to our query string. 
+ my $substring = $self->get_parent->get_query_string; + $substring =~ s/\*\s*$//; + my $field = $self->get_parent->get_field; + my $lexicon = $lex_reader->lexicon( field => $field ); + return unless $lexicon; + $lexicon->seek($substring); + + # Accumulate PostingLists for each matching term. + my @posting_lists; + while ( defined( my $term = $lexicon->get_term ) ) { + last unless $term =~ /^\Q$substring/; + my $posting_list = $plist_reader->posting_list( + field => $field, + term => $term, + ); + if ($posting_list) { + push @posting_lists, $posting_list; + } + last unless $lexicon->next; + } + return unless @posting_lists; + + return PrefixMatcher->new( posting_lists => \@posting_lists ); +}</pre> + +<p>PrefixCompiler gets access to a <a href="../../../Lucy/Index/SegReader.html" class="podlinkpod" +>SegReader</a> object when make_matcher() gets called. +From the SegReader and its sub-components <a href="../../../Lucy/Index/LexiconReader.html" class="podlinkpod" +>LexiconReader</a> and <a href="../../../Lucy/Index/PostingListReader.html" class="podlinkpod" +>PostingListReader</a>, +we acquire a <a href="../../../Lucy/Index/Lexicon.html" class="podlinkpod" +>Lexicon</a>, +scan through the Lexicon’s unique terms, +and acquire a <a href="../../../Lucy/Index/PostingList.html" class="podlinkpod" +>PostingList</a> for each term that matches our prefix.</p> + +<p>Each of these PostingList objects represents a set of documents which match the query.</p> + +<h4><a class='u' +name="PrefixMatcher" +>PrefixMatcher</a></h4> + +<p>The Matcher subclass is the most involved.</p> + +<pre>package PrefixMatcher; +use base qw( Lucy::Search::Matcher ); + +# Inside-out member vars. +my %doc_ids; +my %tick; + +sub new { + my ( $class, %args ) = @_; + my $posting_lists = delete $args{posting_lists}; + my $self = $class->SUPER::new(%args); + + # Cheesy but simple way of interleaving PostingList doc sets. 
+ my %all_doc_ids; + for my $posting_list (@$posting_lists) { + while ( my $doc_id = $posting_list->next ) { + $all_doc_ids{$doc_id} = undef; + } + } + my @doc_ids = sort { $a <=> $b } keys %all_doc_ids; + $doc_ids{$$self} = \@doc_ids; + + # Track our position within the array of doc ids. + $tick{$$self} = -1; + + return $self; +} + +sub DESTROY { + my $self = shift; + delete $doc_ids{$$self}; + delete $tick{$$self}; + $self->SUPER::DESTROY; +}</pre> + +<p>The doc ids must be in order, +or some will be ignored; hence the <code>sort</code> above.</p> + +<p>In addition to the constructor and destructor, +there are three methods that must be overridden.</p> + +<p>next() advances the Matcher to the next valid matching doc.</p> + +<pre>sub next { + my $self = shift; + my $doc_ids = $doc_ids{$$self}; + my $tick = ++$tick{$$self}; + return 0 if $tick >= scalar @$doc_ids; + return $doc_ids->[$tick]; +}</pre> + +<p>get_doc_id() returns the current document id, +or 0 if the Matcher is exhausted. +(<a href="../../../Lucy/Docs/DocIDs.html" class="podlinkpod" +>Document numbers</a> start at 1, +so 0 is a sentinel.)</p> + +<pre>sub get_doc_id { + my $self = shift; + my $tick = $tick{$$self}; + my $doc_ids = $doc_ids{$$self}; + return $tick < scalar @$doc_ids ? $doc_ids->[$tick] : 0; +}</pre> + +<p>score() conveys the relevance score of the current match. 
+We’ll just return a fixed score of 1.0:</p> + +<pre>sub score { 1.0 }</pre> + +<h3><a class='u' +name="Usage" +>Usage</a></h3> + +<p>To get a basic feel for PrefixQuery, +insert the FlatQueryParser module described in <a href="../../../Lucy/Docs/Cookbook/CustomQueryParser.html" class="podlinkpod" +>CustomQueryParser</a> (which supports PrefixQuery) into the search.cgi sample app.</p> + +<pre>my $parser = FlatQueryParser->new( schema => $searcher->get_schema ); +my $query = $parser->parse($q);</pre> + +<p>If you’re planning on using PrefixQuery in earnest, +though, +you may want to change up analyzers to avoid stemming, +because stemming – another approach to prefix conflation – is not perfectly compatible with prefix searches.</p> + +<pre># Polyanalyzer with no SnowballStemmer. +my $analyzer = Lucy::Analysis::PolyAnalyzer->new( + analyzers => [ + Lucy::Analysis::StandardTokenizer->new, + Lucy::Analysis::Normalizer->new, + ], +);</pre> + +</div>