Added: websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Analysis/SnowballStopFilter.html ============================================================================== --- websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Analysis/SnowballStopFilter.html (added) +++ websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Analysis/SnowballStopFilter.html Wed Sep 28 12:07:48 2016 @@ -0,0 +1,296 @@ +<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> +<html lang="en"> + <head> + <meta http-equiv="Content-Type" content="text/html;charset=UTF-8"> + <title>Lucy::Analysis::SnowballStopFilter – C API Documentation</title> + <link rel="stylesheet" type="text/css" media="screen" href="/css/lucy.css"> + </head> + + <body> + + <div id="lucy-rigid_wrapper"> + + <div id="lucy-top" class="container_16 lucy-white_box_3d"> + + <div id="lucy-logo_box" class="grid_8"> + <a href="/"><img src="/images/lucy_logo_150x100.png" alt="Apache Lucy™"></a> + </div> <!-- lucy-logo_box --> + + <div #id="lucy-top_nav_box" class="grid_8"> + <div id="lucy-top_nav_bar" class="container_8"> + <ul> + <li><a href="http://www.apache.org/" title="Apache Software Foundation">Apache Software Foundation</a></li> + <li><a href="http://www.apache.org/licenses/" title="License">License</a></li> + <li><a href="http://www.apache.org/foundation/sponsorship.html" title="Sponsorship">Sponsorship</a></li> + <li><a href="http://www.apache.org/foundation/thanks.html" title="Thanks">Thanks</a></li> + <li><a href="http://www.apache.org/security/ " title="Security">Security</a></li> + </ul> + </div> <!-- lucy-top_nav_bar --> + <p><a href="http://www.apache.org/">Apache</a> » <a href="/">Lucy</a> » <a href="/docs/">Docs</a> » <a href="/docs/0.5.0/">0.5.0</a> » <a href="/docs/0.5.0/c/">C</a> » <a href="/docs/0.5.0/c/Lucy/">Lucy</a> » <a href="/docs/0.5.0/c/Lucy/Analysis/">Analysis</a></p> + <form name="lucy-top_search_box" id="lucy-top_search_box" action="http://www.google.com/search" method="get"> +
<input value="*.apache.org" name="sitesearch" type="hidden"/> + <input type="text" name="q" id="query" style="width:85%"> + <input type="submit" id="submit" value="Search"> + </form> + </div> <!-- lucy-top_nav_box --> + + <div class="clear"></div> + + </div> <!-- lucy-top --> + + <div id="lucy-main_content" class="container_16 lucy-white_box_3d"> + + <div class="grid_4" id="lucy-left_nav_box"> + <h6>About</h6> + <ul> + <li><a href="/">Welcome</a></li> + <li><a href="/clownfish.html">Clownfish</a></li> + <li><a href="/faq.html">FAQ</a></li> + <li><a href="/people.html">People</a></li> + </ul> + <h6>Resources</h6> + <ul> + <li><a href="/download.html">Download</a></li> + <li><a href="/mailing_lists.html">Mailing Lists</a></li> + <li><a href="/docs/">Documentation</a></li> + <li><a href="http://wiki.apache.org/lucy/">Wiki</a></li> + <li><a href="https://issues.apache.org/jira/browse/LUCY">Issue Tracker</a></li> + <li><a href="/version_control.html">Version Control</a></li> + </ul> + <h6>Related Projects</h6> + <ul> + <li><a href="http://lucene.apache.org/core/">Lucene</a></li> + <li><a href="http://dezi.org/">Dezi</a></li> + <li><a href="http://lucene.apache.org/solr/">Solr</a></li> + <li><a href="http://lucenenet.apache.org/">Lucene.NET</a></li> + <li><a href="http://lucene.apache.org/pylucene/">PyLucene</a></li> + </ul> + </div> <!-- lucy-left_nav_box --> + + <div id="lucy-main_content_box" class="grid_9"> + <div class="c-api"> +<h2>Lucy::Analysis::SnowballStopFilter</h2> +<table> +<tr> +<td class="label">parcel</td> +<td><a href="../../lucy.html">Lucy</a></td> +</tr> +<tr> +<td class="label">class variable</td> +<td><code><span class="prefix">LUCY_</span>SNOWBALLSTOPFILTER</code></td> +</tr> +<tr> +<td class="label">struct symbol</td> +<td><code><span class="prefix">lucy_</span>SnowballStopFilter</code></td> +</tr> +<tr> +<td class="label">class nickname</td> +<td><code><span class="prefix">lucy_</span>SnowStop</code></td> +</tr> +<tr> +<td class="label">header 
file</td> +<td><code>Lucy/Analysis/SnowballStopFilter.h</code></td> +</tr> +</table> +<h3>Name</h3> +<p>Lucy::Analysis::SnowballStopFilter – Suppress a “stoplist” of common words.</p> +<h3>Description</h3> +<p>A “stoplist” is a collection of “stopwords”: words which are common enough to +be of little value when determining search results. For example, so many +documents in English contain “the”, “if”, and “maybe” that it may improve +both performance and relevance to block them.</p> +<p>Before filtering stopwords:</p> +<pre><code>("i", "am", "the", "walrus") +</code></pre> +<p>After filtering stopwords:</p> +<pre><code>("walrus") +</code></pre> +<p>SnowballStopFilter provides default stoplists for several languages, +courtesy of the <a href="http://snowball.tartarus.org">Snowball project</a>, or you may +supply your own.</p> +<pre><code>|-----------------------| +| ISO CODE | LANGUAGE | +|-----------------------| +| da | Danish | +| de | German | +| en | English | +| es | Spanish | +| fi | Finnish | +| fr | French | +| hu | Hungarian | +| it | Italian | +| nl | Dutch | +| no | Norwegian | +| pt | Portuguese | +| sv | Swedish | +| ru | Russian | +|-----------------------| +</code></pre> +<h3>Functions</h3> +<dl> +<dt id="func_new">new</dt> +<dd> +<pre><code><span class="prefix">lucy_</span>SnowballStopFilter* <span class="comment">// incremented</span> +<span class="prefix">lucy_</span><strong>SnowStop_new</strong>( + <span class="prefix">cfish_</span><a href="../../Clownfish/String.html">String</a> *<strong>language</strong>, + <span class="prefix">cfish_</span><a href="../../Clownfish/Hash.html">Hash</a> *<strong>stoplist</strong> +); +</code></pre> +<p>Create a new SnowballStopFilter.</p> +<dl> +<dt>stoplist</dt> +<dd><p>A hash with stopwords as the keys.</p> +</dd> +<dt>language</dt> +<dd><p>The ISO code for a supported language.</p> +</dd> +</dl> +</dd> +<dt id="func_init">init</dt> +<dd> +<pre><code><span class="prefix">lucy_</span>SnowballStopFilter* +<span 
class="prefix">lucy_</span><strong>SnowStop_init</strong>( + <span class="prefix">lucy_</span>SnowballStopFilter *<strong>self</strong>, + <span class="prefix">cfish_</span><a href="../../Clownfish/String.html">String</a> *<strong>language</strong>, + <span class="prefix">cfish_</span><a href="../../Clownfish/Hash.html">Hash</a> *<strong>stoplist</strong> +); +</code></pre> +<p>Initialize a SnowballStopFilter.</p> +<dl> +<dt>stoplist</dt> +<dd><p>A hash with stopwords as the keys.</p> +</dd> +<dt>language</dt> +<dd><p>The ISO code for a supported language.</p> +</dd> +</dl> +</dd> +</dl> +<h3>Methods</h3> +<dl> +<dt id="func_Transform">Transform</dt> +<dd> +<pre><code><span class="prefix">lucy_</span><a href="../../Lucy/Analysis/Inversion.html">Inversion</a>* <span class="comment">// incremented</span> +<span class="prefix">lucy_</span><strong>SnowStop_Transform</strong>( + <span class="prefix">lucy_</span>SnowballStopFilter *<strong>self</strong>, + <span class="prefix">lucy_</span><a href="../../Lucy/Analysis/Inversion.html">Inversion</a> *<strong>inversion</strong> +); +</code></pre> +<p>Take a single <a href="../../Lucy/Analysis/Inversion.html">Inversion</a> as input +and return an Inversion, either the same one (presumably transformed +in some way), or a new one.</p> +<dl> +<dt>inversion</dt> +<dd><p>An inversion.</p> +</dd> +</dl> +</dd> +<dt id="func_Equals">Equals</dt> +<dd> +<pre><code>bool +<span class="prefix">lucy_</span><strong>SnowStop_Equals</strong>( + <span class="prefix">lucy_</span>SnowballStopFilter *<strong>self</strong>, + <span class="prefix">cfish_</span><a href="../../Clownfish/Obj.html">Obj</a> *<strong>other</strong> +); +</code></pre> +<p>Indicate whether two objects are the same. 
By default, compares the +memory address.</p> +<dl> +<dt>other</dt> +<dd><p>Another Obj.</p> +</dd> +</dl> +</dd> +<dt id="func_Dump">Dump</dt> +<dd> +<pre><code><span class="prefix">cfish_</span><a href="../../Clownfish/Obj.html">Obj</a>* <span class="comment">// incremented</span> +<span class="prefix">lucy_</span><strong>SnowStop_Dump</strong>( + <span class="prefix">lucy_</span>SnowballStopFilter *<strong>self</strong> +); +</code></pre> +<p>Dump the analyzer as a hash.</p> +<p>Subclasses should call <a href="../../Lucy/Analysis/SnowballStopFilter.html#func_Dump">Dump()</a> on the superclass. The returned +object is a hash which should be populated with parameters of +the analyzer.</p> +<p><strong>Returns:</strong> A hash containing a description of the analyzer.</p> +</dd> +<dt id="func_Load">Load</dt> +<dd> +<pre><code><span class="prefix">cfish_</span><a href="../../Clownfish/Obj.html">Obj</a>* <span class="comment">// incremented</span> +<span class="prefix">lucy_</span><strong>SnowStop_Load</strong>( + <span class="prefix">lucy_</span>SnowballStopFilter *<strong>self</strong>, + <span class="prefix">cfish_</span><a href="../../Clownfish/Obj.html">Obj</a> *<strong>dump</strong> +); +</code></pre> +<p>Reconstruct an analyzer from a dump.</p> +<p>Subclasses should first call <a href="../../Lucy/Analysis/SnowballStopFilter.html#func_Load">Load()</a> on the superclass. 
The +returned object is an analyzer which should be reconstructed by +setting the dumped parameters from the hash contained in <code>dump</code>.</p> +<p>Note that the invocant analyzer is unused.</p> +<dl> +<dt>dump</dt> +<dd><p>A hash.</p> +</dd> +</dl> +<p><strong>Returns:</strong> An analyzer.</p> +</dd> +</dl> +<h4>Methods inherited from Lucy::Analysis::Analyzer</h4> +<dl> +<dt id="func_Transform_Text">Transform_Text</dt> +<dd> +<pre><code><span class="prefix">lucy_</span><a href="../../Lucy/Analysis/Inversion.html">Inversion</a>* <span class="comment">// incremented</span> +<span class="prefix">lucy_</span><strong>SnowStop_Transform_Text</strong>( + <span class="prefix">lucy_</span>SnowballStopFilter *<strong>self</strong>, + <span class="prefix">cfish_</span><a href="../../Clownfish/String.html">String</a> *<strong>text</strong> +); +</code></pre> +<p>Kick off an analysis chain, creating an Inversion from string input. +The default implementation simply creates an initial Inversion with a +single Token, then calls <a href="../../Lucy/Analysis/SnowballStopFilter.html#func_Transform">Transform()</a>, but occasionally subclasses will +provide an optimized implementation which minimizes string copies.</p> +<dl> +<dt>text</dt> +<dd><p>A string.</p> +</dd> +</dl> +</dd> +<dt id="func_Split">Split</dt> +<dd> +<pre><code><span class="prefix">cfish_</span><a href="../../Clownfish/Vector.html">Vector</a>* <span class="comment">// incremented</span> +<span class="prefix">lucy_</span><strong>SnowStop_Split</strong>( + <span class="prefix">lucy_</span>SnowballStopFilter *<strong>self</strong>, + <span class="prefix">cfish_</span><a href="../../Clownfish/String.html">String</a> *<strong>text</strong> +); +</code></pre> +<p>Analyze text and return an array of token texts.</p> +<dl> +<dt>text</dt> +<dd><p>A string.</p> +</dd> +</dl> +</dd> +</dl> +<h3>Inheritance</h3> +<p>Lucy::Analysis::SnowballStopFilter is a <a 
href="../../Lucy/Analysis/Analyzer.html">Lucy::Analysis::Analyzer</a> is a <a href="../../Clownfish/Obj.html">Clownfish::Obj</a>.</p> +</div> + + </div> <!-- lucy-main_content_box --> + <div class="clear"></div> + + </div> <!-- lucy-main_content --> + + <div id="lucy-copyright" class="container_16"> + <p>Copyright © 2010-2015 The Apache Software Foundation, Licensed under the + <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0</a>. + <br/> + Apache Lucy, Lucy, Apache, the Apache feather logo, and the Apache Lucy project logo are trademarks of The + Apache Software Foundation. All other marks mentioned may be trademarks or registered trademarks of their + respective owners. + </p> + </div> <!-- lucy-copyright --> + + </div> <!-- lucy-rigid_wrapper --> + + </body> +</html>
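The stoplist filtering described in the SnowballStopFilter.html page above can be sketched in plain C. This is only a minimal illustration, assuming tokens held as an array of C strings and a small hand-written stoplist; is_stopword and filter_stopwords are hypothetical helpers, not part of the Lucy API. The documented filter instead works on an Inversion and takes a Clownfish Hash with stopwords as the keys.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a Lucy function): returns 1 if `word` is in
 * `stoplist`, 0 otherwise. A linear scan keeps the sketch short; a hash
 * lookup would be the natural choice for a large stoplist. */
int is_stopword(const char *word, const char *const *stoplist,
                size_t stoplist_len) {
    for (size_t i = 0; i < stoplist_len; i++) {
        if (strcmp(word, stoplist[i]) == 0) {
            return 1;
        }
    }
    return 0;
}

/* Hypothetical helper (not a Lucy function): compacts `tokens` in place,
 * keeping the non-stopwords in their original order. Returns the number
 * of tokens kept; slots past that count are left unspecified. */
size_t filter_stopwords(const char **tokens, size_t num_tokens,
                        const char *const *stoplist, size_t stoplist_len) {
    assert(tokens != NULL && stoplist != NULL);
    size_t kept = 0;
    for (size_t i = 0; i < num_tokens; i++) {
        if (!is_stopword(tokens[i], stoplist, stoplist_len)) {
            tokens[kept++] = tokens[i];
        }
    }
    return kept;
}
```

Called on ("i", "am", "the", "walrus") with the stoplist ("i", "am", "the"), filter_stopwords keeps one token, "walrus", which matches the before/after example in the Description section.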
Added: websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Analysis/StandardTokenizer.html ============================================================================== --- websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Analysis/StandardTokenizer.html (added) +++ websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Analysis/StandardTokenizer.html Wed Sep 28 12:07:48 2016 @@ -0,0 +1,250 @@ +<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> +<html lang="en"> + <head> + <meta http-equiv="Content-Type" content="text/html;charset=UTF-8"> + <title>Lucy::Analysis::StandardTokenizer – C API Documentation</title> + <link rel="stylesheet" type="text/css" media="screen" href="/css/lucy.css"> + </head> + + <body> + + <div id="lucy-rigid_wrapper"> + + <div id="lucy-top" class="container_16 lucy-white_box_3d"> + + <div id="lucy-logo_box" class="grid_8"> + <a href="/"><img src="/images/lucy_logo_150x100.png" alt="Apache Lucy™"></a> + </div> <!-- lucy-logo_box --> + + <div #id="lucy-top_nav_box" class="grid_8"> + <div id="lucy-top_nav_bar" class="container_8"> + <ul> + <li><a href="http://www.apache.org/" title="Apache Software Foundation">Apache Software Foundation</a></li> + <li><a href="http://www.apache.org/licenses/" title="License">License</a></li> + <li><a href="http://www.apache.org/foundation/sponsorship.html" title="Sponsorship">Sponsorship</a></li> + <li><a href="http://www.apache.org/foundation/thanks.html" title="Thanks">Thanks</a></li> + <li><a href="http://www.apache.org/security/ " title="Security">Security</a></li> + </ul> + </div> <!-- lucy-top_nav_bar --> + <p><a href="http://www.apache.org/">Apache</a> » <a href="/">Lucy</a> » <a href="/docs/">Docs</a> » <a href="/docs/0.5.0/">0.5.0</a> » <a href="/docs/0.5.0/c/">C</a> » <a href="/docs/0.5.0/c/Lucy/">Lucy</a> » <a href="/docs/0.5.0/c/Lucy/Analysis/">Analysis</a></p> + <form name="lucy-top_search_box" id="lucy-top_search_box" action="http://www.google.com/search" method="get"> + <input 
value="*.apache.org" name="sitesearch" type="hidden"/> + <input type="text" name="q" id="query" style="width:85%"> + <input type="submit" id="submit" value="Search"> + </form> + </div> <!-- lucy-top_nav_box --> + + <div class="clear"></div> + + </div> <!-- lucy-top --> + + <div id="lucy-main_content" class="container_16 lucy-white_box_3d"> + + <div class="grid_4" id="lucy-left_nav_box"> + <h6>About</h6> + <ul> + <li><a href="/">Welcome</a></li> + <li><a href="/clownfish.html">Clownfish</a></li> + <li><a href="/faq.html">FAQ</a></li> + <li><a href="/people.html">People</a></li> + </ul> + <h6>Resources</h6> + <ul> + <li><a href="/download.html">Download</a></li> + <li><a href="/mailing_lists.html">Mailing Lists</a></li> + <li><a href="/docs/">Documentation</a></li> + <li><a href="http://wiki.apache.org/lucy/">Wiki</a></li> + <li><a href="https://issues.apache.org/jira/browse/LUCY">Issue Tracker</a></li> + <li><a href="/version_control.html">Version Control</a></li> + </ul> + <h6>Related Projects</h6> + <ul> + <li><a href="http://lucene.apache.org/core/">Lucene</a></li> + <li><a href="http://dezi.org/">Dezi</a></li> + <li><a href="http://lucene.apache.org/solr/">Solr</a></li> + <li><a href="http://lucenenet.apache.org/">Lucene.NET</a></li> + <li><a href="http://lucene.apache.org/pylucene/">PyLucene</a></li> + </ul> + </div> <!-- lucy-left_nav_box --> + + <div id="lucy-main_content_box" class="grid_9"> + <div class="c-api"> +<h2>Lucy::Analysis::StandardTokenizer</h2> +<table> +<tr> +<td class="label">parcel</td> +<td><a href="../../lucy.html">Lucy</a></td> +</tr> +<tr> +<td class="label">class variable</td> +<td><code><span class="prefix">LUCY_</span>STANDARDTOKENIZER</code></td> +</tr> +<tr> +<td class="label">struct symbol</td> +<td><code><span class="prefix">lucy_</span>StandardTokenizer</code></td> +</tr> +<tr> +<td class="label">class nickname</td> +<td><code><span class="prefix">lucy_</span>StandardTokenizer</code></td> +</tr> +<tr> +<td class="label">header 
file</td> +<td><code>Lucy/Analysis/StandardTokenizer.h</code></td> +</tr> +</table> +<h3>Name</h3> +<p>Lucy::Analysis::StandardTokenizer – Split a string into tokens.</p> +<h3>Description</h3> +<p>Generically, “tokenizing” is a process of breaking up a string into an +array of “tokens”. For instance, the string “three blind mice” might be +tokenized into “three”, “blind”, “mice”.</p> +<p>Lucy::Analysis::StandardTokenizer breaks up the text at the word +boundaries defined in Unicode Standard Annex #29. It then returns those +words that contain alphabetic or numeric characters.</p> +<h3>Functions</h3> +<dl> +<dt id="func_new">new</dt> +<dd> +<pre><code><span class="prefix">lucy_</span>StandardTokenizer* <span class="comment">// incremented</span> +<span class="prefix">lucy_</span><strong>StandardTokenizer_new</strong>(void); +</code></pre> +<p>Constructor. Takes no arguments.</p> +</dd> +<dt id="func_init">init</dt> +<dd> +<pre><code><span class="prefix">lucy_</span>StandardTokenizer* +<span class="prefix">lucy_</span><strong>StandardTokenizer_init</strong>( + <span class="prefix">lucy_</span>StandardTokenizer *<strong>self</strong> +); +</code></pre> +<p>Initialize a StandardTokenizer.</p> +</dd> +</dl> +<h3>Methods</h3> +<dl> +<dt id="func_Transform">Transform</dt> +<dd> +<pre><code><span class="prefix">lucy_</span><a href="../../Lucy/Analysis/Inversion.html">Inversion</a>* <span class="comment">// incremented</span> +<span class="prefix">lucy_</span><strong>StandardTokenizer_Transform</strong>( + <span class="prefix">lucy_</span>StandardTokenizer *<strong>self</strong>, + <span class="prefix">lucy_</span><a href="../../Lucy/Analysis/Inversion.html">Inversion</a> *<strong>inversion</strong> +); +</code></pre> +<p>Take a single <a href="../../Lucy/Analysis/Inversion.html">Inversion</a> as input +and return an Inversion, either the same one (presumably transformed +in some way), or a new one.</p> +<dl> +<dt>inversion</dt> +<dd><p>An inversion.</p> +</dd> +</dl> 
+</dd> +<dt id="func_Transform_Text">Transform_Text</dt> +<dd> +<pre><code><span class="prefix">lucy_</span><a href="../../Lucy/Analysis/Inversion.html">Inversion</a>* <span class="comment">// incremented</span> +<span class="prefix">lucy_</span><strong>StandardTokenizer_Transform_Text</strong>( + <span class="prefix">lucy_</span>StandardTokenizer *<strong>self</strong>, + <span class="prefix">cfish_</span><a href="../../Clownfish/String.html">String</a> *<strong>text</strong> +); +</code></pre> +<p>Kick off an analysis chain, creating an Inversion from string input. +The default implementation simply creates an initial Inversion with a +single Token, then calls <a href="../../Lucy/Analysis/StandardTokenizer.html#func_Transform">Transform()</a>, but occasionally subclasses will +provide an optimized implementation which minimizes string copies.</p> +<dl> +<dt>text</dt> +<dd><p>A string.</p> +</dd> +</dl> +</dd> +<dt id="func_Equals">Equals</dt> +<dd> +<pre><code>bool +<span class="prefix">lucy_</span><strong>StandardTokenizer_Equals</strong>( + <span class="prefix">lucy_</span>StandardTokenizer *<strong>self</strong>, + <span class="prefix">cfish_</span><a href="../../Clownfish/Obj.html">Obj</a> *<strong>other</strong> +); +</code></pre> +<p>Indicate whether two objects are the same. 
By default, compares the +memory address.</p> +<dl> +<dt>other</dt> +<dd><p>Another Obj.</p> +</dd> +</dl> +</dd> +</dl> +<h4>Methods inherited from Lucy::Analysis::Analyzer</h4> +<dl> +<dt id="func_Split">Split</dt> +<dd> +<pre><code><span class="prefix">cfish_</span><a href="../../Clownfish/Vector.html">Vector</a>* <span class="comment">// incremented</span> +<span class="prefix">lucy_</span><strong>StandardTokenizer_Split</strong>( + <span class="prefix">lucy_</span>StandardTokenizer *<strong>self</strong>, + <span class="prefix">cfish_</span><a href="../../Clownfish/String.html">String</a> *<strong>text</strong> +); +</code></pre> +<p>Analyze text and return an array of token texts.</p> +<dl> +<dt>text</dt> +<dd><p>A string.</p> +</dd> +</dl> +</dd> +<dt id="func_Dump">Dump</dt> +<dd> +<pre><code><span class="prefix">cfish_</span><a href="../../Clownfish/Obj.html">Obj</a>* <span class="comment">// incremented</span> +<span class="prefix">lucy_</span><strong>StandardTokenizer_Dump</strong>( + <span class="prefix">lucy_</span>StandardTokenizer *<strong>self</strong> +); +</code></pre> +<p>Dump the analyzer as a hash.</p> +<p>Subclasses should call <a href="../../Lucy/Analysis/StandardTokenizer.html#func_Dump">Dump()</a> on the superclass. 
The returned +object is a hash which should be populated with parameters of +the analyzer.</p> +<p><strong>Returns:</strong> A hash containing a description of the analyzer.</p> +</dd> +<dt id="func_Load">Load</dt> +<dd> +<pre><code><span class="prefix">cfish_</span><a href="../../Clownfish/Obj.html">Obj</a>* <span class="comment">// incremented</span> +<span class="prefix">lucy_</span><strong>StandardTokenizer_Load</strong>( + <span class="prefix">lucy_</span>StandardTokenizer *<strong>self</strong>, + <span class="prefix">cfish_</span><a href="../../Clownfish/Obj.html">Obj</a> *<strong>dump</strong> +); +</code></pre> +<p>Reconstruct an analyzer from a dump.</p> +<p>Subclasses should first call <a href="../../Lucy/Analysis/StandardTokenizer.html#func_Load">Load()</a> on the superclass. The +returned object is an analyzer which should be reconstructed by +setting the dumped parameters from the hash contained in <code>dump</code>.</p> +<p>Note that the invocant analyzer is unused.</p> +<dl> +<dt>dump</dt> +<dd><p>A hash.</p> +</dd> +</dl> +<p><strong>Returns:</strong> An analyzer.</p> +</dd> +</dl> +<h3>Inheritance</h3> +<p>Lucy::Analysis::StandardTokenizer is a <a href="../../Lucy/Analysis/Analyzer.html">Lucy::Analysis::Analyzer</a> is a <a href="../../Clownfish/Obj.html">Clownfish::Obj</a>.</p> +</div> + + </div> <!-- lucy-main_content_box --> + <div class="clear"></div> + + </div> <!-- lucy-main_content --> + + <div id="lucy-copyright" class="container_16"> + <p>Copyright © 2010-2015 The Apache Software Foundation, Licensed under the + <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0</a>. + <br/> + Apache Lucy, Lucy, Apache, the Apache feather logo, and the Apache Lucy project logo are trademarks of The + Apache Software Foundation. All other marks mentioned may be trademarks or registered trademarks of their + respective owners. 
+ </p> + </div> <!-- lucy-copyright --> + + </div> <!-- lucy-rigid_wrapper --> + + </body> +</html> Added: websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Analysis/Token.html ============================================================================== --- websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Analysis/Token.html (added) +++ websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Analysis/Token.html Wed Sep 28 12:07:48 2016 @@ -0,0 +1,282 @@ +<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> +<html lang="en"> + <head> + <meta http-equiv="Content-Type" content="text/html;charset=UTF-8"> + <title>Lucy::Analysis::Token – C API Documentation</title> + <link rel="stylesheet" type="text/css" media="screen" href="/css/lucy.css"> + </head> + + <body> + + <div id="lucy-rigid_wrapper"> + + <div id="lucy-top" class="container_16 lucy-white_box_3d"> + + <div id="lucy-logo_box" class="grid_8"> + <a href="/"><img src="/images/lucy_logo_150x100.png" alt="Apache Lucy™"></a> + </div> <!-- lucy-logo_box --> + + <div #id="lucy-top_nav_box" class="grid_8"> + <div id="lucy-top_nav_bar" class="container_8"> + <ul> + <li><a href="http://www.apache.org/" title="Apache Software Foundation">Apache Software Foundation</a></li> + <li><a href="http://www.apache.org/licenses/" title="License">License</a></li> + <li><a href="http://www.apache.org/foundation/sponsorship.html" title="Sponsorship">Sponsorship</a></li> + <li><a href="http://www.apache.org/foundation/thanks.html" title="Thanks">Thanks</a></li> + <li><a href="http://www.apache.org/security/ " title="Security">Security</a></li> + </ul> + </div> <!-- lucy-top_nav_bar --> + <p><a href="http://www.apache.org/">Apache</a> » <a href="/">Lucy</a> » <a href="/docs/">Docs</a> » <a href="/docs/0.5.0/">0.5.0</a> » <a href="/docs/0.5.0/c/">C</a> » <a href="/docs/0.5.0/c/Lucy/">Lucy</a> » <a href="/docs/0.5.0/c/Lucy/Analysis/">Analysis</a></p> + <form name="lucy-top_search_box" id="lucy-top_search_box" 
action="http://www.google.com/search" method="get"> + <input value="*.apache.org" name="sitesearch" type="hidden"/> + <input type="text" name="q" id="query" style="width:85%"> + <input type="submit" id="submit" value="Search"> + </form> + </div> <!-- lucy-top_nav_box --> + + <div class="clear"></div> + + </div> <!-- lucy-top --> + + <div id="lucy-main_content" class="container_16 lucy-white_box_3d"> + + <div class="grid_4" id="lucy-left_nav_box"> + <h6>About</h6> + <ul> + <li><a href="/">Welcome</a></li> + <li><a href="/clownfish.html">Clownfish</a></li> + <li><a href="/faq.html">FAQ</a></li> + <li><a href="/people.html">People</a></li> + </ul> + <h6>Resources</h6> + <ul> + <li><a href="/download.html">Download</a></li> + <li><a href="/mailing_lists.html">Mailing Lists</a></li> + <li><a href="/docs/">Documentation</a></li> + <li><a href="http://wiki.apache.org/lucy/">Wiki</a></li> + <li><a href="https://issues.apache.org/jira/browse/LUCY">Issue Tracker</a></li> + <li><a href="/version_control.html">Version Control</a></li> + </ul> + <h6>Related Projects</h6> + <ul> + <li><a href="http://lucene.apache.org/core/">Lucene</a></li> + <li><a href="http://dezi.org/">Dezi</a></li> + <li><a href="http://lucene.apache.org/solr/">Solr</a></li> + <li><a href="http://lucenenet.apache.org/">Lucene.NET</a></li> + <li><a href="http://lucene.apache.org/pylucene/">PyLucene</a></li> + </ul> + </div> <!-- lucy-left_nav_box --> + + <div id="lucy-main_content_box" class="grid_9"> + <div class="c-api"> +<h2>Lucy::Analysis::Token</h2> +<table> +<tr> +<td class="label">parcel</td> +<td><a href="../../lucy.html">Lucy</a></td> +</tr> +<tr> +<td class="label">class variable</td> +<td><code><span class="prefix">LUCY_</span>TOKEN</code></td> +</tr> +<tr> +<td class="label">struct symbol</td> +<td><code><span class="prefix">lucy_</span>Token</code></td> +</tr> +<tr> +<td class="label">class nickname</td> +<td><code><span class="prefix">lucy_</span>Token</code></td> +</tr> +<tr> +<td 
class="label">header file</td> +<td><code>Lucy/Analysis/Token.h</code></td> +</tr> +</table> +<h3>Name</h3> +<p>Lucy::Analysis::Token – Unit of text.</p> +<h3>Description</h3> +<p>Token is the fundamental unit used by Apache Lucy’s Analyzer subclasses. +Each Token has 5 attributes: <code>text</code>, <code>start_offset</code>, +<code>end_offset</code>, <code>boost</code>, and <code>pos_inc</code>.</p> +<p>The <code>text</code> attribute is a Unicode string encoded as UTF-8.</p> +<p><code>start_offset</code> is the start point of the token text, measured in +Unicode code points from the top of the stored field; +<code>end_offset</code> delimits the corresponding closing boundary. +<code>start_offset</code> and <code>end_offset</code> locate the Token +within a larger context, even if the Token’s text attribute gets modified +– by stemming, for instance. The Token for “beating” in the text “beating +a dead horse” begins life with a start_offset of 0 and an end_offset of 7; +after stemming, the text is “beat”, but the start_offset is still 0 and the +end_offset is still 7. This allows “beating” to be highlighted correctly +after a search matches “beat”.</p> +<p><code>boost</code> is a per-token weight. Use this when you want to assign +more or less importance to a particular token, as you might for emboldened +text within an HTML document, for example. (Note: The field this token +belongs to must be spec’d to use a posting of type RichPosting.)</p> +<p><code>pos_inc</code> is the POSition INCrement, measured in Tokens. This +attribute, which defaults to 1, is an advanced tool for manipulating +phrase matching. Ordinarily, Tokens are assigned consecutive position +numbers: 0, 1, and 2 for <code>"three blind mice"</code>. 
However, if you +set the position increment for “blind” to, say, 1000, then the three tokens +will end up assigned to positions 0, 1, and 1001 – and will no longer +produce a phrase match for the query <code>"three blind mice"</code>.</p> +<h3>Functions</h3> +<dl> +<dt id="func_new">new</dt> +<dd> +<pre><code><span class="prefix">lucy_</span>Token* <span class="comment">// incremented</span> +<span class="prefix">lucy_</span><strong>Token_new</strong>( + char *<strong>text</strong>, + size_t <strong>len</strong>, + uint32_t <strong>start_offset</strong>, + uint32_t <strong>end_offset</strong>, + float <strong>boost</strong>, + int32_t <strong>pos_inc</strong> +); +</code></pre> +<p>Create a new Token.</p> +<dl> +<dt>text</dt> +<dd><p>A UTF-8 string.</p> +</dd> +<dt>len</dt> +<dd><p>Size of the string in bytes.</p> +</dd> +<dt>start_offset</dt> +<dd><p>Start offset into the original document in Unicode +code points.</p> +</dd> +<dt>end_offset</dt> +<dd><p>End offset into the original document in Unicode +code points.</p> +</dd> +<dt>boost</dt> +<dd><p>Per-token weight.</p> +</dd> +<dt>pos_inc</dt> +<dd><p>Position increment for phrase matching.</p> +</dd> +</dl> +</dd> +<dt id="func_init">init</dt> +<dd> +<pre><code><span class="prefix">lucy_</span>Token* +<span class="prefix">lucy_</span><strong>Token_init</strong>( + <span class="prefix">lucy_</span>Token *<strong>self</strong>, + char *<strong>text</strong>, + size_t <strong>len</strong>, + uint32_t <strong>start_offset</strong>, + uint32_t <strong>end_offset</strong>, + float <strong>boost</strong>, + int32_t <strong>pos_inc</strong> +); +</code></pre> +<p>Initialize a Token.</p> +<dl> +<dt>text</dt> +<dd><p>A UTF-8 string.</p> +</dd> +<dt>len</dt> +<dd><p>Size of the string in bytes.</p> +</dd> +<dt>start_offset</dt> +<dd><p>Start offset into the original document in Unicode +code points.</p> +</dd> +<dt>end_offset</dt> +<dd><p>End offset into the original document in Unicode +code points.</p> +</dd> 
+<dt>boost</dt> +<dd><p>Per-token weight.</p> +</dd> +<dt>pos_inc</dt> +<dd><p>Position increment for phrase matching.</p> +</dd> +</dl> +</dd> +</dl> +<h3>Methods</h3> +<dl> +<dt id="func_Get_Start_Offset">Get_Start_Offset</dt> +<dd> +<pre><code>uint32_t +<span class="prefix">lucy_</span><strong>Token_Get_Start_Offset</strong>( + <span class="prefix">lucy_</span>Token *<strong>self</strong> +); +</code></pre> +</dd> +<dt id="func_Get_End_Offset">Get_End_Offset</dt> +<dd> +<pre><code>uint32_t +<span class="prefix">lucy_</span><strong>Token_Get_End_Offset</strong>( + <span class="prefix">lucy_</span>Token *<strong>self</strong> +); +</code></pre> +</dd> +<dt id="func_Get_Boost">Get_Boost</dt> +<dd> +<pre><code>float +<span class="prefix">lucy_</span><strong>Token_Get_Boost</strong>( + <span class="prefix">lucy_</span>Token *<strong>self</strong> +); +</code></pre> +</dd> +<dt id="func_Get_Pos_Inc">Get_Pos_Inc</dt> +<dd> +<pre><code>int32_t +<span class="prefix">lucy_</span><strong>Token_Get_Pos_Inc</strong>( + <span class="prefix">lucy_</span>Token *<strong>self</strong> +); +</code></pre> +</dd> +<dt id="func_Get_Text">Get_Text</dt> +<dd> +<pre><code>char* +<span class="prefix">lucy_</span><strong>Token_Get_Text</strong>( + <span class="prefix">lucy_</span>Token *<strong>self</strong> +); +</code></pre> +</dd> +<dt id="func_Get_Len">Get_Len</dt> +<dd> +<pre><code>size_t +<span class="prefix">lucy_</span><strong>Token_Get_Len</strong>( + <span class="prefix">lucy_</span>Token *<strong>self</strong> +); +</code></pre> +</dd> +<dt id="func_Set_Text">Set_Text</dt> +<dd> +<pre><code>void +<span class="prefix">lucy_</span><strong>Token_Set_Text</strong>( + <span class="prefix">lucy_</span>Token *<strong>self</strong>, + char *<strong>text</strong>, + size_t <strong>len</strong> +); +</code></pre> +</dd> +</dl> +<h3>Inheritance</h3> +<p>Lucy::Analysis::Token is a <a href="../../Clownfish/Obj.html">Clownfish::Obj</a>.</p> +</div> + + </div> <!-- lucy-main_content_box --> + 
<div class="clear"></div> + + </div> <!-- lucy-main_content --> + + <div id="lucy-copyright" class="container_16"> + <p>Copyright © 2010-2015 The Apache Software Foundation, Licensed under the + <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0</a>. + <br/> + Apache Lucy, Lucy, Apache, the Apache feather logo, and the Apache Lucy project logo are trademarks of The + Apache Software Foundation. All other marks mentioned may be trademarks or registered trademarks of their + respective owners. + </p> + </div> <!-- lucy-copyright --> + + </div> <!-- lucy-rigid_wrapper --> + + </body> +</html> Added: websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Docs/Cookbook.html ============================================================================== --- websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Docs/Cookbook.html (added) +++ websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Docs/Cookbook.html Wed Sep 28 12:07:48 2016 @@ -0,0 +1,120 @@ +<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> +<html lang="en"> + <head> + <meta http-equiv="Content-Type" content="text/html;charset=UTF-8"> + <title>Lucy::Docs::Cookbook</title> + <link rel="stylesheet" type="text/css" media="screen" href="/css/lucy.css"> + </head> + + <body> + + <div id="lucy-rigid_wrapper"> + + <div id="lucy-top" class="container_16 lucy-white_box_3d"> + + <div id="lucy-logo_box" class="grid_8"> + <a href="/"><img src="/images/lucy_logo_150x100.png" alt="Apache Lucy™"></a> + </div> <!-- lucy-logo_box --> + + <div #id="lucy-top_nav_box" class="grid_8"> + <div id="lucy-top_nav_bar" class="container_8"> + <ul> + <li><a href="http://www.apache.org/" title="Apache Software Foundation">Apache Software Foundation</a></li> + <li><a href="http://www.apache.org/licenses/" title="License">License</a></li> + <li><a href="http://www.apache.org/foundation/sponsorship.html" title="Sponsorship">Sponsorship</a></li> + <li><a 
href="http://www.apache.org/foundation/thanks.html" title="Thanks">Thanks</a></li> + <li><a href="http://www.apache.org/security/ " title="Security">Security</a></li> + </ul> + </div> <!-- lucy-top_nav_bar --> + <p><a href="http://www.apache.org/">Apache</a> » <a href="/">Lucy</a> » <a href="/docs/">Docs</a> » <a href="/docs/0.5.0/">0.5.0</a> » <a href="/docs/0.5.0/c/">C</a> » <a href="/docs/0.5.0/c/Lucy/">Lucy</a> » <a href="/docs/0.5.0/c/Lucy/Docs/">Docs</a></p> + <form name="lucy-top_search_box" id="lucy-top_search_box" action="http://www.google.com/search" method="get"> + <input value="*.apache.org" name="sitesearch" type="hidden"/> + <input type="text" name="q" id="query" style="width:85%"> + <input type="submit" id="submit" value="Search"> + </form> + </div> <!-- lucy-top_nav_box --> + + <div class="clear"></div> + + </div> <!-- lucy-top --> + + <div id="lucy-main_content" class="container_16 lucy-white_box_3d"> + + <div class="grid_4" id="lucy-left_nav_box"> + <h6>About</h6> + <ul> + <li><a href="/">Welcome</a></li> + <li><a href="/clownfish.html">Clownfish</a></li> + <li><a href="/faq.html">FAQ</a></li> + <li><a href="/people.html">People</a></li> + </ul> + <h6>Resources</h6> + <ul> + <li><a href="/download.html">Download</a></li> + <li><a href="/mailing_lists.html">Mailing Lists</a></li> + <li><a href="/docs/">Documentation</a></li> + <li><a href="http://wiki.apache.org/lucy/">Wiki</a></li> + <li><a href="https://issues.apache.org/jira/browse/LUCY">Issue Tracker</a></li> + <li><a href="/version_control.html">Version Control</a></li> + </ul> + <h6>Related Projects</h6> + <ul> + <li><a href="http://lucene.apache.org/core/">Lucene</a></li> + <li><a href="http://dezi.org/">Dezi</a></li> + <li><a href="http://lucene.apache.org/solr/">Solr</a></li> + <li><a href="http://lucenenet.apache.org/">Lucene.NET</a></li> + <li><a href="http://lucene.apache.org/pylucene/">PyLucene</a></li> + </ul> + </div> <!-- lucy-left_nav_box --> + + <div id="lucy-main_content_box" 
class="grid_9"> + <div class="c-api"> +<h2>Apache Lucy recipes</h2> +<p>The Cookbook provides thematic documentation covering some of Apache Lucy’s +more sophisticated features. For a step-by-step introduction to Lucy, +see <a href="../../Lucy/Docs/Tutorial.html">Tutorial</a>.</p> +<h3>Chapters</h3> +<ul> +<li> +<p><a href="../../Lucy/Docs/Cookbook/FastUpdates.html">FastUpdates</a> - While index updates are fast on +average, worst-case update performance may be significantly slower. To make +index updates consistently quick, we must manually intervene to control the +process of index segment consolidation.</p> +</li> +<li> +<p><a href="../../Lucy/Docs/Cookbook/CustomQuery.html">CustomQuery</a> - Explore Lucy’s support for +custom query types by creating a “PrefixQuery” class to handle trailing +wildcards.</p> +</li> +<li> +<p><a href="../../Lucy/Docs/Cookbook/CustomQueryParser.html">CustomQueryParser</a> - Define your own custom +search query syntax using <a href="../../Lucy/Search/QueryParser.html">QueryParser</a> and +Parse::RecDescent.</p> +</li> +</ul> +<h3>Materials</h3> +<p>Some of the recipes in the Cookbook reference the completed +<a href="../../Lucy/Docs/Tutorial.html">Tutorial</a> application. These materials can be +found in the <code>sample</code> directory at the root of the Lucy distribution:</p> +<pre><code>Code example for C is missing</code></pre> +</div> + + </div> <!-- lucy-main_content_box --> + <div class="clear"></div> + + </div> <!-- lucy-main_content --> + + <div id="lucy-copyright" class="container_16"> + <p>Copyright © 2010-2015 The Apache Software Foundation, Licensed under the + <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0</a>. + <br/> + Apache Lucy, Lucy, Apache, the Apache feather logo, and the Apache Lucy project logo are trademarks of The + Apache Software Foundation. All other marks mentioned may be trademarks or registered trademarks of their + respective owners. 
+ </p> + </div> <!-- lucy-copyright --> + + </div> <!-- lucy-rigid_wrapper --> + + </body> +</html> Added: websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Docs/Cookbook/CustomQuery.html ============================================================================== --- websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Docs/Cookbook/CustomQuery.html (added) +++ websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Docs/Cookbook/CustomQuery.html Wed Sep 28 12:07:48 2016 @@ -0,0 +1,190 @@ +<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> +<html lang="en"> + <head> + <meta http-equiv="Content-Type" content="text/html;charset=UTF-8"> + <title>Lucy::Docs::Cookbook::CustomQuery</title> + <link rel="stylesheet" type="text/css" media="screen" href="/css/lucy.css"> + </head> + + <body> + + <div id="lucy-rigid_wrapper"> + + <div id="lucy-top" class="container_16 lucy-white_box_3d"> + + <div id="lucy-logo_box" class="grid_8"> + <a href="/"><img src="/images/lucy_logo_150x100.png" alt="Apache Lucy™"></a> + </div> <!-- lucy-logo_box --> + + <div #id="lucy-top_nav_box" class="grid_8"> + <div id="lucy-top_nav_bar" class="container_8"> + <ul> + <li><a href="http://www.apache.org/" title="Apache Software Foundation">Apache Software Foundation</a></li> + <li><a href="http://www.apache.org/licenses/" title="License">License</a></li> + <li><a href="http://www.apache.org/foundation/sponsorship.html" title="Sponsorship">Sponsorship</a></li> + <li><a href="http://www.apache.org/foundation/thanks.html" title="Thanks">Thanks</a></li> + <li><a href="http://www.apache.org/security/ " title="Security">Security</a></li> + </ul> + </div> <!-- lucy-top_nav_bar --> + <p><a href="http://www.apache.org/">Apache</a> » <a href="/">Lucy</a> » <a href="/docs/">Docs</a> » <a href="/docs/0.5.0/">0.5.0</a> » <a href="/docs/0.5.0/c/">C</a> » <a href="/docs/0.5.0/c/Lucy/">Lucy</a> » <a href="/docs/0.5.0/c/Lucy/Docs/">Docs</a> » <a 
href="/docs/0.5.0/c/Lucy/Docs/Cookbook/">Cookbook</a></p> + <form name="lucy-top_search_box" id="lucy-top_search_box" action="http://www.google.com/search" method="get"> + <input value="*.apache.org" name="sitesearch" type="hidden"/> + <input type="text" name="q" id="query" style="width:85%"> + <input type="submit" id="submit" value="Search"> + </form> + </div> <!-- lucy-top_nav_box --> + + <div class="clear"></div> + + </div> <!-- lucy-top --> + + <div id="lucy-main_content" class="container_16 lucy-white_box_3d"> + + <div class="grid_4" id="lucy-left_nav_box"> + <h6>About</h6> + <ul> + <li><a href="/">Welcome</a></li> + <li><a href="/clownfish.html">Clownfish</a></li> + <li><a href="/faq.html">FAQ</a></li> + <li><a href="/people.html">People</a></li> + </ul> + <h6>Resources</h6> + <ul> + <li><a href="/download.html">Download</a></li> + <li><a href="/mailing_lists.html">Mailing Lists</a></li> + <li><a href="/docs/">Documentation</a></li> + <li><a href="http://wiki.apache.org/lucy/">Wiki</a></li> + <li><a href="https://issues.apache.org/jira/browse/LUCY">Issue Tracker</a></li> + <li><a href="/version_control.html">Version Control</a></li> + </ul> + <h6>Related Projects</h6> + <ul> + <li><a href="http://lucene.apache.org/core/">Lucene</a></li> + <li><a href="http://dezi.org/">Dezi</a></li> + <li><a href="http://lucene.apache.org/solr/">Solr</a></li> + <li><a href="http://lucenenet.apache.org/">Lucene.NET</a></li> + <li><a href="http://lucene.apache.org/pylucene/">PyLucene</a></li> + </ul> + </div> <!-- lucy-left_nav_box --> + + <div id="lucy-main_content_box" class="grid_9"> + <div class="c-api"> +<h2>Sample subclass of Query</h2> +<p>Explore Apache Lucy’s support for custom query types by creating a +“PrefixQuery” class to handle trailing wildcards.</p> +<pre><code>Code example for C is missing</code></pre> +<h3>Query, Compiler, and Matcher</h3> +<p>To add support for a new query type, we need three classes: a Query, a +Compiler, and a Matcher.</p> +<ul> +<li> 
+<p>PrefixQuery - a subclass of <a href="../../../Lucy/Search/Query.html">Query</a>, and the only class +that client code will deal with directly.</p> +</li> +<li> +<p>PrefixCompiler - a subclass of <a href="../../../Lucy/Search/Compiler.html">Compiler</a>, whose primary +role is to compile a PrefixQuery to a PrefixMatcher.</p> +</li> +<li> +<p>PrefixMatcher - a subclass of <a href="../../../Lucy/Search/Matcher.html">Matcher</a>, which does the +heavy lifting: it applies the query to individual documents and assigns a +score to each match.</p> +</li> +</ul> +<p>The PrefixQuery class on its own isn’t enough because a Query object’s role is +limited to expressing an abstract specification for the search. A Query is +basically nothing but metadata; execution is left to the Query’s companion +Compiler and Matcher.</p> +<p>Here’s a simplified sketch illustrating how a Searcher’s hits() method ties +together the three classes.</p> +<pre><code>Code example for C is missing</code></pre> +<h4>PrefixQuery</h4> +<p>Our PrefixQuery class will have two attributes: a query string and a field +name.</p> +<pre><code>Code example for C is missing</code></pre> +<p>PrefixQuery’s constructor collects and validates the attributes.</p> +<pre><code>Code example for C is missing</code></pre> +<p>Since this is an inside-out class, we’ll need a destructor:</p> +<pre><code>Code example for C is missing</code></pre> +<p>The equals() method determines whether two Queries are logically equivalent:</p> +<pre><code>Code example for C is missing</code></pre> +<p>The last thing we’ll need is a make_compiler() factory method which kicks out +a subclass of <a href="../../../Lucy/Search/Compiler.html">Compiler</a>.</p> +<pre><code>Code example for C is missing</code></pre> +<h4>PrefixCompiler</h4> +<p>PrefixQuery’s make_compiler() method will be called internally at search-time +by objects which subclass <a href="../../../Lucy/Search/Searcher.html">Searcher</a>, such as +<a 
href="../../../Lucy/Search/IndexSearcher.html">IndexSearchers</a>.</p> +<p>A Searcher is associated with a particular collection of documents. These +documents may all reside in one index, as with IndexSearcher, or they may be +spread out across multiple indexes on one or more machines, as with +LucyX::Remote::ClusterSearcher.</p> +<p>Searcher objects have access to certain statistical information about the +collections they represent; for instance, a Searcher can tell you how many +documents are in the collection…</p> +<pre><code>Code example for C is missing</code></pre> +<p>… or how many documents a specific term appears in:</p> +<pre><code>Code example for C is missing</code></pre> +<p>Such information can be used by sophisticated Compiler implementations to +assign more or less heft to individual queries or sub-queries. However, we’re +not going to bother with weighting for this demo; we’ll just assign a fixed +score of 1.0 to each matching document.</p> +<p>We don’t need to write a constructor, as it will suffice to inherit new() from +Lucy::Search::Compiler. The only method we need to implement for +PrefixCompiler is make_matcher().</p> +<pre><code>Code example for C is missing</code></pre> +<p>PrefixCompiler gets access to a <a href="../../../Lucy/Index/SegReader.html">SegReader</a> +object when make_matcher() gets called. 
From the SegReader and its +sub-components <a href="../../../Lucy/Index/LexiconReader.html">LexiconReader</a> and +<a href="../../../Lucy/Index/PostingListReader.html">PostingListReader</a>, we acquire a +<a href="../../../Lucy/Index/Lexicon.html">Lexicon</a>, scan through the Lexicon’s unique +terms, and acquire a <a href="../../../Lucy/Index/PostingList.html">PostingList</a> for each +term that matches our prefix.</p> +<p>Each of these PostingList objects represents a set of documents which match +the query.</p> +<h4>PrefixMatcher</h4> +<p>The Matcher subclass is the most involved.</p> +<pre><code>Code example for C is missing</code></pre> +<p>The doc ids must be in order, or some will be ignored; hence the <code>sort</code> +above.</p> +<p>In addition to the constructor and destructor, there are three methods that +must be overridden.</p> +<p>next() advances the Matcher to the next valid matching doc.</p> +<pre><code>Code example for C is missing</code></pre> +<p>get_doc_id() returns the current document id, or 0 if the Matcher is +exhausted. (<a href="../../../Lucy/Docs/DocIDs.html">Document numbers</a> start at 1, so 0 is +a sentinel.)</p> +<pre><code>Code example for C is missing</code></pre> +<p>score() conveys the relevance score of the current match. 
We’ll just return a +fixed score of 1.0:</p> +<pre><code>Code example for C is missing</code></pre> +<h3>Usage</h3> +<p>To get a basic feel for PrefixQuery, insert the FlatQueryParser module +described in <a href="../../../Lucy/Docs/Cookbook/CustomQueryParser.html">CustomQueryParser</a> (which supports +PrefixQuery) into the search.cgi sample app.</p> +<pre><code>Code example for C is missing</code></pre> +<p>If you’re planning on using PrefixQuery in earnest, though, you may want to +change up analyzers to avoid stemming, because stemming (another approach to +prefix conflation) is not perfectly compatible with prefix searches.</p> +<pre><code>Code example for C is missing</code></pre> +</div> + + </div> <!-- lucy-main_content_box --> + <div class="clear"></div> + + </div> <!-- lucy-main_content --> + + <div id="lucy-copyright" class="container_16"> + <p>Copyright © 2010-2015 The Apache Software Foundation, Licensed under the + <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0</a>. + <br/> + Apache Lucy, Lucy, Apache, the Apache feather logo, and the Apache Lucy project logo are trademarks of The + Apache Software Foundation. All other marks mentioned may be trademarks or registered trademarks of their + respective owners. 
+ </p> + </div> <!-- lucy-copyright --> + + </div> <!-- lucy-rigid_wrapper --> + + </body> +</html> Added: websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Docs/Cookbook/CustomQueryParser.html ============================================================================== --- websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Docs/Cookbook/CustomQueryParser.html (added) +++ websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Docs/Cookbook/CustomQueryParser.html Wed Sep 28 12:07:48 2016 @@ -0,0 +1,165 @@ +<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> +<html lang="en"> + <head> + <meta http-equiv="Content-Type" content="text/html;charset=UTF-8"> + <title>Lucy::Docs::Cookbook::CustomQueryParser</title> + <link rel="stylesheet" type="text/css" media="screen" href="/css/lucy.css"> + </head> + + <body> + + <div id="lucy-rigid_wrapper"> + + <div id="lucy-top" class="container_16 lucy-white_box_3d"> + + <div id="lucy-logo_box" class="grid_8"> + <a href="/"><img src="/images/lucy_logo_150x100.png" alt="Apache Lucy™"></a> + </div> <!-- lucy-logo_box --> + + <div #id="lucy-top_nav_box" class="grid_8"> + <div id="lucy-top_nav_bar" class="container_8"> + <ul> + <li><a href="http://www.apache.org/" title="Apache Software Foundation">Apache Software Foundation</a></li> + <li><a href="http://www.apache.org/licenses/" title="License">License</a></li> + <li><a href="http://www.apache.org/foundation/sponsorship.html" title="Sponsorship">Sponsorship</a></li> + <li><a href="http://www.apache.org/foundation/thanks.html" title="Thanks">Thanks</a></li> + <li><a href="http://www.apache.org/security/ " title="Security">Security</a></li> + </ul> + </div> <!-- lucy-top_nav_bar --> + <p><a href="http://www.apache.org/">Apache</a> » <a href="/">Lucy</a> » <a href="/docs/">Docs</a> » <a href="/docs/0.5.0/">0.5.0</a> » <a href="/docs/0.5.0/c/">C</a> » <a href="/docs/0.5.0/c/Lucy/">Lucy</a> » <a href="/docs/0.5.0/c/Lucy/Docs/">Docs</a> » <a 
href="/docs/0.5.0/c/Lucy/Docs/Cookbook/">Cookbook</a></p> + <form name="lucy-top_search_box" id="lucy-top_search_box" action="http://www.google.com/search" method="get"> + <input value="*.apache.org" name="sitesearch" type="hidden"/> + <input type="text" name="q" id="query" style="width:85%"> + <input type="submit" id="submit" value="Search"> + </form> + </div> <!-- lucy-top_nav_box --> + + <div class="clear"></div> + + </div> <!-- lucy-top --> + + <div id="lucy-main_content" class="container_16 lucy-white_box_3d"> + + <div class="grid_4" id="lucy-left_nav_box"> + <h6>About</h6> + <ul> + <li><a href="/">Welcome</a></li> + <li><a href="/clownfish.html">Clownfish</a></li> + <li><a href="/faq.html">FAQ</a></li> + <li><a href="/people.html">People</a></li> + </ul> + <h6>Resources</h6> + <ul> + <li><a href="/download.html">Download</a></li> + <li><a href="/mailing_lists.html">Mailing Lists</a></li> + <li><a href="/docs/">Documentation</a></li> + <li><a href="http://wiki.apache.org/lucy/">Wiki</a></li> + <li><a href="https://issues.apache.org/jira/browse/LUCY">Issue Tracker</a></li> + <li><a href="/version_control.html">Version Control</a></li> + </ul> + <h6>Related Projects</h6> + <ul> + <li><a href="http://lucene.apache.org/core/">Lucene</a></li> + <li><a href="http://dezi.org/">Dezi</a></li> + <li><a href="http://lucene.apache.org/solr/">Solr</a></li> + <li><a href="http://lucenenet.apache.org/">Lucene.NET</a></li> + <li><a href="http://lucene.apache.org/pylucene/">PyLucene</a></li> + </ul> + </div> <!-- lucy-left_nav_box --> + + <div id="lucy-main_content_box" class="grid_9"> + <div class="c-api"> +<h2>Sample subclass of QueryParser.</h2> +<p>Implement a custom search query language using a subclass of +<a href="../../../Lucy/Search/QueryParser.html">QueryParser</a>.</p> +<h3>The language</h3> +<p>At first, our query language will support only simple term queries and phrases +delimited by double quotes. 
For simplicity’s sake, it will not support +parenthetical groupings, boolean operators, or prepended plus/minus. The +results for all subqueries will be unioned together (i.e. joined using an OR), +which is usually the best approach for small-to-medium-sized document +collections.</p> +<p>Later, we’ll add support for trailing wildcards.</p> +<h3>Single-field parser</h3> +<p>Our initial parser implementation will generate queries against a single fixed +field, “content”, and it will analyze text using a fixed choice of English +EasyAnalyzer. We won’t subclass Lucy::Search::QueryParser just yet.</p> +<pre><code>Code example for C is missing</code></pre> +<p>Some private helper subs for creating TermQuery and PhraseQuery objects will +help keep the size of our main parse() subroutine down:</p> +<pre><code>Code example for C is missing</code></pre> +<p>Our private _tokenize() method treats double-quote delimited material as a +single token and splits on whitespace everywhere else.</p> +<pre><code>Code example for C is missing</code></pre> +<p>The main parsing routine creates an array of tokens by calling _tokenize(), +runs the tokens through the EasyAnalyzer, creates TermQuery or +PhraseQuery objects according to how many tokens emerge from the +EasyAnalyzer’s split() method, and adds each of the sub-queries to the primary +ORQuery.</p> +<pre><code>Code example for C is missing</code></pre> +<h3>Multi-field parser</h3> +<p>Most often, the end user will want their search query to match not only a +single “content” field, but also “title” and so on. 
To make that happen, we +have to turn queries such as this…</p> +<pre><code>foo AND NOT bar +</code></pre> +<p>… into the logical equivalent of this:</p> +<pre><code>(title:foo OR content:foo) AND NOT (title:bar OR content:bar) +</code></pre> +<p>Rather than continue with our own from-scratch parser class and write the +routines to accomplish that expansion, we’re now going to subclass Lucy::Search::QueryParser +and take advantage of some of its existing methods.</p> +<p>Our first parser implementation had the “content” field name and the choice of +English EasyAnalyzer hard-coded for simplicity, but we don’t need to do that +once we subclass Lucy::Search::QueryParser. QueryParser’s constructor +(which we will inherit, allowing us to eliminate our own constructor) +requires a Schema which conveys field +and Analyzer information, so we can just defer to that.</p> +<pre><code>Code example for C is missing</code></pre> +<p>We’re also going to jettison our _make_term_query() and _make_phrase_query() +helper subs and chop our parse() subroutine way down. Our revised parse() +routine will generate Lucy::Search::LeafQuery objects instead of TermQueries +and PhraseQueries:</p> +<pre><code>Code example for C is missing</code></pre> +<p>The magic happens in QueryParser’s expand() method, which walks the ORQuery +object we supply to it looking for LeafQuery objects, and calls expand_leaf() +for each one it finds. expand_leaf() performs field-specific analysis, +decides whether each query should be a TermQuery or a PhraseQuery, and if +multiple fields are required, creates an ORQuery which multiplies out e.g. 
<code>foo</code> +into <code>(title:foo OR content:foo)</code>.</p> +<h3>Extending the query language</h3> +<p>To add support for trailing wildcards to our query language, we need to +override expand_leaf() to accommodate PrefixQuery, while deferring to the +parent class implementation on TermQuery and PhraseQuery.</p> +<pre><code>Code example for C is missing</code></pre> +<p>Ordinarily, those asterisks would have been stripped when running tokens +through the EasyAnalyzer: query strings containing “foo*” would produce +TermQueries for the term “foo”. Our override intercepts tokens with trailing +asterisks and processes them as PrefixQueries before <code>SUPER::expand_leaf</code> can +discard them, so that a search for “foo*” can match “food”, “foosball”, and so +on.</p> +<h3>Usage</h3> +<p>Insert our custom parser into the search.cgi sample app to get a feel for how +it behaves:</p> +<pre><code>Code example for C is missing</code></pre> +</div> + + </div> <!-- lucy-main_content_box --> + <div class="clear"></div> + + </div> <!-- lucy-main_content --> + + <div id="lucy-copyright" class="container_16"> + <p>Copyright © 2010-2015 The Apache Software Foundation, Licensed under the + <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0</a>. + <br/> + Apache Lucy, Lucy, Apache, the Apache feather logo, and the Apache Lucy project logo are trademarks of The + Apache Software Foundation. All other marks mentioned may be trademarks or registered trademarks of their + respective owners. 
+ </p> + </div> <!-- lucy-copyright --> + + </div> <!-- lucy-rigid_wrapper --> + + </body> +</html> Added: websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Docs/Cookbook/FastUpdates.html ============================================================================== --- websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Docs/Cookbook/FastUpdates.html (added) +++ websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Docs/Cookbook/FastUpdates.html Wed Sep 28 12:07:48 2016 @@ -0,0 +1,251 @@ +<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> +<html lang="en"> + <head> + <meta http-equiv="Content-Type" content="text/html;charset=UTF-8"> + <title>Lucy::Docs::Cookbook::FastUpdates</title> + <link rel="stylesheet" type="text/css" media="screen" href="/css/lucy.css"> + </head> + + <body> + + <div id="lucy-rigid_wrapper"> + + <div id="lucy-top" class="container_16 lucy-white_box_3d"> + + <div id="lucy-logo_box" class="grid_8"> + <a href="/"><img src="/images/lucy_logo_150x100.png" alt="Apache Lucy™"></a> + </div> <!-- lucy-logo_box --> + + <div #id="lucy-top_nav_box" class="grid_8"> + <div id="lucy-top_nav_bar" class="container_8"> + <ul> + <li><a href="http://www.apache.org/" title="Apache Software Foundation">Apache Software Foundation</a></li> + <li><a href="http://www.apache.org/licenses/" title="License">License</a></li> + <li><a href="http://www.apache.org/foundation/sponsorship.html" title="Sponsorship">Sponsorship</a></li> + <li><a href="http://www.apache.org/foundation/thanks.html" title="Thanks">Thanks</a></li> + <li><a href="http://www.apache.org/security/ " title="Security">Security</a></li> + </ul> + </div> <!-- lucy-top_nav_bar --> + <p><a href="http://www.apache.org/">Apache</a> » <a href="/">Lucy</a> » <a href="/docs/">Docs</a> » <a href="/docs/0.5.0/">0.5.0</a> » <a href="/docs/0.5.0/c/">C</a> » <a href="/docs/0.5.0/c/Lucy/">Lucy</a> » <a href="/docs/0.5.0/c/Lucy/Docs/">Docs</a> » <a 
href="/docs/0.5.0/c/Lucy/Docs/Cookbook/">Cookbook</a></p> + <form name="lucy-top_search_box" id="lucy-top_search_box" action="http://www.google.com/search" method="get"> + <input value="*.apache.org" name="sitesearch" type="hidden"/> + <input type="text" name="q" id="query" style="width:85%"> + <input type="submit" id="submit" value="Search"> + </form> + </div> <!-- lucy-top_nav_box --> + + <div class="clear"></div> + + </div> <!-- lucy-top --> + + <div id="lucy-main_content" class="container_16 lucy-white_box_3d"> + + <div class="grid_4" id="lucy-left_nav_box"> + <h6>About</h6> + <ul> + <li><a href="/">Welcome</a></li> + <li><a href="/clownfish.html">Clownfish</a></li> + <li><a href="/faq.html">FAQ</a></li> + <li><a href="/people.html">People</a></li> + </ul> + <h6>Resources</h6> + <ul> + <li><a href="/download.html">Download</a></li> + <li><a href="/mailing_lists.html">Mailing Lists</a></li> + <li><a href="/docs/">Documentation</a></li> + <li><a href="http://wiki.apache.org/lucy/">Wiki</a></li> + <li><a href="https://issues.apache.org/jira/browse/LUCY">Issue Tracker</a></li> + <li><a href="/version_control.html">Version Control</a></li> + </ul> + <h6>Related Projects</h6> + <ul> + <li><a href="http://lucene.apache.org/core/">Lucene</a></li> + <li><a href="http://dezi.org/">Dezi</a></li> + <li><a href="http://lucene.apache.org/solr/">Solr</a></li> + <li><a href="http://lucenenet.apache.org/">Lucene.NET</a></li> + <li><a href="http://lucene.apache.org/pylucene/">PyLucene</a></li> + </ul> + </div> <!-- lucy-left_nav_box --> + + <div id="lucy-main_content_box" class="grid_9"> + <div class="c-api"> +<h2>Near real-time index updates</h2> +<p>While index updates are fast on average, worst-case update performance may be +significantly slower. To make index updates consistently quick, we must +manually intervene to control the process of index segment consolidation.</p> +<h3>The problem</h3> +<p>Ordinarily, modifying an index is cheap. 
New data is added to new segments, +and the time to write a new segment scales more or less linearly with the +number of documents added during the indexing session.</p> +<p>Deletions are also cheap most of the time, because we don’t remove documents +immediately but instead mark them as deleted, and adding the deletion mark is +cheap.</p> +<p>However, as new segments are added and the deletion rate for existing segments +increases, search-time performance slowly begins to degrade. At some point, +it becomes necessary to consolidate existing segments, rewriting their data +into a new segment.</p> +<p>If the recycled segments are small, the time it takes to rewrite them may not +be significant. Every once in a while, though, a large amount of data must be +rewritten.</p> +<h3>Procrastinating and playing catch-up</h3> +<p>The simplest way to force fast index updates is to avoid rewriting anything.</p> +<p>Indexer relies upon <a href="../../../Lucy/Index/IndexManager.html">IndexManager</a>’s +<a href="../../../Lucy/Index/IndexManager.html#func_Recycle">Recycle()</a> method to tell it which segments should +be consolidated. If we subclass IndexManager and override the method so that +it always returns an empty array, we get consistently quick performance:</p> +<pre><code class="language-c">Vector* +NoMergeManager_Recycle_IMP(IndexManager *self, PolyReader *reader, + DeletionsWriter *del_writer, int64_t cutoff, + bool optimize) { + return Vec_new(0); +} + +void +do_index(Obj *index) { + Class *klass = Class_singleton("NoMergeManager", INDEXMANAGER); + Class_Override(klass, (cfish_method_t)NoMergeManager_Recycle_IMP, + LUCY_IndexManager_Recycle_OFFSET); + + IndexManager *manager = (IndexManager*)Class_Make_Obj(klass); + IxManager_init(manager, NULL, NULL); + + Indexer *indexer = Indexer_new(NULL, index, manager, 0); + ... + Indexer_Commit(indexer); + + DECREF(indexer); + DECREF(manager); +} +</code></pre> +<p>However, we can’t procrastinate forever. 
Eventually, we’ll have to run an +ordinary, uncontrolled indexing session, potentially triggering a large +rewrite of lots of small and/or degraded segments:</p> +<pre><code class="language-c">void +do_index(Obj *index) { + Indexer *indexer = Indexer_new(NULL, index, NULL /* manager */, 0); + ... + Indexer_Commit(indexer); + DECREF(indexer); +} +</code></pre> +<h3>Acceptable worst-case update time, slower degradation</h3> +<p>Never merging anything at all in the main indexing process is probably +overkill. Small segments are relatively cheap to merge; we just need to guard +against the big rewrites.</p> +<p>Setting a ceiling on the number of documents in the segments to be recycled +allows us to avoid a mass proliferation of tiny, single-document segments, +while still offering decent worst-case update speed:</p> +<pre><code class="language-c">Vector* +LightMergeManager_Recycle_IMP(IndexManager *self, PolyReader *reader, + DeletionsWriter *del_writer, int64_t cutoff, + bool optimize) { + IndexManager_Recycle_t super_recycle + = SUPER_METHOD_PTR(IndexManager, LUCY_IndexManager_Recycle); + Vector *seg_readers = super_recycle(self, reader, del_writer, cutoff, + optimize); + Vector *small_segments = Vec_new(0); + + for (size_t i = 0, max = Vec_Get_Size(seg_readers); i < max; i++) { + SegReader *seg_reader = (SegReader*)Vec_Fetch(seg_readers, i); + + if (SegReader_Doc_Max(seg_reader) < 10) { + Vec_Push(small_segments, INCREF(seg_reader)); + } + } + + DECREF(seg_readers); + return small_segments; +} +</code></pre> +<p>However, we still have to consolidate every once in a while, and while that +happens content updates will be locked out.</p> +<h3>Background merging</h3> +<p>If it’s not acceptable to lock out updates while the index consolidation +process runs, the alternative is to move the consolidation process out of +band, using <a href="../../../Lucy/Index/BackgroundMerger.html">BackgroundMerger</a>.</p> +<p>It’s never safe to have more than one Indexer attempting to 
modify the content +of an index at the same time, but a BackgroundMerger and an Indexer can +operate simultaneously:</p> +<pre><code class="language-c">typedef struct { + Obj *index; + Doc *doc; +} Context; + +static void +S_index_doc(void *arg) { + Context *ctx = (Context*)arg; + + Class *klass = Class_singleton("LightMergeManager", INDEXMANAGER); + Class_Override(klass, (cfish_method_t)LightMergeManager_Recycle_IMP, + LUCY_IndexManager_Recycle_OFFSET); + + IndexManager *manager = (IndexManager*)Class_Make_Obj(klass); + IxManager_init(manager, NULL, NULL); + + Indexer *indexer = Indexer_new(NULL, ctx->index, manager, 0); + Indexer_Add_Doc(indexer, ctx->doc, 1.0); + Indexer_Commit(indexer); + + DECREF(indexer); + DECREF(manager); +} + +void indexing_process(Obj *index, Doc *doc) { + Context ctx; + ctx.index = index; + ctx.doc = doc; + + for (int i = 0; i < max_retries; i++) { + Err *err = Err_trap(S_index_doc, &ctx); + if (!err) { break; } + if (!Err_is_a(err, LOCKERR)) { + RETHROW(err); + } + WARN("Couldn't get lock (%d retries)", i); + DECREF(err); + } +} + +void +background_merge_process(Obj *index) { + IndexManager *manager = IxManager_new(NULL, NULL); + IxManager_Set_Write_Lock_Timeout(manager, 60000); + + BackgroundMerger *bg_merger = BGMerger_new(index, manager); + BGMerger_Commit(bg_merger); + + DECREF(bg_merger); + DECREF(manager); +} +</code></pre> +<p>The exception handling code becomes useful once you have more than one index +modification process happening simultaneously. By default, Indexer tries +several times to acquire a write lock over the span of one second, then holds +it until <a href="../../../Lucy/Index/Indexer.html#func_Commit">Commit()</a> completes. BackgroundMerger handles +most of its work +without the write lock, but it does need it briefly once at the beginning and +once again near the end.
Under normal loads, the internal retry logic will +resolve conflicts, but if it’s not acceptable to miss an insert, you probably +want to catch <a href="../../../Lucy/Store/LockErr.html">LockErr</a> exceptions thrown by Indexer. In +contrast, a LockErr from BackgroundMerger probably just needs to be logged.</p> +</div> + + </div> <!-- lucy-main_content_box --> + <div class="clear"></div> + + </div> <!-- lucy-main_content --> + + <div id="lucy-copyright" class="container_16"> + <p>Copyright © 2010-2015 The Apache Software Foundation, Licensed under the + <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0</a>. + <br/> + Apache Lucy, Lucy, Apache, the Apache feather logo, and the Apache Lucy project logo are trademarks of The + Apache Software Foundation. All other marks mentioned may be trademarks or registered trademarks of their + respective owners. + </p> + </div> <!-- lucy-copyright --> + + </div> <!-- lucy-rigid_wrapper --> + + </body> +</html> Added: websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Docs/DevGuide.html ============================================================================== --- websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Docs/DevGuide.html (added) +++ websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Docs/DevGuide.html Wed Sep 28 12:07:48 2016 @@ -0,0 +1,124 @@ +<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> +<html lang="en"> + <head> + <meta http-equiv="Content-Type" content="text/html;charset=UTF-8"> + <title>Lucy::Docs::DevGuide</title> + <link rel="stylesheet" type="text/css" media="screen" href="/css/lucy.css"> + </head> + + <body> + + <div id="lucy-rigid_wrapper"> + + <div id="lucy-top" class="container_16 lucy-white_box_3d"> + + <div id="lucy-logo_box" class="grid_8"> + <a href="/"><img src="/images/lucy_logo_150x100.png" alt="Apache Lucy™"></a> + </div> <!-- lucy-logo_box --> + + <div #id="lucy-top_nav_box" class="grid_8"> + <div id="lucy-top_nav_bar"
class="container_8"> + <ul> + <li><a href="http://www.apache.org/" title="Apache Software Foundation">Apache Software Foundation</a></li> + <li><a href="http://www.apache.org/licenses/" title="License">License</a></li> + <li><a href="http://www.apache.org/foundation/sponsorship.html" title="Sponsorship">Sponsorship</a></li> + <li><a href="http://www.apache.org/foundation/thanks.html" title="Thanks">Thanks</a></li> + <li><a href="http://www.apache.org/security/ " title="Security">Security</a></li> + </ul> + </div> <!-- lucy-top_nav_bar --> + <p><a href="http://www.apache.org/">Apache</a> » <a href="/">Lucy</a> » <a href="/docs/">Docs</a> » <a href="/docs/0.5.0/">0.5.0</a> » <a href="/docs/0.5.0/c/">C</a> » <a href="/docs/0.5.0/c/Lucy/">Lucy</a> » <a href="/docs/0.5.0/c/Lucy/Docs/">Docs</a></p> + <form name="lucy-top_search_box" id="lucy-top_search_box" action="http://www.google.com/search" method="get"> + <input value="*.apache.org" name="sitesearch" type="hidden"/> + <input type="text" name="q" id="query" style="width:85%"> + <input type="submit" id="submit" value="Search"> + </form> + </div> <!-- lucy-top_nav_box --> + + <div class="clear"></div> + + </div> <!-- lucy-top --> + + <div id="lucy-main_content" class="container_16 lucy-white_box_3d"> + + <div class="grid_4" id="lucy-left_nav_box"> + <h6>About</h6> + <ul> + <li><a href="/">Welcome</a></li> + <li><a href="/clownfish.html">Clownfish</a></li> + <li><a href="/faq.html">FAQ</a></li> + <li><a href="/people.html">People</a></li> + </ul> + <h6>Resources</h6> + <ul> + <li><a href="/download.html">Download</a></li> + <li><a href="/mailing_lists.html">Mailing Lists</a></li> + <li><a href="/docs/">Documentation</a></li> + <li><a href="http://wiki.apache.org/lucy/">Wiki</a></li> + <li><a href="https://issues.apache.org/jira/browse/LUCY">Issue Tracker</a></li> + <li><a href="/version_control.html">Version Control</a></li> + </ul> + <h6>Related Projects</h6> + <ul> + <li><a
href="http://lucene.apache.org/core/">Lucene</a></li> + <li><a href="http://dezi.org/">Dezi</a></li> + <li><a href="http://lucene.apache.org/solr/">Solr</a></li> + <li><a href="http://lucenenet.apache.org/">Lucene.NET</a></li> + <li><a href="http://lucene.apache.org/pylucene/">PyLucene</a></li> + </ul> + </div> <!-- lucy-left_nav_box --> + + <div id="lucy-main_content_box" class="grid_9"> + <div class="c-api"> +<h2>Quick-start guide to hacking on Apache Lucy.</h2> +<p>The Apache Lucy code base is organized into roughly four layers:</p> +<ul> +<li>Charmonizer - compiler and OS configuration probing.</li> +<li>Clownfish - header files.</li> +<li>C - implementation files.</li> +<li>Host - binding language.</li> +</ul> +<p>Charmonizer is a configuration prober which writes a single header file, +“charmony.h”, describing the build environment and facilitating +cross-platform development. It’s similar to Autoconf or Metaconfig, but +written in pure C.</p> +<p>The “.cfh” files within the Lucy core are Clownfish header files. +Clownfish is a purpose-built, declaration-only language which superimposes +a single-inheritance object model on top of C which is specifically +designed to co-exist happily with a variety of “host” languages and to allow +limited run-time dynamic subclassing. For more information see the +Clownfish docs, but if there’s one thing you should know about Clownfish OO +before you start hacking, it’s that method calls are differentiated from +functions by capitalization:</p> +<pre><code>Indexer_Add_Doc <-- Method, typically uses dynamic dispatch. +Indexer_add_doc <-- Function, always a direct invocation. +</code></pre> +<p>The C files within the Lucy core are where most of Lucy’s low-level +functionality lies. They implement the interface defined by the Clownfish +header files.</p> +<p>The C core is intentionally left incomplete, however; to be usable, it must +be bound to a “host” language.
(In this context, even C is considered a +“host” which must implement the missing pieces and be “bound” to the core.) +Some of the binding code is autogenerated by Clownfish based on a spec customized +for each language. Other pieces are hand-coded in either C (using the +host’s C API) or the host language itself.</p> +</div> + + </div> <!-- lucy-main_content_box --> + <div class="clear"></div> + + </div> <!-- lucy-main_content --> + + <div id="lucy-copyright" class="container_16"> + <p>Copyright © 2010-2015 The Apache Software Foundation, Licensed under the + <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0</a>. + <br/> + Apache Lucy, Lucy, Apache, the Apache feather logo, and the Apache Lucy project logo are trademarks of The + Apache Software Foundation. All other marks mentioned may be trademarks or registered trademarks of their + respective owners. + </p> + </div> <!-- lucy-copyright --> + + </div> <!-- lucy-rigid_wrapper --> + + </body> +</html> Added: websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Docs/DocIDs.html ============================================================================== --- websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Docs/DocIDs.html (added) +++ websites/staging/lucy/trunk/content/docs/0.5.0/c/Lucy/Docs/DocIDs.html Wed Sep 28 12:07:48 2016 @@ -0,0 +1,108 @@ +<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"> +<html lang="en"> + <head> + <meta http-equiv="Content-Type" content="text/html;charset=UTF-8"> + <title>Lucy::Docs::DocIDs</title> + <link rel="stylesheet" type="text/css" media="screen" href="/css/lucy.css"> + </head> + + <body> + + <div id="lucy-rigid_wrapper"> + + <div id="lucy-top" class="container_16 lucy-white_box_3d"> + + <div id="lucy-logo_box" class="grid_8"> + <a href="/"><img src="/images/lucy_logo_150x100.png" alt="Apache Lucy™"></a> + </div> <!-- lucy-logo_box --> + + <div #id="lucy-top_nav_box" class="grid_8"> + <div id="lucy-top_nav_bar" class="container_8"> +
<ul> + <li><a href="http://www.apache.org/" title="Apache Software Foundation">Apache Software Foundation</a></li> + <li><a href="http://www.apache.org/licenses/" title="License">License</a></li> + <li><a href="http://www.apache.org/foundation/sponsorship.html" title="Sponsorship">Sponsorship</a></li> + <li><a href="http://www.apache.org/foundation/thanks.html" title="Thanks">Thanks</a></li> + <li><a href="http://www.apache.org/security/ " title="Security">Security</a></li> + </ul> + </div> <!-- lucy-top_nav_bar --> + <p><a href="http://www.apache.org/">Apache</a> » <a href="/">Lucy</a> » <a href="/docs/">Docs</a> » <a href="/docs/0.5.0/">0.5.0</a> » <a href="/docs/0.5.0/c/">C</a> » <a href="/docs/0.5.0/c/Lucy/">Lucy</a> » <a href="/docs/0.5.0/c/Lucy/Docs/">Docs</a></p> + <form name="lucy-top_search_box" id="lucy-top_search_box" action="http://www.google.com/search" method="get"> + <input value="*.apache.org" name="sitesearch" type="hidden"/> + <input type="text" name="q" id="query" style="width:85%"> + <input type="submit" id="submit" value="Search"> + </form> + </div> <!-- lucy-top_nav_box --> + + <div class="clear"></div> + + </div> <!-- lucy-top --> + + <div id="lucy-main_content" class="container_16 lucy-white_box_3d"> + + <div class="grid_4" id="lucy-left_nav_box"> + <h6>About</h6> + <ul> + <li><a href="/">Welcome</a></li> + <li><a href="/clownfish.html">Clownfish</a></li> + <li><a href="/faq.html">FAQ</a></li> + <li><a href="/people.html">People</a></li> + </ul> + <h6>Resources</h6> + <ul> + <li><a href="/download.html">Download</a></li> + <li><a href="/mailing_lists.html">Mailing Lists</a></li> + <li><a href="/docs/">Documentation</a></li> + <li><a href="http://wiki.apache.org/lucy/">Wiki</a></li> + <li><a href="https://issues.apache.org/jira/browse/LUCY">Issue Tracker</a></li> + <li><a href="/version_control.html">Version Control</a></li> + </ul> + <h6>Related Projects</h6> + <ul> + <li><a href="http://lucene.apache.org/core/">Lucene</a></li> + <li><a
href="http://dezi.org/">Dezi</a></li> + <li><a href="http://lucene.apache.org/solr/">Solr</a></li> + <li><a href="http://lucenenet.apache.org/">Lucene.NET</a></li> + <li><a href="http://lucene.apache.org/pylucene/">PyLucene</a></li> + </ul> + </div> <!-- lucy-left_nav_box --> + + <div id="lucy-main_content_box" class="grid_9"> + <div class="c-api"> +<h2>Characteristics of Apache Lucy document ids.</h2> +<h3>Document ids are signed 32-bit integers</h3> +<p>Document ids in Apache Lucy start at 1. Because 0 is never a valid doc id, we +can use it as a sentinel value:</p> +<pre><code>Code example for C is missing</code></pre> +<h3>Document ids are ephemeral</h3> +<p>The document ids used by Lucy are associated with a single index +snapshot. The moment an index is updated, the mapping of document ids to +documents is subject to change.</p> +<p>Since IndexReader objects represent a point-in-time view of an index, document +ids are guaranteed to remain static for the life of the reader. However, +because they are not permanent, Lucy document ids cannot be used as +foreign keys to locate records in external data sources. If you truly need a +primary key field, you must define it and populate it yourself.</p> +<p>Furthermore, the order of document ids does not tell you anything about the +sequence in which documents were added to the index.</p> +</div> + + </div> <!-- lucy-main_content_box --> + <div class="clear"></div> + + </div> <!-- lucy-main_content --> + + <div id="lucy-copyright" class="container_16"> + <p>Copyright © 2010-2015 The Apache Software Foundation, Licensed under the + <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0</a>. + <br/> + Apache Lucy, Lucy, Apache, the Apache feather logo, and the Apache Lucy project logo are trademarks of The + Apache Software Foundation. All other marks mentioned may be trademarks or registered trademarks of their + respective owners.
+ </p> + </div> <!-- lucy-copyright --> + + </div> <!-- lucy-rigid_wrapper --> + + </body> +</html>
