Attached are two patches for headline with fragments:

1. documentation
2. regression tests
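
For anyone who wants to try the option by hand, here is a minimal sketch of a call. The sample text, query and option values are made up for illustration and are not taken from the regression tests; the output depends on the patched ts_headline, so none is shown.

```sql
-- Hypothetical example: ask for up to two fragments of at most 10 words each.
-- The text, the query and the option values below are illustrative assumptions.
SELECT ts_headline('english',
  'The quick brown fox jumps over the lazy dog while the cat sleeps by the fire',
  to_tsquery('english', 'fox & cat'),
  'MaxFragments=2, MaxWords=10, MinWords=5');
```

Leaving MaxFragments at its default of 0 should give the existing (non-fragment) headline behaviour, which makes it easy to compare the two.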

-Sushant.

On Tue, 2008-07-15 at 13:29 +0400, Teodor Sigaev wrote:
> > Attached a new patch that:
> > 
> > 1. fixes previous bug
> > 2. better handles the case when cover size is greater than the MaxWords. 
> 
> Looks good, I'll make some tests with  real-world application.
> 
> > I have not yet added the regression tests. The regression test suite 
> > seemed to be only ensuring that the function works. How many tests 
> > should I be adding? Is there any other place that I need to add 
> > different test cases for the function?
> 
> Just add 3-5 selects to src/test/regress/sql/tsearch.sql with checking basic 
> functionality and corner cases like
>   - there is no covers in text
>   - Cover(s) is too big
>   - and so on
> 
> Add some words in documentation too, pls.
> 
> 
Index: doc/src/sgml/textsearch.sgml
===================================================================
RCS file: /home/postgres/devel/pgsql-cvs/pgsql/doc/src/sgml/textsearch.sgml,v
retrieving revision 1.44
diff -c -r1.44 textsearch.sgml
*** doc/src/sgml/textsearch.sgml	16 May 2008 16:31:01 -0000	1.44
--- doc/src/sgml/textsearch.sgml	16 Jul 2008 02:37:28 -0000
***************
*** 1100,1105 ****
--- 1100,1117 ----
       </listitem>
       <listitem>
        <para>
+        <literal>MaxFragments</literal>: maximum number of text excerpts
+        or fragments that match the query words. A nonzero value also selects
+        a different headline generation function than the default one. This
+        function finds text fragments with as many query words as possible.
+        Each fragment will be of at most MaxWords words and will not have
+        words of length ShortWord or less at the start or end of the
+        fragment. If not all query words are found in the document, then
+        a single fragment of MinWords words will be displayed.
+       </para>
+      </listitem>
+      <listitem>
+       <para>
         <literal>HighlightAll</literal>: Boolean flag;  if
         <literal>true</literal> the whole document will be highlighted.
        </para>
***************
*** 1109,1115 ****
      Any unspecified options receive these defaults:
  
  <programlisting>
! StartSel=&lt;b&gt;, StopSel=&lt;/b&gt;, MaxWords=35, MinWords=15, ShortWord=3, HighlightAll=FALSE
  </programlisting>
     </para>
  
--- 1121,1127 ----
      Any unspecified options receive these defaults:
  
  <programlisting>
! StartSel=&lt;b&gt;, StopSel=&lt;/b&gt;, MaxFragments=0, MaxWords=35, MinWords=15, ShortWord=3, HighlightAll=FALSE
  </programlisting>
     </para>
  
Index: src/test/regress/sql/tsearch.sql
===================================================================
RCS file: /home/postgres/devel/pgsql-cvs/pgsql/src/test/regress/sql/tsearch.sql,v
retrieving revision 1.9
diff -c -r1.9 tsearch.sql
*** src/test/regress/sql/tsearch.sql	16 May 2008 16:31:02 -0000	1.9
--- src/test/regress/sql/tsearch.sql	16 Jul 2008 03:45:24 -0000
***************
*** 208,213 ****
--- 208,253 ----
  </html>',
  to_tsquery('english', 'sea&foo'), 'HighlightAll=true');
  
+ --Check if headline fragments work 
+ SELECT ts_headline('english', '
+ Day after day, day after day,
+   We stuck, nor breath nor motion,
+ As idle as a painted Ship
+   Upon a painted Ocean.
+ Water, water, every where
+   And all the boards did shrink;
+ Water, water, every where,
+   Nor any drop to drink.
+ S. T. Coleridge (1772-1834)
+ ', to_tsquery('english', 'ocean'), 'MaxFragments=1');
+ 
+ --Check if more than one fragments are displayed
+ SELECT ts_headline('english', '
+ Day after day, day after day,
+   We stuck, nor breath nor motion,
+ As idle as a painted Ship
+   Upon a painted Ocean.
+ Water, water, every where
+   And all the boards did shrink;
+ Water, water, every where,
+   Nor any drop to drink.
+ S. T. Coleridge (1772-1834)
+ ', to_tsquery('english', 'Coleridge & stuck'), 'MaxFragments=2');
+ 
+ --Fragments when there all query words are not in the document
+ SELECT ts_headline('english', '
+ Day after day, day after day,
+   We stuck, nor breath nor motion,
+ As idle as a painted Ship
+   Upon a painted Ocean.
+ Water, water, every where
+   And all the boards did shrink;
+ Water, water, every where,
+   Nor any drop to drink.
+ S. T. Coleridge (1772-1834)
+ ', to_tsquery('english', 'ocean & seahorse'), 'MaxFragments=1');
+ 
+ 
  --Rewrite sub system
  
  CREATE TABLE test_tsquery (txtkeyword TEXT, txtsample TEXT);
Index: src/test/regress/expected/tsearch.out
===================================================================
RCS file: /home/postgres/devel/pgsql-cvs/pgsql/src/test/regress/expected/tsearch.out,v
retrieving revision 1.14
diff -c -r1.14 tsearch.out
*** src/test/regress/expected/tsearch.out	16 May 2008 16:31:02 -0000	1.14
--- src/test/regress/expected/tsearch.out	16 Jul 2008 03:47:46 -0000
***************
*** 632,637 ****
--- 632,705 ----
   </html>
  (1 row)
  
+ --Check if headline fragments work 
+ SELECT ts_headline('english', '
+ Day after day, day after day,
+   We stuck, nor breath nor motion,
+ As idle as a painted Ship
+   Upon a painted Ocean.
+ Water, water, every where
+   And all the boards did shrink;
+ Water, water, every where,
+   Nor any drop to drink.
+ S. T. Coleridge (1772-1834)
+ ', to_tsquery('english', 'ocean'), 'MaxFragments=1');
+             ts_headline            
+ -----------------------------------
+  ... stuck, nor breath nor motion,
+  As idle as a painted Ship
+    Upon a painted <b>Ocean</b>.
+  Water, water, every where
+    And all the boards did shrink;
+  Water, water, every where,
+    Nor any drop
+ (1 row)
+ 
+ --Check if more than one fragments are displayed
+ SELECT ts_headline('english', '
+ Day after day, day after day,
+   We stuck, nor breath nor motion,
+ As idle as a painted Ship
+   Upon a painted Ocean.
+ Water, water, every where
+   And all the boards did shrink;
+ Water, water, every where,
+   Nor any drop to drink.
+ S. T. Coleridge (1772-1834)
+ ', to_tsquery('english', 'Coleridge & stuck'), 'MaxFragments=2');
+                 ts_headline                
+ -------------------------------------------
+  ... after day, day after day,
+    We <b>stuck</b>, nor breath nor motion,
+  As idle as a painted Ship
+    Upon a painted Ocean.
+  Water, water, every where
+    And all the boards did shrink;
+  Water, water... every where,
+    Nor any drop to drink.
+  S. T. <b>Coleridge</b>
+ (1 row)
+ 
+ --Fragments when there all query words are not in the document
+ SELECT ts_headline('english', '
+ Day after day, day after day,
+   We stuck, nor breath nor motion,
+ As idle as a painted Ship
+   Upon a painted Ocean.
+ Water, water, every where
+   And all the boards did shrink;
+ Water, water, every where,
+   Nor any drop to drink.
+ S. T. Coleridge (1772-1834)
+ ', to_tsquery('english', 'ocean & seahorse'), 'MaxFragments=1');
+             ts_headline             
+ ------------------------------------
+  
+  Day after day, day after day,
+    We stuck, nor breath nor motion,
+  As idle as
+ (1 row)
+ 
  --Rewrite sub system
  CREATE TABLE test_tsquery (txtkeyword TEXT, txtsample TEXT);
  \set ECHO none
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
