Attached are two patches for headline generation with fragments:
1. documentation
2. regression tests
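
For anyone who wants to try the option before reading the tests, here is a rough sketch of a call. The sample text, the query, and the particular MaxWords/MinWords values are made up for illustration and are not part of the patch. I have left the output out, since it depends on the patched build.

```sql
-- Illustrative only: sample text, query, and option values are hypothetical.
-- MaxFragments > 0 asks for fragment-based headlines; each fragment is
-- limited to MaxWords words, and ShortWord keeps its default.
SELECT ts_headline('english', '
The quick brown fox jumps over the lazy dog near the old mill.
Many miles away, a wide river winds slowly through the valley
and finally reaches the sea.
', to_tsquery('english', 'fox & river'),
   'MaxFragments=2, MaxWords=10, MinWords=5');
```

With MaxFragments=0 (the default) the same call should fall back to the existing headline function.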
-Sushant.
On Tue, 2008-07-15 at 13:29 +0400, Teodor Sigaev wrote:
> > Attached a new patch that:
> >
> > 1. fixes previous bug
> > 2. better handles the case when cover size is greater than the MaxWords.
>
> Looks good, I'll make some tests with real-world application.
>
> > I have not yet added the regression tests. The regression test suite
> > seemed to be only ensuring that the function works. How many tests
> > should I be adding? Is there any other place that I need to add
> > different test cases for the function?
>
> Just add 3-5 selects to src/test/regress/sql/tsearch.sql with checking basic
> functionality and corner cases like
> - there is no covers in text
> - Cover(s) is too big
> - and so on
>
> Add some words in documentation too, pls.
>
>
Index: doc/src/sgml/textsearch.sgml
===================================================================
RCS file: /home/postgres/devel/pgsql-cvs/pgsql/doc/src/sgml/textsearch.sgml,v
retrieving revision 1.44
diff -c -r1.44 textsearch.sgml
*** doc/src/sgml/textsearch.sgml 16 May 2008 16:31:01 -0000 1.44
--- doc/src/sgml/textsearch.sgml 16 Jul 2008 02:37:28 -0000
***************
*** 1100,1105 ****
--- 1100,1117 ----
</listitem>
<listitem>
<para>
+ <literal>MaxFragments</literal>: maximum number of text excerpts
+ or fragments to display. A value greater than zero also selects a
+ different headline generation function than the default one. This
+ function finds text fragments with as many query words as possible.
+ Each fragment will contain at most MaxWords words and will not have
+ words of length ShortWord or less at the start or end of the
+ fragment. If not all query words are found in the document, then
+ a single fragment of MinWords words will be displayed.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
<literal>HighlightAll</literal>: Boolean flag; if
<literal>true</literal> the whole document will be highlighted.
</para>
***************
*** 1109,1115 ****
Any unspecified options receive these defaults:
<programlisting>
! StartSel=<b>, StopSel=</b>, MaxWords=35, MinWords=15, ShortWord=3, HighlightAll=FALSE
</programlisting>
</para>
--- 1121,1127 ----
Any unspecified options receive these defaults:
<programlisting>
! StartSel=<b>, StopSel=</b>, MaxFragments=0, MaxWords=35, MinWords=15, ShortWord=3, HighlightAll=FALSE
</programlisting>
</para>
Index: src/test/regress/sql/tsearch.sql
===================================================================
RCS file: /home/postgres/devel/pgsql-cvs/pgsql/src/test/regress/sql/tsearch.sql,v
retrieving revision 1.9
diff -c -r1.9 tsearch.sql
*** src/test/regress/sql/tsearch.sql 16 May 2008 16:31:02 -0000 1.9
--- src/test/regress/sql/tsearch.sql 16 Jul 2008 03:45:24 -0000
***************
*** 208,213 ****
--- 208,253 ----
</html>',
to_tsquery('english', 'sea&foo'), 'HighlightAll=true');
+ --Check if headline fragments work
+ SELECT ts_headline('english', '
+ Day after day, day after day,
+ We stuck, nor breath nor motion,
+ As idle as a painted Ship
+ Upon a painted Ocean.
+ Water, water, every where
+ And all the boards did shrink;
+ Water, water, every where,
+ Nor any drop to drink.
+ S. T. Coleridge (1772-1834)
+ ', to_tsquery('english', 'ocean'), 'MaxFragments=1');
+
+ --Check if more than one fragments are displayed
+ SELECT ts_headline('english', '
+ Day after day, day after day,
+ We stuck, nor breath nor motion,
+ As idle as a painted Ship
+ Upon a painted Ocean.
+ Water, water, every where
+ And all the boards did shrink;
+ Water, water, every where,
+ Nor any drop to drink.
+ S. T. Coleridge (1772-1834)
+ ', to_tsquery('english', 'Coleridge & stuck'), 'MaxFragments=2');
+
+ --Fragments when there all query words are not in the document
+ SELECT ts_headline('english', '
+ Day after day, day after day,
+ We stuck, nor breath nor motion,
+ As idle as a painted Ship
+ Upon a painted Ocean.
+ Water, water, every where
+ And all the boards did shrink;
+ Water, water, every where,
+ Nor any drop to drink.
+ S. T. Coleridge (1772-1834)
+ ', to_tsquery('english', 'ocean & seahorse'), 'MaxFragments=1');
+
+
--Rewrite sub system
CREATE TABLE test_tsquery (txtkeyword TEXT, txtsample TEXT);
Index: src/test/regress/expected/tsearch.out
===================================================================
RCS file: /home/postgres/devel/pgsql-cvs/pgsql/src/test/regress/expected/tsearch.out,v
retrieving revision 1.14
diff -c -r1.14 tsearch.out
*** src/test/regress/expected/tsearch.out 16 May 2008 16:31:02 -0000 1.14
--- src/test/regress/expected/tsearch.out 16 Jul 2008 03:47:46 -0000
***************
*** 632,637 ****
--- 632,705 ----
</html>
(1 row)
+ --Check if headline fragments work
+ SELECT ts_headline('english', '
+ Day after day, day after day,
+ We stuck, nor breath nor motion,
+ As idle as a painted Ship
+ Upon a painted Ocean.
+ Water, water, every where
+ And all the boards did shrink;
+ Water, water, every where,
+ Nor any drop to drink.
+ S. T. Coleridge (1772-1834)
+ ', to_tsquery('english', 'ocean'), 'MaxFragments=1');
+ ts_headline
+ -----------------------------------
+ ... stuck, nor breath nor motion,
+ As idle as a painted Ship
+ Upon a painted <b>Ocean</b>.
+ Water, water, every where
+ And all the boards did shrink;
+ Water, water, every where,
+ Nor any drop
+ (1 row)
+
+ --Check if more than one fragments are displayed
+ SELECT ts_headline('english', '
+ Day after day, day after day,
+ We stuck, nor breath nor motion,
+ As idle as a painted Ship
+ Upon a painted Ocean.
+ Water, water, every where
+ And all the boards did shrink;
+ Water, water, every where,
+ Nor any drop to drink.
+ S. T. Coleridge (1772-1834)
+ ', to_tsquery('english', 'Coleridge & stuck'), 'MaxFragments=2');
+ ts_headline
+ -------------------------------------------
+ ... after day, day after day,
+ We <b>stuck</b>, nor breath nor motion,
+ As idle as a painted Ship
+ Upon a painted Ocean.
+ Water, water, every where
+ And all the boards did shrink;
+ Water, water... every where,
+ Nor any drop to drink.
+ S. T. <b>Coleridge</b>
+ (1 row)
+
+ --Fragments when there all query words are not in the document
+ SELECT ts_headline('english', '
+ Day after day, day after day,
+ We stuck, nor breath nor motion,
+ As idle as a painted Ship
+ Upon a painted Ocean.
+ Water, water, every where
+ And all the boards did shrink;
+ Water, water, every where,
+ Nor any drop to drink.
+ S. T. Coleridge (1772-1834)
+ ', to_tsquery('english', 'ocean & seahorse'), 'MaxFragments=1');
+ ts_headline
+ ------------------------------------
+
+ Day after day, day after day,
+ We stuck, nor breath nor motion,
+ As idle as
+ (1 row)
+
--Rewrite sub system
CREATE TABLE test_tsquery (txtkeyword TEXT, txtsample TEXT);
\set ECHO none