I wrote:
> It's worth noting that with these rules, phrase searches will act as
> though "!x" always matches somewhere; for instance "!a <-> !b" will match
> any tsvector. I argue that this is not wrong, not even if the tsvector is
> empty: there could have been adjacent stopwords matching !a and !b in the
> original text. Since we've adjusted the phrase matching rules to treat
> stopwords as unknown-but-present words in a phrase, I think this is
> consistent. It's also pretty hard to assert this is wrong and at the same
> time accept "!a <-> b" matching b at the start of the document.
To clarify this point, I'm imagining that the patch would include
documentation changes like the attached.
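
For illustration, the behavior that the attached text describes could be
checked with queries along the following lines. This is only a sketch and
not part of the patch: the 'simple' configuration and the sample strings
are assumptions, chosen so that stemming and stopword removal do not
interfere. The results noted in comments are what the proposed rules
imply, not verified output.

```sql
-- Sketch only: the sample documents and the 'simple' configuration are assumptions.
-- Comments state what the proposed rules imply, not verified output.

-- Outside a phrase, !y rejects any document that contains y anywhere.
SELECT to_tsvector('simple', 'x z y') @@ to_tsquery('simple', 'x & !y');    -- implied: false

-- Inside a phrase, x <-> !y only requires that x not be immediately followed by y.
SELECT to_tsvector('simple', 'x z y') @@ to_tsquery('simple', 'x <-> !y');  -- implied: true
SELECT to_tsvector('simple', 'x y')   @@ to_tsquery('simple', 'x <-> !y');  -- implied: false

-- (x & y) <-> z requires x and y at the same position, immediately before z.
SELECT to_tsvector('simple', 'x z y z') @@ to_tsquery('simple', '(x & y) <-> z');      -- implied: false
SELECT to_tsvector('simple', 'x z y z') @@ to_tsquery('simple', 'x <-> z & y <-> z');  -- implied: true

-- The quoted claim: !a <-> !b is treated as matching any tsvector.
SELECT to_tsvector('simple', 'c d') @@ to_tsquery('simple', '!a <-> !b');
```

The two (x & y) queries run against the same document, so any difference
between them comes from the position rule alone.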
regards, tom lane
diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml
index 67d0c34..464ce83 100644
*** a/doc/src/sgml/datatype.sgml
--- b/doc/src/sgml/datatype.sgml
*************** SELECT 'fat & rat & ! cat'::tsqu
*** 3959,3973 ****
tsquery
------------------------
'fat' &amp; 'rat' &amp; !'cat'
-
- SELECT '(fat | rat) &lt;-&gt; cat'::tsquery;
- tsquery
- -----------------------------------
- 'fat' &lt;-&gt; 'cat' | 'rat' &lt;-&gt; 'cat'
</programlisting>
-
- The last example demonstrates that <type>tsquery</type> sometimes
- rearranges nested operators into a logically equivalent formulation.
</para>
<para>
--- 3959,3965 ----
diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index 2da7595..bc33a70 100644
*** a/doc/src/sgml/textsearch.sgml
--- b/doc/src/sgml/textsearch.sgml
*************** text @@ text
*** 323,328 ****
--- 323,330 ----
at least one of its arguments must appear, while the <literal>!</> (NOT)
operator specifies that its argument must <emphasis>not</> appear in
order to have a match.
+ For example, the query <literal>fat &amp; ! rat</> matches documents that
+ contain <literal>fat</> but not <literal>rat</>.
</para>
<para>
*************** SELECT phraseto_tsquery('the cats ate th
*** 377,382 ****
--- 379,401 ----
then <literal>&amp;</literal>, then <literal>&lt;-&gt;</literal>,
and <literal>!</literal> most tightly.
</para>
+
+ <para>
+ It's worth noticing that the AND/OR/NOT operators mean something subtly
+ different when they are within the arguments of a FOLLOWED BY operator
+ than when they are not, because then the position of the match is
+ significant. Normally, <literal>!x</> matches only documents that do not
+ contain <literal>x</> anywhere. But <literal>x &lt;-&gt; !y</>
+ matches <literal>x</> if it is not immediately followed by <literal>y</>;
+ an occurrence of <literal>y</> elsewhere in the document does not prevent
+ a match. Another example is that <literal>x &amp; y</> normally only
+ requires that <literal>x</> and <literal>y</> both appear somewhere in the
+ document, but <literal>(x &amp; y) &lt;-&gt; z</> requires <literal>x</>
+ and <literal>y</> to match at the same place, immediately before
+ a <literal>z</>. Thus this query behaves differently from <literal>x
+ &lt;-&gt; z &amp; y &lt;-&gt; z</>, which would match a document
+ containing two separate sequences <literal>x z</> and <literal>y z</>.
+ </para>
</sect2>
<sect2 id="textsearch-intro-configurations">