I wrote:
> It's worth noting that with these rules, phrase searches will act as
> though "!x" always matches somewhere; for instance "!a <-> !b" will match
> any tsvector.  I argue that this is not wrong, not even if the tsvector is
> empty: there could have been adjacent stopwords matching !a and !b in the
> original text.  Since we've adjusted the phrase matching rules to treat
> stopwords as unknown-but-present words in a phrase, I think this is
> consistent.  It's also pretty hard to assert this is wrong and at the same
> time accept "!a <-> b" matching b at the start of the document.

To clarify this point, I'm imagining that the patch would include
documentation changes like the attached.
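
As a rough sketch, the cases discussed above can be tried with queries
along these lines.  The 'simple' configuration and the sample strings are
assumptions chosen only for illustration (they are not part of the patch),
and the results depend on whether the proposed matching rules are applied,
so no expected output is shown.

```sql
-- NOT inside a phrase: does x match when it is not immediately followed by y,
-- even though y appears elsewhere in the document?
SELECT to_tsvector('simple', 'x z y') @@ to_tsquery('simple', 'x <-> !y');

-- Two NOTs in a phrase, tested against an empty tsvector.
SELECT ''::tsvector @@ to_tsquery('simple', '!a <-> !b');

-- NOT on the left side of the phrase, with b at the start of the document.
SELECT to_tsvector('simple', 'b c') @@ to_tsquery('simple', '!a <-> b');
```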

                        regards, tom lane

diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml
index 67d0c34..464ce83 100644
*** a/doc/src/sgml/datatype.sgml
--- b/doc/src/sgml/datatype.sgml
*************** SELECT 'fat &amp; rat &amp; ! cat'::tsqu
*** 3959,3973 ****
          tsquery         
  ------------------------
   'fat' &amp; 'rat' &amp; !'cat'
- 
- SELECT '(fat | rat) &lt;-&gt; cat'::tsquery;
-               tsquery
- -----------------------------------
-  'fat' &lt;-&gt; 'cat' | 'rat' &lt;-&gt; 'cat'
  </programlisting>
- 
-      The last example demonstrates that <type>tsquery</type> sometimes
-      rearranges nested operators into a logically equivalent formulation.
      </para>
  
      <para>
--- 3959,3965 ----
diff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index 2da7595..bc33a70 100644
*** a/doc/src/sgml/textsearch.sgml
--- b/doc/src/sgml/textsearch.sgml
*************** text @@ text
*** 323,328 ****
--- 323,330 ----
      at least one of its arguments must appear, while the <literal>!</> (NOT)
      operator specifies that its argument must <emphasis>not</> appear in
      order to have a match.
+     For example, the query <literal>fat &amp; ! rat</> matches documents that
+     contain <literal>fat</> but not <literal>rat</>.
     </para>
  
     <para>
*************** SELECT phraseto_tsquery('the cats ate th
*** 377,382 ****
--- 379,401 ----
      then <literal>&amp;</literal>, then <literal>&lt;-&gt;</literal>,
      and <literal>!</literal> most tightly.
     </para>
+ 
+    <para>
+     It's worth noticing that the AND/OR/NOT operators mean something subtly
+     different when they are within the arguments of a FOLLOWED BY operator
+     than when they are not, because then the position of the match is
+     significant.  Normally, <literal>!x</> matches only documents that do not
+     contain <literal>x</> anywhere.  But <literal>x &lt;-&gt; !y</>
+     matches <literal>x</> if it is not immediately followed by <literal>y</>;
+     an occurrence of <literal>y</> elsewhere in the document does not prevent
+     a match.  Another example is that <literal>x &amp; y</> normally only
+     requires that <literal>x</> and <literal>y</> both appear somewhere in the
+     document, but <literal>(x &amp; y) &lt;-&gt; z</> requires <literal>x</>
+     and <literal>y</> to match at the same place, immediately before
+     a <literal>z</>.  Thus this query behaves differently from <literal>x
+     &lt;-&gt; z &amp; y &lt;-&gt; z</>, which would match a document
+     containing two separate sequences <literal>x z</> and <literal>y z</>.
+    </para>
    </sect2>
  
    <sect2 id="textsearch-intro-configurations">
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
