Re: [HACKERS] [PATCH] Phrase search ported to 9.6

2016-04-26 Thread Teodor Sigaev

There are some previously unnoticed errors in docs (master branch), this patch
fixes them.

Thank you, pushed

--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Phrase search ported to 9.6

2016-04-21 Thread Dmitry Ivanov
There are some previously unnoticed errors in docs (master branch), this patch 
fixes them.

-- 
Dmitry Ivanov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Companydiff --git a/doc/src/sgml/textsearch.sgml b/doc/src/sgml/textsearch.sgml
index 930c8f0..78eaf74 100644
--- a/doc/src/sgml/textsearch.sgml
+++ b/doc/src/sgml/textsearch.sgml
@@ -1523,34 +1523,6 @@ SELECT tsquery_phrase(to_tsquery('fat'), to_tsquery('cat'), 10);
 
  
  
-  setweight
- 
-
-  setweight(query tsquery, weight "char") returns tsquery
- 
-
- 
-  
-   setweight returns a copy of the input query in which every
-   position has been labeled with the given weight(s), either
-   A, B, C,
-   D or their combination. These labels are retained when
-   queries are concatenated, allowing words from different parts of a document
-   to be weighted differently by ranking functions.
-  
-
-  
-   Note that weight labels apply to positions, not
-   lexemes.  If the input query has been stripped of
-   positions then setweight does nothing.
-  
- 
-
-
-
-
- 
- 
   numnode
  
 
@@ -2588,7 +2560,7 @@ more sample word(s) : more indexed word(s)
 

 Specific stop words recognized by the subdictionary cannot be
-specified;  instead use - to mark the location where any
+specified;  instead use ? to mark the location where any
 stop word can appear.  For example, assuming that a and
 the are stop words according to the subdictionary:
 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Phrase search ported to 9.6

2016-04-06 Thread Dmitry Ivanov
> > It seems that everything is settled now, so here's the patch introducing
> > the '<->' and '' operators. I've made the necessary changes to docs &
> > regression tests.
> 
> I noticed that I had accidently trimmed whitespaces in docs, this is a
> better one.

After a brief but reasonable discussion with Teodor I've come up with a more 
proper piece of code for phrase operator parsing. The patch is included below.

-- 
Dmitry Ivanov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Phrase search ported to 9.6

2016-04-06 Thread Dmitry Ivanov
> > > It seems that everything is settled now, so here's the patch introducing
> > > the '<->' and '' operators. I've made the necessary changes to docs &
> > > regression tests.
> > 
> > I noticed that I had accidently trimmed whitespaces in docs, this is a
> > better one.
> 
> After a brief but reasonable discussion with Teodor I've come up with a more
> proper piece of code for phrase operator parsing. The patch is included
> below.

Attached the patch. Sorry for the inconvenience.

-- 
Dmitry Ivanov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Companydiff --git a/contrib/tsearch2/expected/tsearch2.out b/contrib/tsearch2/expected/tsearch2.out
index 972f764..97379e7 100644
--- a/contrib/tsearch2/expected/tsearch2.out
+++ b/contrib/tsearch2/expected/tsearch2.out
@@ -278,15 +278,15 @@ SELECT '(!1|2)&3'::tsquery;
 (1 row)
 
 SELECT '1|(2|(4|(5|6)))'::tsquery;
- tsquery 
--
- '1' | ( '2' | ( '4' | ( '5' | '6' ) ) )
+   tsquery   
+-
+ '1' | '2' | '4' | '5' | '6'
 (1 row)
 
 SELECT '1|2|4|5|6'::tsquery;
- tsquery 
--
- ( ( ( '1' | '2' ) | '4' ) | '5' ) | '6'
+   tsquery   
+-
+ '1' | '2' | '4' | '5' | '6'
 (1 row)
 
 SELECT '1&(2&(4&(5&6)))'::tsquery;
@@ -340,7 +340,7 @@ select 'a' > 'b & c'::tsquery;
 select 'a | f' < 'b & c'::tsquery;
  ?column? 
 --
- t
+ f
 (1 row)
 
 select 'a | ff' < 'b & c'::tsquery;
@@ -443,9 +443,9 @@ select count(*) from test_tsquery where keyword >  'new & york';
 
 set enable_seqscan=on;
 select rewrite('foo & bar & qq & new & york',  'new & york'::tsquery, 'big & apple | nyc | new & york & city');
- rewrite  
---
- 'foo' & 'bar' & 'qq' & ( 'city' & 'new' & 'york' | ( 'nyc' | 'big' & 'apple' ) )
+   rewrite
+--
+ 'foo' & 'bar' & 'qq' & ( 'nyc' | 'big' & 'apple' | 'city' & 'new' & 'york' )
 (1 row)
 
 select rewrite('moscow', 'select keyword, sample from test_tsquery'::text );
@@ -461,9 +461,9 @@ select rewrite('moscow & hotel', 'select keyword, sample from test_tsquery'::tex
 (1 row)
 
 select rewrite('bar &  new & qq & foo & york', 'select keyword, sample from test_tsquery'::text );
-   rewrite   
--
- 'citi' & 'foo' & ( 'bar' | 'qq' ) & ( 'nyc' | ( 'big' & 'appl' | 'new' & 'york' ) )
+ rewrite 
+-
+ ( 'nyc' | 'big' & 'appl' | 'new' & 'york' ) & 'citi' & 'foo' & ( 'bar' | 'qq' )
 (1 row)
 
 select rewrite( ARRAY['moscow', keyword, sample] ) from test_tsquery;
@@ -479,9 +479,9 @@ select rewrite( ARRAY['moscow & hotel', keyword, sample] ) from test_tsquery;
 (1 row)
 
 select rewrite( ARRAY['bar &  new & qq & foo & york', keyword, sample] ) from test_tsquery;
-   rewrite   
--
- 'citi' & 'foo' & ( 'bar' | 'qq' ) & ( 'nyc' | ( 'big' & 'appl' | 'new' & 'york' ) )
+ rewrite 
+-
+ ( 'nyc' | 'big' & 'appl' | 'new' & 'york' ) & 'citi' & 'foo' & ( 'bar' | 'qq' )
 (1 row)
 
 select keyword from test_tsquery where keyword @> 'new';
@@ -520,9 +520,9 @@ select rewrite( ARRAY[query, keyword, sample] ) from test_tsquery, to_tsquery('e
 (1 row)
 
 select rewrite( ARRAY[query, keyword, sample] ) from test_tsquery, to_tsquery('english', 'bar &  new & qq & foo & york') as query where keyword <@ query;
-   rewrite   
--
- 'citi' & 'foo' & ( 'bar' | 'qq' ) & ( 'nyc' | ( 'big' & 'appl' | 'new' & 'york' ) )
+ rewrite 
+-
+ ( 'nyc' | 'big' & 'appl' | 'new' & 'york' ) & 'citi' & 'foo' & ( 'bar' | 'qq' )
 (1 row)
 
 select rewrite( ARRAY[query, keyword, sample] ) from test_tsquery, to_tsquery('english', 'moscow') as query where query @> keyword;
@@ -538,9 +538,9 @@ select rewrite( ARRAY[query, keyword, sample] ) from test_tsquery, to_tsquery('e
 (1 

Re: [HACKERS] [PATCH] Phrase search ported to 9.6

2016-04-05 Thread Dmitry Ivanov
> It seems that everything is settled now, so here's the patch introducing the
> '<->' and '' operators. I've made the necessary changes to docs &
> regression tests.

I noticed that I had accidently trimmed whitespaces in docs, this is a better 
one.

-- 
Dmitry Ivanov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Companydiff --git a/contrib/tsearch2/expected/tsearch2.out b/contrib/tsearch2/expected/tsearch2.out
index 972f764..97379e7 100644
--- a/contrib/tsearch2/expected/tsearch2.out
+++ b/contrib/tsearch2/expected/tsearch2.out
@@ -278,15 +278,15 @@ SELECT '(!1|2)&3'::tsquery;
 (1 row)
 
 SELECT '1|(2|(4|(5|6)))'::tsquery;
- tsquery 
--
- '1' | ( '2' | ( '4' | ( '5' | '6' ) ) )
+   tsquery   
+-
+ '1' | '2' | '4' | '5' | '6'
 (1 row)
 
 SELECT '1|2|4|5|6'::tsquery;
- tsquery 
--
- ( ( ( '1' | '2' ) | '4' ) | '5' ) | '6'
+   tsquery   
+-
+ '1' | '2' | '4' | '5' | '6'
 (1 row)
 
 SELECT '1&(2&(4&(5&6)))'::tsquery;
@@ -340,7 +340,7 @@ select 'a' > 'b & c'::tsquery;
 select 'a | f' < 'b & c'::tsquery;
  ?column? 
 --
- t
+ f
 (1 row)
 
 select 'a | ff' < 'b & c'::tsquery;
@@ -443,9 +443,9 @@ select count(*) from test_tsquery where keyword >  'new & york';
 
 set enable_seqscan=on;
 select rewrite('foo & bar & qq & new & york',  'new & york'::tsquery, 'big & apple | nyc | new & york & city');
- rewrite  
---
- 'foo' & 'bar' & 'qq' & ( 'city' & 'new' & 'york' | ( 'nyc' | 'big' & 'apple' ) )
+   rewrite
+--
+ 'foo' & 'bar' & 'qq' & ( 'nyc' | 'big' & 'apple' | 'city' & 'new' & 'york' )
 (1 row)
 
 select rewrite('moscow', 'select keyword, sample from test_tsquery'::text );
@@ -461,9 +461,9 @@ select rewrite('moscow & hotel', 'select keyword, sample from test_tsquery'::tex
 (1 row)
 
 select rewrite('bar &  new & qq & foo & york', 'select keyword, sample from test_tsquery'::text );
-   rewrite   
--
- 'citi' & 'foo' & ( 'bar' | 'qq' ) & ( 'nyc' | ( 'big' & 'appl' | 'new' & 'york' ) )
+ rewrite 
+-
+ ( 'nyc' | 'big' & 'appl' | 'new' & 'york' ) & 'citi' & 'foo' & ( 'bar' | 'qq' )
 (1 row)
 
 select rewrite( ARRAY['moscow', keyword, sample] ) from test_tsquery;
@@ -479,9 +479,9 @@ select rewrite( ARRAY['moscow & hotel', keyword, sample] ) from test_tsquery;
 (1 row)
 
 select rewrite( ARRAY['bar &  new & qq & foo & york', keyword, sample] ) from test_tsquery;
-   rewrite   
--
- 'citi' & 'foo' & ( 'bar' | 'qq' ) & ( 'nyc' | ( 'big' & 'appl' | 'new' & 'york' ) )
+ rewrite 
+-
+ ( 'nyc' | 'big' & 'appl' | 'new' & 'york' ) & 'citi' & 'foo' & ( 'bar' | 'qq' )
 (1 row)
 
 select keyword from test_tsquery where keyword @> 'new';
@@ -520,9 +520,9 @@ select rewrite( ARRAY[query, keyword, sample] ) from test_tsquery, to_tsquery('e
 (1 row)
 
 select rewrite( ARRAY[query, keyword, sample] ) from test_tsquery, to_tsquery('english', 'bar &  new & qq & foo & york') as query where keyword <@ query;
-   rewrite   
--
- 'citi' & 'foo' & ( 'bar' | 'qq' ) & ( 'nyc' | ( 'big' & 'appl' | 'new' & 'york' ) )
+ rewrite 
+-
+ ( 'nyc' | 'big' & 'appl' | 'new' & 'york' ) & 'citi' & 'foo' & ( 'bar' | 'qq' )
 (1 row)
 
 select rewrite( ARRAY[query, keyword, sample] ) from test_tsquery, to_tsquery('english', 'moscow') as query where query @> keyword;
@@ -538,9 +538,9 @@ select rewrite( ARRAY[query, keyword, sample] ) from test_tsquery, to_tsquery('e
 (1 row)
 
 select rewrite( ARRAY[query, keyword, sample] ) from test_tsquery, to_tsquery('english', 'bar &  new & qq & foo & york') as query where query @> keyword;
-   rewrite

Re: [HACKERS] [PATCH] Phrase search ported to 9.6

2016-04-01 Thread Oleg Bartunov
On Fri, Apr 1, 2016 at 5:24 PM, Tom Lane  wrote:

> Alvaro Herrera  writes:
> > Teodor Sigaev wrote:
> >> may be <=>? it isn't used anywhere yet.
> >>
> >> select 'fat'::tsquery <=> 'cat';
> >> select 'fat <=> cat'::tsquery;
> >> select 'fat <3> cat'::tsqyery; -- for non-default distance.
>
> > Dunno.  That looks pretty "relationalish".
>
> The existing precedent in the geometric types is to use <->
> as an operator denoting distance.  Perhaps <-> amd <3>
> would work here.
>
>
looks like we get consensus about arrows.


> regards, tom lane
>


Re: [HACKERS] [PATCH] Phrase search ported to 9.6

2016-04-01 Thread Tom Lane
Alvaro Herrera  writes:
> Teodor Sigaev wrote:
>> may be <=>? it isn't used anywhere yet.
>> 
>> select 'fat'::tsquery <=> 'cat';
>> select 'fat <=> cat'::tsquery;
>> select 'fat <3> cat'::tsqyery; -- for non-default distance.

> Dunno.  That looks pretty "relationalish".

The existing precedent in the geometric types is to use <->
as an operator denoting distance.  Perhaps <-> amd <3>
would work here.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Phrase search ported to 9.6

2016-04-01 Thread Andreas Joseph Krogh
På fredag 01. april 2016 kl. 15:22:55, skrev Teodor Sigaev >:
> there was a character that was very similar to dots I would suggest
 > that.  The closest is * I think, so what do you think of "***"?

 And join opertator for tsqueries is the same :
 select 'fat'::tsquery *** 'cat'; ?

 Single '*' ?  That's close to regex, any number of tokens. And it saves rules
 about duplicating character.

 select 'fat'::tsquery ** 'cat';
 select 'fat * cat'::tsquery;
 select 'fat * [3] cat'::tsqyery; -- for non-default distance.
 
What about ~> ?
 
-- Andreas Joseph Krogh




Re: [HACKERS] [PATCH] Phrase search ported to 9.6

2016-04-01 Thread Teodor Sigaev

there was a character that was very similar to dots I would suggest
that.  The closest is * I think, so what do you think of "***"?


And join opertator for tsqueries is the same :
select 'fat'::tsquery *** 'cat'; ?

Single '*' ?  That's close to regex, any number of tokens. And it saves rules 
about duplicating character.


select 'fat'::tsquery ** 'cat';
select 'fat * cat'::tsquery;
select 'fat * [3] cat'::tsqyery; -- for non-default distance.


--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Phrase search ported to 9.6

2016-04-01 Thread Alvaro Herrera
Teodor Sigaev wrote:
> >Well, I noticed that the docs talk about an operator that can be used in
> >SQL (outside the tsquery parser), as well as an operator that can be
> Just to join 2 tsquery with operator FOLLOWED BY

Ok.

> >used inside tsquery.  Inside tsquery anything would be usable, but
> >outside that it would be good to keep in mind the rest of SQL operators;
> >and <> means "different from", so using it for FOLLOWED BY seems odd to
> >me.
> previous rule was: duplicate operation character to join tsqueries.
> Ellipsis is forbidden to use as operator name:
> #  create operator ... (procedure = 'int2and');
> ERROR:  syntax error at or near ".."
> LINE 1: create operator ... (procedure = 'int2and');
> What is your suggestion for joining operator name?

Silly me ...  The dot is not allowed in operator names :-(  Sorry.  If
there was a character that was very similar to dots I would suggest
that.  The closest is * I think, so what do you think of "***"?

> >My suggestion of ... (ellipsis) is because that's already known as
> >related to text, used for omissions of words, where the surrounding
> >words form a phrase.  It seems much more natural to me.
> Yes, agree for omission. But for reason above its'n a good name although I
> don't have a strong objections.
> 
> may be <=>? it isn't used anywhere yet.
> 
> select 'fat'::tsquery <=> 'cat';
> select 'fat <=> cat'::tsquery;
> select 'fat <3> cat'::tsqyery; -- for non-default distance.

Dunno.  That looks pretty "relationalish".

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Phrase search ported to 9.6

2016-04-01 Thread Teodor Sigaev

Well, I noticed that the docs talk about an operator that can be used in
SQL (outside the tsquery parser), as well as an operator that can be

Just to join 2 tsquery with operator FOLLOWED BY


used inside tsquery.  Inside tsquery anything would be usable, but
outside that it would be good to keep in mind the rest of SQL operators;
and <> means "different from", so using it for FOLLOWED BY seems odd to
me.

previous rule was: duplicate operation character to join tsqueries.
Ellipsis is forbidden to use as operator name:
#  create operator ... (procedure = 'int2and');
ERROR:  syntax error at or near ".."
LINE 1: create operator ... (procedure = 'int2and');
What is your suggestion for joining operator name?


My suggestion of ... (ellipsis) is because that's already known as
related to text, used for omissions of words, where the surrounding
words form a phrase.  It seems much more natural to me.
Yes, agree for omission. But for reason above its'n a good name although I don't 
have a strong objections.


may be <=>? it isn't used anywhere yet.

select 'fat'::tsquery <=> 'cat';
select 'fat <=> cat'::tsquery;
select 'fat <3> cat'::tsqyery; -- for non-default distance.


--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Phrase search ported to 9.6

2016-04-01 Thread Alvaro Herrera
Oleg Bartunov wrote:
> On Thu, Mar 31, 2016 at 9:14 PM, Alvaro Herrera 
> wrote:
> 
> > What led you to choose the ? operator for the FOLLOWED BY semantics?
> > It doesn't seem a terribly natural choice -- most other things seems to
> > use ? as some sort of wildcard.  What about something like "...", so you
> > would do
> >   SELECT q @@ to_tsquery('fatal ... error');
> > and
> >   SELECT q @@ (tsquery 'fatal' ... tsquery 'error');
> >
> >
> originally was $, but then we change it to ?, we don't remember why. During
> warming-up this morning we came to other suggestion
> 
> SELECT q @@ to_tsquery('fatal <> error');
> and
> SELECT q @@ to_tsquery('fatal <2> error');
> 
> How about this ?

Well, I noticed that the docs talk about an operator that can be used in
SQL (outside the tsquery parser), as well as an operator that can be
used inside tsquery.  Inside tsquery anything would be usable, but
outside that it would be good to keep in mind the rest of SQL operators;
and <> means "different from", so using it for FOLLOWED BY seems odd to
me.

My suggestion of ... (ellipsis) is because that's already known as
related to text, used for omissions of words, where the surrounding
words form a phrase.  It seems much more natural to me.

(I also noticed that you can specify a counter together with the
operator, for "at most N words apart", which is not going to work for
the SQL-level operator no matter what you choose.  I suppose that's not
considered a problem)

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Phrase search ported to 9.6

2016-04-01 Thread Oleg Bartunov
On Thu, Mar 31, 2016 at 9:14 PM, Alvaro Herrera 
wrote:

> What led you to choose the ? operator for the FOLLOWED BY semantics?
> It doesn't seem a terribly natural choice -- most other things seems to
> use ? as some sort of wildcard.  What about something like "...", so you
> would do
>   SELECT q @@ to_tsquery('fatal ... error');
> and
>   SELECT q @@ (tsquery 'fatal' ... tsquery 'error');
>
>
originally was $, but then we change it to ?, we don't remember why. During
warming-up this morning we came to other suggestion

SELECT q @@ to_tsquery('fatal <> error');
and
SELECT q @@ to_tsquery('fatal <2> error');

How about this ?



> --
> Álvaro Herrerahttp://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


Re: [HACKERS] [PATCH] Phrase search ported to 9.6

2016-03-31 Thread Alvaro Herrera
What led you to choose the ? operator for the FOLLOWED BY semantics?
It doesn't seem a terribly natural choice -- most other things seems to
use ? as some sort of wildcard.  What about something like "...", so you
would do
  SELECT q @@ to_tsquery('fatal ... error');
and 
  SELECT q @@ (tsquery 'fatal' ... tsquery 'error');

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Phrase search ported to 9.6

2016-03-31 Thread Dmitry Ivanov
Hi Teodor,

I've looked through your version and made a few adjustments.

> Pls, remove tsquery_setweight from patch because it's not a part of phrase
> search and it isn't mentioned in docs. Please, make a separate patch for it.

I've removed tsquery_setweight as you requested. I'm going to add it to the 
following commitfest later.

-- 
Dmitry Ivanov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Companydiff --git a/contrib/tsearch2/expected/tsearch2.out b/contrib/tsearch2/expected/tsearch2.out
index 972f764..97379e7 100644
--- a/contrib/tsearch2/expected/tsearch2.out
+++ b/contrib/tsearch2/expected/tsearch2.out
@@ -278,15 +278,15 @@ SELECT '(!1|2)&3'::tsquery;
 (1 row)
 
 SELECT '1|(2|(4|(5|6)))'::tsquery;
- tsquery 
--
- '1' | ( '2' | ( '4' | ( '5' | '6' ) ) )
+   tsquery   
+-
+ '1' | '2' | '4' | '5' | '6'
 (1 row)
 
 SELECT '1|2|4|5|6'::tsquery;
- tsquery 
--
- ( ( ( '1' | '2' ) | '4' ) | '5' ) | '6'
+   tsquery   
+-
+ '1' | '2' | '4' | '5' | '6'
 (1 row)
 
 SELECT '1&(2&(4&(5&6)))'::tsquery;
@@ -340,7 +340,7 @@ select 'a' > 'b & c'::tsquery;
 select 'a | f' < 'b & c'::tsquery;
  ?column? 
 --
- t
+ f
 (1 row)
 
 select 'a | ff' < 'b & c'::tsquery;
@@ -443,9 +443,9 @@ select count(*) from test_tsquery where keyword >  'new & york';
 
 set enable_seqscan=on;
 select rewrite('foo & bar & qq & new & york',  'new & york'::tsquery, 'big & apple | nyc | new & york & city');
- rewrite  
---
- 'foo' & 'bar' & 'qq' & ( 'city' & 'new' & 'york' | ( 'nyc' | 'big' & 'apple' ) )
+   rewrite
+--
+ 'foo' & 'bar' & 'qq' & ( 'nyc' | 'big' & 'apple' | 'city' & 'new' & 'york' )
 (1 row)
 
 select rewrite('moscow', 'select keyword, sample from test_tsquery'::text );
@@ -461,9 +461,9 @@ select rewrite('moscow & hotel', 'select keyword, sample from test_tsquery'::tex
 (1 row)
 
 select rewrite('bar &  new & qq & foo & york', 'select keyword, sample from test_tsquery'::text );
-   rewrite   
--
- 'citi' & 'foo' & ( 'bar' | 'qq' ) & ( 'nyc' | ( 'big' & 'appl' | 'new' & 'york' ) )
+ rewrite 
+-
+ ( 'nyc' | 'big' & 'appl' | 'new' & 'york' ) & 'citi' & 'foo' & ( 'bar' | 'qq' )
 (1 row)
 
 select rewrite( ARRAY['moscow', keyword, sample] ) from test_tsquery;
@@ -479,9 +479,9 @@ select rewrite( ARRAY['moscow & hotel', keyword, sample] ) from test_tsquery;
 (1 row)
 
 select rewrite( ARRAY['bar &  new & qq & foo & york', keyword, sample] ) from test_tsquery;
-   rewrite   
--
- 'citi' & 'foo' & ( 'bar' | 'qq' ) & ( 'nyc' | ( 'big' & 'appl' | 'new' & 'york' ) )
+ rewrite 
+-
+ ( 'nyc' | 'big' & 'appl' | 'new' & 'york' ) & 'citi' & 'foo' & ( 'bar' | 'qq' )
 (1 row)
 
 select keyword from test_tsquery where keyword @> 'new';
@@ -520,9 +520,9 @@ select rewrite( ARRAY[query, keyword, sample] ) from test_tsquery, to_tsquery('e
 (1 row)
 
 select rewrite( ARRAY[query, keyword, sample] ) from test_tsquery, to_tsquery('english', 'bar &  new & qq & foo & york') as query where keyword <@ query;
-   rewrite   
--
- 'citi' & 'foo' & ( 'bar' | 'qq' ) & ( 'nyc' | ( 'big' & 'appl' | 'new' & 'york' ) )
+ rewrite 
+-
+ ( 'nyc' | 'big' & 'appl' | 'new' & 'york' ) & 'citi' & 'foo' & ( 'bar' | 'qq' )
 (1 row)
 
 select rewrite( ARRAY[query, keyword, sample] ) from test_tsquery, to_tsquery('english', 'moscow') as query where query @> keyword;
@@ -538,9 +538,9 @@ select rewrite( ARRAY[query, keyword, sample] ) from test_tsquery, to_tsquery('e
 (1 row)
 
 select rewrite( ARRAY[query, keyword, sample] ) from test_tsquery, to_tsquery('english', 'bar &  new & qq & foo & york') as query where query @> 

Re: [HACKERS] [PATCH] Phrase search ported to 9.6

2016-03-31 Thread Teodor Sigaev
Looking at patch, I'm inlined to commit it in current commitfest, it looks 
workable and too many people desire it. I've did some changes, mostly in 
formatting and comments.


Patch makes some user-visible changes:
1 change order for tsquery. Assume, it's acceptable, only a few users store 
tsquery in table and have a Btree index. They will need to reindex.

2 less number of parenthesis in tsquery output, and tsquery becomes more 
readable.

Pls, remove tsquery_setweight from patch because it's not a part of phrase 
search and it isn't mentioned in docs. Please, make a separate patch for it.



Dmitry Ivanov wrote:

Sorry for the delay, I desperately needed some time to finish a bunch of
dangling tasks.

I've added some new comments and clarified the ones that were obscure.
Moreover, I felt an urge to recheck most parts of the code since apparently
nobody (besides myself) has gone so far yet.

On 25.03.16 18:42 MSK, David Steele wrote:

Time is short and it's not encouraging that you say there is "still much

work to be done".

Indeed, I was inaccurate. I am more than interested in the positive outcome.






--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/


phrase-0.4.patch
Description: binary/octet-stream

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Phrase search ported to 9.6

2016-03-28 Thread Alexander Korotkov
Hi!

On Sat, Mar 26, 2016 at 12:02 AM, David Steele  wrote:

> On 3/25/16 3:54 PM, Artur Zakirov wrote:
> > On 25.03.2016 21:42, Dmitry Ivanov wrote:
> >> Sorry for the delay, I desperately needed some time to finish a bunch of
> >> dangling tasks.
> >>
> >> I've added some new comments and clarified the ones that were obscure.
> >> Moreover, I felt an urge to recheck most parts of the code since
> >> apparently
> >> nobody (besides myself) has gone so far yet.
> >>
> >> On 25.03.16 18:42 MSK, David Steele wrote:
> >>> Time is short and it's not encouraging that you say there is "still
> much
> >> work to be done".
> >>
> >> Indeed, I was inaccurate. I am more than interested in the positive
> >> outcome.
> >>
> >
> > I tested the patch and take a look on it. All regression tests passed.
> > The code looks good and the patch introduce a great functionality.
> >
> > I think the patch can be marked as "Ready for Commiter". But I do not
> > feel the force to do that myself.
> >
> > Also I agree with you about tsvector_setweight(). There is not a problem
> > with it because this weights are immutable and so there is not benefits
> > from new function.
>
> Ideally Alexander can also look at it.  If not, then you should mark it
> "ready for committer" since I doubt any more reviewers will sign on this
> late in the CF.
>

I've checked the last version of the patch. Patch applies cleanly on
master, builds without warnings, regression tests pass.
The issue I reported about tsquery textual representation is fixed.
I'm going to mark this "Ready for Committer".

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] [PATCH] Phrase search ported to 9.6

2016-03-25 Thread David Steele
On 3/25/16 3:54 PM, Artur Zakirov wrote:
> On 25.03.2016 21:42, Dmitry Ivanov wrote:
>> Sorry for the delay, I desperately needed some time to finish a bunch of
>> dangling tasks.
>>
>> I've added some new comments and clarified the ones that were obscure.
>> Moreover, I felt an urge to recheck most parts of the code since
>> apparently
>> nobody (besides myself) has gone so far yet.
>>
>> On 25.03.16 18:42 MSK, David Steele wrote:
>>> Time is short and it's not encouraging that you say there is "still much
>> work to be done".
>>
>> Indeed, I was inaccurate. I am more than interested in the positive
>> outcome.
>>
> 
> I tested the patch and take a look on it. All regression tests passed.
> The code looks good and the patch introduce a great functionality.
> 
> I think the patch can be marked as "Ready for Commiter". But I do not
> feel the force to do that myself.
> 
> Also I agree with you about tsvector_setweight(). There is not a problem
> with it because this weights are immutable and so there is not benefits
> from new function.

Ideally Alexander can also look at it.  If not, then you should mark it
"ready for committer" since I doubt any more reviewers will sign on this
late in the CF.

-- 
-David
da...@pgmasters.net


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Phrase search ported to 9.6

2016-03-25 Thread Artur Zakirov

On 25.03.2016 21:42, Dmitry Ivanov wrote:

Sorry for the delay, I desperately needed some time to finish a bunch of
dangling tasks.

I've added some new comments and clarified the ones that were obscure.
Moreover, I felt an urge to recheck most parts of the code since apparently
nobody (besides myself) has gone so far yet.

On 25.03.16 18:42 MSK, David Steele wrote:

Time is short and it's not encouraging that you say there is "still much

work to be done".

Indeed, I was inaccurate. I am more than interested in the positive outcome.



Hi,

Thank you for your work!

I tested the patch and take a look on it. All regression tests passed. 
The code looks good and the patch introduce a great functionality.


I think the patch can be marked as "Ready for Commiter". But I do not 
feel the force to do that myself.


Also I agree with you about tsvector_setweight(). There is not a problem 
with it because this weights are immutable and so there is not benefits 
from new function.


--
Artur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Phrase search ported to 9.6

2016-03-25 Thread David Steele
On 3/25/16 2:42 PM, Dmitry Ivanov wrote:
> Sorry for the delay, I desperately needed some time to finish a bunch of 
> dangling tasks.
> 
> I've added some new comments and clarified the ones that were obscure. 
> Moreover, I felt an urge to recheck most parts of the code since apparently 
> nobody (besides myself) has gone so far yet.
> 
> On 25.03.16 18:42 MSK, David Steele wrote:
>> Time is short and it's not encouraging that you say there is "still much 
> work to be done".
> 
> Indeed, I was inaccurate. I am more than interested in the positive outcome.

Me, too!  I have set this to "needs review".

-- 
-David
da...@pgmasters.net


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Phrase search ported to 9.6

2016-03-25 Thread Dmitry Ivanov
Sorry for the delay, I desperately needed some time to finish a bunch of 
dangling tasks.

I've added some new comments and clarified the ones that were obscure. 
Moreover, I felt an urge to recheck most parts of the code since apparently 
nobody (besides myself) has gone so far yet.

On 25.03.16 18:42 MSK, David Steele wrote:
> Time is short and it's not encouraging that you say there is "still much 
work to be done".

Indeed, I was inaccurate. I am more than interested in the positive outcome.

-- 
Dmitry Ivanov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Companydiff --git a/contrib/tsearch2/expected/tsearch2.out b/contrib/tsearch2/expected/tsearch2.out
index 972f764..97379e7 100644
--- a/contrib/tsearch2/expected/tsearch2.out
+++ b/contrib/tsearch2/expected/tsearch2.out
@@ -278,15 +278,15 @@ SELECT '(!1|2)&3'::tsquery;
 (1 row)
 
 SELECT '1|(2|(4|(5|6)))'::tsquery;
- tsquery 
--
- '1' | ( '2' | ( '4' | ( '5' | '6' ) ) )
+   tsquery   
+-
+ '1' | '2' | '4' | '5' | '6'
 (1 row)
 
 SELECT '1|2|4|5|6'::tsquery;
- tsquery 
--
- ( ( ( '1' | '2' ) | '4' ) | '5' ) | '6'
+   tsquery   
+-
+ '1' | '2' | '4' | '5' | '6'
 (1 row)
 
 SELECT '1&(2&(4&(5&6)))'::tsquery;
@@ -340,7 +340,7 @@ select 'a' > 'b & c'::tsquery;
 select 'a | f' < 'b & c'::tsquery;
  ?column? 
 --
- t
+ f
 (1 row)
 
 select 'a | ff' < 'b & c'::tsquery;
@@ -443,9 +443,9 @@ select count(*) from test_tsquery where keyword >  'new & york';
 
 set enable_seqscan=on;
 select rewrite('foo & bar & qq & new & york',  'new & york'::tsquery, 'big & apple | nyc | new & york & city');
- rewrite  
---
- 'foo' & 'bar' & 'qq' & ( 'city' & 'new' & 'york' | ( 'nyc' | 'big' & 'apple' ) )
+   rewrite
+--
+ 'foo' & 'bar' & 'qq' & ( 'nyc' | 'big' & 'apple' | 'city' & 'new' & 'york' )
 (1 row)
 
 select rewrite('moscow', 'select keyword, sample from test_tsquery'::text );
@@ -461,9 +461,9 @@ select rewrite('moscow & hotel', 'select keyword, sample from test_tsquery'::tex
 (1 row)
 
 select rewrite('bar &  new & qq & foo & york', 'select keyword, sample from test_tsquery'::text );
-   rewrite   
--
- 'citi' & 'foo' & ( 'bar' | 'qq' ) & ( 'nyc' | ( 'big' & 'appl' | 'new' & 'york' ) )
+ rewrite 
+-
+ ( 'nyc' | 'big' & 'appl' | 'new' & 'york' ) & 'citi' & 'foo' & ( 'bar' | 'qq' )
 (1 row)
 
 select rewrite( ARRAY['moscow', keyword, sample] ) from test_tsquery;
@@ -479,9 +479,9 @@ select rewrite( ARRAY['moscow & hotel', keyword, sample] ) from test_tsquery;
 (1 row)
 
 select rewrite( ARRAY['bar &  new & qq & foo & york', keyword, sample] ) from test_tsquery;
-   rewrite   
--
- 'citi' & 'foo' & ( 'bar' | 'qq' ) & ( 'nyc' | ( 'big' & 'appl' | 'new' & 'york' ) )
+ rewrite 
+-
+ ( 'nyc' | 'big' & 'appl' | 'new' & 'york' ) & 'citi' & 'foo' & ( 'bar' | 'qq' )
 (1 row)
 
 select keyword from test_tsquery where keyword @> 'new';
@@ -520,9 +520,9 @@ select rewrite( ARRAY[query, keyword, sample] ) from test_tsquery, to_tsquery('e
 (1 row)
 
 select rewrite( ARRAY[query, keyword, sample] ) from test_tsquery, to_tsquery('english', 'bar &  new & qq & foo & york') as query where keyword <@ query;
-   rewrite   
--
- 'citi' & 'foo' & ( 'bar' | 'qq' ) & ( 'nyc' | ( 'big' & 'appl' | 'new' & 'york' ) )
+ rewrite 
+-
+ ( 'nyc' | 'big' & 'appl' | 'new' & 'york' ) & 'citi' & 'foo' & ( 'bar' | 'qq' )
 (1 row)
 
 select rewrite( ARRAY[query, keyword, sample] ) from test_tsquery, to_tsquery('english', 'moscow') as query where query @> keyword;
@@ -538,9 +538,9 @@ select rewrite( ARRAY[query, keyword, sample] ) from test_tsquery, 

Re: [HACKERS] [PATCH] Phrase search ported to 9.6

2016-03-25 Thread Alexander Korotkov
On Fri, Mar 25, 2016 at 6:42 PM, David Steele  wrote:

> On 3/16/16 12:38 PM, Dmitry Ivanov wrote:
>
> I've made an attempt to fix some of the issues you've listed, although
>> there's
>> still much work to be done. I'll add some comments later.
>>
>
> Do you know when you'll have a chance to respond to reviews and provide a
> new patch?
>
> Time is short and it's not encouraging that you say there is "still much
> work to be done".  Perhaps it would be best to mark this "returned with
> feedback"?
>

Dmitry will provide a new version of patch today.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] [PATCH] Phrase search ported to 9.6

2016-03-25 Thread David Steele

Hi Dmitry,

On 3/16/16 12:38 PM, Dmitry Ivanov wrote:


I've made an attempt to fix some of the issues you've listed, although there's
still much work to be done. I'll add some comments later.


Do you know when you'll have a chance to respond to reviews and provide 
a new patch?


Time is short and it's not encouraging that you say there is "still much 
work to be done".  Perhaps it would be best to mark this "returned with 
feedback"?


Thanks,
--
-David
da...@pgmasters.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Phrase search ported to 9.6

2016-03-21 Thread Alexander Korotkov
Hi!

I see that patch changes existing regression tests in tsearch2.out.

*** a/contrib/tsearch2/expected/tsearch2.out
--- b/contrib/tsearch2/expected/tsearch2.out
*** SELECT '(!1|2)&3'::tsquery;
*** 278,292 
  (1 row)

  SELECT '1|(2|(4|(5|6)))'::tsquery;
!  tsquery
! -
!  '1' | ( '2' | ( '4' | ( '5' | '6' ) ) )
  (1 row)

  SELECT '1|2|4|5|6'::tsquery;
!  tsquery
! -
!  ( ( ( '1' | '2' ) | '4' ) | '5' ) | '6'
  (1 row)

  SELECT '1&(2&(4&(5&6)))'::tsquery;
--- 278,292 
  (1 row)

  SELECT '1|(2|(4|(5|6)))'::tsquery;
!tsquery
! -
!  '1' | '2' | '4' | '5' | '6'
  (1 row)

  SELECT '1|2|4|5|6'::tsquery;
!tsquery
! -
!  '1' | '2' | '4' | '5' | '6'
  (1 row)


This change looks like improvement, without braces tsquery readability is
much better.

*** select rewrite('moscow & hotel', 'select
*** 461,469 
  (1 row)

  select rewrite('bar &  new & qq & foo & york', 'select keyword, sample
from test_tsquery'::text );
!rewrite
!
-
!  'citi' & 'foo' & ( 'bar' | 'qq' ) & ( 'nyc' | ( 'big' & 'appl' | 'new' &
'york' ) )
  (1 row)

  select rewrite( ARRAY['moscow', keyword, sample] ) from test_tsquery;
--- 461,469 
  (1 row)

  select rewrite('bar &  new & qq & foo & york', 'select keyword, sample
from test_tsquery'::text );
!  rewrite
!
-
!  ( 'nyc' | 'big' & 'appl' | 'new' & 'york' ) & 'citi' & 'foo' & ( 'bar' |
'qq' )
  (1 row)

  select rewrite( ARRAY['moscow', keyword, sample] ) from test_tsquery;

However, such reorderings look unclear and need motivation.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] [PATCH] Phrase search ported to 9.6

2016-03-21 Thread Artur Zakirov
I tried to find some bugs in the code. I can't find them. But it does 
not mean that there are not bugs.


Still there are a lack of comments and trailing whitespaces.

On 16.03.2016 19:38, Dmitry Ivanov wrote:

Hi, Artur

I've made an attempt to fix some of the issues you've listed, although there's
still much work to be done. I'll add some comments later.


This function has the duplicated piece from the tsvector_setweight()
from tsvector_op.c. You can define new function for it.


I'm not sure it's worth the trouble. IMO these functions are relatively small
and we won't benefit from extracting the duplicate code.


I think that a separate function would be better. These weights are 
occurred in other functions. But I might be wrong and maybe I cavil at 
the code too much.





These functions was removed in 9acb9007de30b3daaa9efc16763c3bc6e3e0a92d.
It seems that tsvector_op.c was not synchronized with the master.


Got it, thanks!




--
Artur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Phrase search ported to 9.6

2016-03-20 Thread David Steele
On 2/29/16 10:33 AM, Artur Zakirov wrote:

> Conclusion
> --
> 
> This patch is large and it needs more research. I will be reviewing it
> and will give another notes later.

This thread has not had a response from the authors or new patch in more
than two weeks.  Please post something soon or it will be marked
"returned with feedback".

-- 
-David
da...@pgmasters.net


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Phrase search ported to 9.6

2016-03-19 Thread Dmitry Ivanov
Hi, Artur

I've made an attempt to fix some of the issues you've listed, although there's 
still much work to be done. I'll add some comments later.

> This function has the duplicated piece from the tsvector_setweight()
> from tsvector_op.c. You can define new function for it.

I'm not sure it's worth the trouble. IMO these functions are relatively small 
and we won't benefit from extracting the duplicate code.

> These functions was removed in 9acb9007de30b3daaa9efc16763c3bc6e3e0a92d.
> It seems that tsvector_op.c was not synchronized with the master.

Got it, thanks!

-- 
Dmitry Ivanov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
diff --git a/contrib/tsearch2/expected/tsearch2.out b/contrib/tsearch2/expected/tsearch2.out
index 972f764..97379e7 100644
--- a/contrib/tsearch2/expected/tsearch2.out
+++ b/contrib/tsearch2/expected/tsearch2.out
@@ -278,15 +278,15 @@ SELECT '(!1|2)&3'::tsquery;
 (1 row)
 
 SELECT '1|(2|(4|(5|6)))'::tsquery;
- tsquery 
--
- '1' | ( '2' | ( '4' | ( '5' | '6' ) ) )
+   tsquery   
+-
+ '1' | '2' | '4' | '5' | '6'
 (1 row)
 
 SELECT '1|2|4|5|6'::tsquery;
- tsquery 
--
- ( ( ( '1' | '2' ) | '4' ) | '5' ) | '6'
+   tsquery   
+-
+ '1' | '2' | '4' | '5' | '6'
 (1 row)
 
 SELECT '1&(2&(4&(5&6)))'::tsquery;
@@ -340,7 +340,7 @@ select 'a' > 'b & c'::tsquery;
 select 'a | f' < 'b & c'::tsquery;
  ?column? 
 --
- t
+ f
 (1 row)
 
 select 'a | ff' < 'b & c'::tsquery;
@@ -443,9 +443,9 @@ select count(*) from test_tsquery where keyword >  'new & york';
 
 set enable_seqscan=on;
 select rewrite('foo & bar & qq & new & york',  'new & york'::tsquery, 'big & apple | nyc | new & york & city');
- rewrite  
---
- 'foo' & 'bar' & 'qq' & ( 'city' & 'new' & 'york' | ( 'nyc' | 'big' & 'apple' ) )
+   rewrite
+--
+ 'foo' & 'bar' & 'qq' & ( 'nyc' | 'big' & 'apple' | 'city' & 'new' & 'york' )
 (1 row)
 
 select rewrite('moscow', 'select keyword, sample from test_tsquery'::text );
@@ -461,9 +461,9 @@ select rewrite('moscow & hotel', 'select keyword, sample from test_tsquery'::tex
 (1 row)
 
 select rewrite('bar &  new & qq & foo & york', 'select keyword, sample from test_tsquery'::text );
-   rewrite   
--
- 'citi' & 'foo' & ( 'bar' | 'qq' ) & ( 'nyc' | ( 'big' & 'appl' | 'new' & 'york' ) )
+ rewrite 
+-
+ ( 'nyc' | 'big' & 'appl' | 'new' & 'york' ) & 'citi' & 'foo' & ( 'bar' | 'qq' )
 (1 row)
 
 select rewrite( ARRAY['moscow', keyword, sample] ) from test_tsquery;
@@ -479,9 +479,9 @@ select rewrite( ARRAY['moscow & hotel', keyword, sample] ) from test_tsquery;
 (1 row)
 
 select rewrite( ARRAY['bar &  new & qq & foo & york', keyword, sample] ) from test_tsquery;
-   rewrite   
--
- 'citi' & 'foo' & ( 'bar' | 'qq' ) & ( 'nyc' | ( 'big' & 'appl' | 'new' & 'york' ) )
+ rewrite 
+-
+ ( 'nyc' | 'big' & 'appl' | 'new' & 'york' ) & 'citi' & 'foo' & ( 'bar' | 'qq' )
 (1 row)
 
 select keyword from test_tsquery where keyword @> 'new';
@@ -520,9 +520,9 @@ select rewrite( ARRAY[query, keyword, sample] ) from test_tsquery, to_tsquery('e
 (1 row)
 
 select rewrite( ARRAY[query, keyword, sample] ) from test_tsquery, to_tsquery('english', 'bar &  new & qq & foo & york') as query where keyword <@ query;
-   rewrite   
--
- 'citi' & 'foo' & ( 'bar' | 'qq' ) & ( 'nyc' | ( 'big' & 'appl' | 'new' & 'york' ) )
+ rewrite 
+-
+ ( 'nyc' | 'big' & 'appl' | 'new' & 'york' ) & 'citi' & 'foo' & ( 'bar' | 'qq' )
 (1 row)
 
 select rewrite( ARRAY[query, keyword, sample] ) from test_tsquery, to_tsquery('english', 'moscow') as query where query @> keyword;
@@ -538,9 +538,9 @@ 

Re: [HACKERS] [PATCH] Phrase search ported to 9.6

2016-02-29 Thread Artur Zakirov

Hello, Dmitry

This is my initial review for you patch. Below are my comments.

Introduction


This patch introduce new operator and new functions.
New operator:
- ??
New functions:
- phraseto_tsquery([ config regconfig , ] query text)
- setweight(tsquery, "char")
- tsquery_phrase(query1 tsquery, query2 tsquery)
- tsquery_phrase(query1 tsquery, query2 tsquery, distance integer)

New regression tests are included in the patch. Information about new 
features is provided in the documentation.


Initial Run
---

The patch applies correctly to HEAD. Regression tests pass successfully, 
without errors. It seems that the patch work as you expected.


Performance
---

I have not tested possible performance issues yet. Maybe later I will 
prepare some data to test performance.


Coding
--

I agree with others that there is a lack of comments for new functions. 
Most of them without comments.



../pg_patches/phrase_search.patch:1086: trailing whitespace.
 * QI_VALSTOP nodes should be cleaned and
../pg_patches/phrase_search.patch:1124: trailing whitespace.
if (priority < parentPriority)
../pg_patches/phrase_search.patch:1140: trailing whitespace.
if (priority < parentPriority)
../pg_patches/phrase_search.patch:1591: space before tab in indent.
/*  (a|b) ? c => (a?c) | (b?c) */
../pg_patches/phrase_search.patch:1603: space before tab in indent.
/*  !a ? b=> b & !(a?b) */


It is git apply output. There are trailing whitespaces in the code.


+typedef struct MorphOpaque
+{
+   Oid cfg_id;
+   int qoperator;  /* query operator */
+} MorphOpaque;


Maybe you should move structure definition to the top of the to_tsany.c


+   pushValue(state,
+ prs.words[count].word,
+ prs.words[count].len,
+ weight,
+ ((prs.words[count].flags 
& TSL_PREFIX) || prefix) ?
+   true :
+   false);


There is not need in ternary operator here. You can write:

pushValue(state,
  prs.words[count].word,
  prs.words[count].len,
  weight,
  (prs.words[count].flags & TSL_PREFIX) || prefix);


 /*
+ * Wrapper of check condition function for TS_execute.
+ * We are using the fact GIN_FALSE = 0 and GIN_MAYBE state
+ * should be treated as true, so, any non-zero value should be
+ * converted to boolean true.
+ */
+static bool
+checkcondition_gin(void *checkval, QueryOperand *val, ExecPhraseData *data)
+{
+   return !!checkcondition_gin_internal((GinChkVal *) checkval, val, data);
+}


Maybe write like this?

static bool
checkcondition_gin(void *checkval, QueryOperand *val, ExecPhraseData *data)
{
	return checkcondition_gin_internal((GinChkVal *) checkval, val, data) 
!= GIN_FALSE;

}

Then you don't need in the comment above of the function.


+#define PRINT_PRIORITY(x)  ( (((QueryOperator*)(x))->oper == OP_NOT) ? 5 : 
QO_PRIORITY(x) )


What is mean "5" here? You can define a new value near the:

#define OP_OR   1
#define OP_AND  2
#define OP_NOT  3
#define OP_PHRASE   4


+Datum
+tsquery_setweight(PG_FUNCTION_ARGS)
+{
+   TSQuery in = PG_GETARG_TSQUERY(0);
+   charcw = PG_GETARG_CHAR(1);
+   TSQuery out;
+   QueryItem  *item;
+   int w = 0;
+
+   switch (cw)
+   {
+   case 'A':
+   case 'a':
+   w = 3;
+   break;
+   case 'B':
+   case 'b':
+   w = 2;
+   break;
+   case 'C':
+   case 'c':
+   w = 1;
+   break;
+   case 'D':
+   case 'd':
+   w = 0;
+   break;
+   default:
+   /* internal error */
+   elog(ERROR, "unrecognized weight: %d", cw);
+   }
+
+   out = (TSQuery) palloc(VARSIZE(in));
+   memcpy(out, in, VARSIZE(in));
+   item = GETQUERY(out);
+
+   while(item - GETQUERY(out) < out->size)   
+   {
+   if (item->type == QI_VAL)
+   item->qoperand.weight |= (1 << w);
+
+   item++;
+   }
+
+   PG_FREE_IF_COPY(in, 0);
+   PG_RETURN_POINTER(out);
+}


This function has the duplicated piece from the tsvector_setweight() 
from tsvector_op.c. You can define new function for it.



+/*
+ * Check if datatype is the specified 

Re: [HACKERS] [PATCH] Phrase search ported to 9.6

2016-02-02 Thread Robert Haas
On Tue, Feb 2, 2016 at 6:04 AM, Alvaro Herrera  wrote:
> Andreas Joseph Krogh wrote:
>> Which seems to indicate it has received a fair amount of testing and is quite
>> stable.
>> Hopefully it integrates into the 9.6 codebase without too much risk.
>
> Yes, yes, that's all very good, but we're nearing the closure of the 9.6
> development cycle and we only have one commitfest left.  If someone had
> lots of community brownie points because of doing lots of reviews of
> other people's patches, they might push their luck by posting this patch
> to the final commitfest.  But if that someone didn't, then it wouldn't
> be fair, and if I were the commitfest manager of that commitfest I would
> boot their patch to the 9.7-First commitfest.

Completely agreed.  Also, to be frank, these text search patches are
often in need of quite a LOT of work per line compared to some others.
For example, consider the percentage of this patch that is comments.
It's a pretty low percentage.  And it's going into surrounding code
that is also very low in comments.  Giving this a good review will
take somebody a lot of time, and bringing it up to PostgreSQL's normal
standards will probably take quite a lot of work.  I don't understand
why the community should agree to do that for anyone, and as you say,
it's not like PostgresPro is leading the pack in terms of review
contributions.  We're not going to get this release out on time if
everybody insists that every patch has to go in.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Phrase search ported to 9.6

2016-02-02 Thread Oleg Bartunov
On Tue, Feb 2, 2016 at 10:18 AM, Andreas Joseph Krogh 
wrote:

> På tirsdag 02. februar 2016 kl. 04:22:57, skrev Michael Paquier <
> michael.paqu...@gmail.com>:
>
>
>
> On Mon, Feb 1, 2016 at 8:21 PM, Dmitry Ivanov 
> wrote:
>>
>> This patch was originally developed by Teodor Sigaev and Oleg Bartunov in
>> 2009, so all credit goes to them. Any feedback is welcome.
>
> This is not a small patch:
> 28 files changed, 2441 insertions(+), 380 deletions(-)
> And the last CF of 9.6 should not contain rather large patches.
> --
> Michael
>
>
>
> OTOH; It would be extremely nice to get this into 9.6.
>

will see how community decided.
anyway, it's already in our distribution.



>
>
> --
> *Andreas Joseph Krogh*
>


Re: [HACKERS] [PATCH] Phrase search ported to 9.6

2016-02-02 Thread Andreas Joseph Krogh
På tirsdag 02. februar 2016 kl. 09:20:06, skrev Oleg Bartunov <
obartu...@gmail.com >:
    On Tue, Feb 2, 2016 at 10:18 AM, Andreas Joseph Krogh > wrote: På tirsdag 02. februar 2016 kl. 04:22:57, 
skrev Michael Paquier >:
    On Mon, Feb 1, 2016 at 8:21 PM, Dmitry Ivanov > wrote: This patch was originally developed by 
Teodor Sigaev and Oleg Bartunov in
 2009, so all credit goes to them. Any feedback is welcome. 

This is not a small patch:
 28 files changed, 2441 insertions(+), 380 deletions(-)
And the last CF of 9.6 should not contain rather large patches.
-- Michael


 
 


OTOH; It would be extremely nice to get this into 9.6.
 
will see how community decided.
anyway, it's already in our distribution.



 
 
Which seems to indicate it has received a fair amount of testing and is quite 
stable.
Hopefully it integrates into the 9.6 codebase without too much risk.
Thanks for contributing this.
 
-- Andreas Joseph Krogh




Re: [HACKERS] [PATCH] Phrase search ported to 9.6

2016-02-02 Thread Andreas Joseph Krogh
På tirsdag 02. februar 2016 kl. 12:04:21, skrev Alvaro Herrera <
alvhe...@2ndquadrant.com >:
Andreas Joseph Krogh wrote:
   
 > Which seems to indicate it has received a fair amount of testing and is 
quite
 > stable.
 > Hopefully it integrates into the 9.6 codebase without too much risk.

 Yes, yes, that's all very good, but we're nearing the closure of the 9.6
 development cycle and we only have one commitfest left.  If someone had
 lots of community brownie points because of doing lots of reviews of
 other people's patches, they might push their luck by posting this patch
 to the final commitfest.  But if that someone didn't, then it wouldn't
 be fair, and if I were the commitfest manager of that commitfest I would
 boot their patch to the 9.7-First commitfest.

 The current commitfest which I'm trying to close still has 24 patches in
 needs-review state and 11 patches ready-for-committer; the next one (not
 closed yet) has 40 patches that will need review.  That means a total of
 75 patches, and those should all be processed ahead of this one.  The
 effort needed to process each of those patches is not trivial, and I'm
 sorry I have to say this but I don't see PostgresPro contributing enough
 reviews, even though I pinged a number of people there, so putting one
 more patch on the rest of the community's shoulders doesn't seem fair to
 me.

 Everybody has their favorite patch that they want in the next release,
 but we only have so much manpower to review and integrate those patches.
 All review help is welcome.
 
I understand completely.
 
-- Andreas Joseph Krogh




Re: [HACKERS] [PATCH] Phrase search ported to 9.6

2016-02-02 Thread Alvaro Herrera
Andreas Joseph Krogh wrote:
  
> Which seems to indicate it has received a fair amount of testing and is quite 
> stable.
> Hopefully it integrates into the 9.6 codebase without too much risk.

Yes, yes, that's all very good, but we're nearing the closure of the 9.6
development cycle and we only have one commitfest left.  If someone had
lots of community brownie points because of doing lots of reviews of
other people's patches, they might push their luck by posting this patch
to the final commitfest.  But if that someone didn't, then it wouldn't
be fair, and if I were the commitfest manager of that commitfest I would
boot their patch to the 9.7-First commitfest.

The current commitfest which I'm trying to close still has 24 patches in
needs-review state and 11 patches ready-for-committer; the next one (not
closed yet) has 40 patches that will need review.  That means a total of
75 patches, and those should all be processed ahead of this one.  The
effort needed to process each of those patches is not trivial, and I'm
sorry I have to say this but I don't see PostgresPro contributing enough
reviews, even though I pinged a number of people there, so putting one
more patch on the rest of the community's shoulders doesn't seem fair to
me.

Everybody has their favorite patch that they want in the next release,
but we only have so much manpower to review and integrate those patches.
All review help is welcome.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Phrase search ported to 9.6

2016-02-02 Thread Oleg Bartunov
On Tue, Feb 2, 2016 at 2:04 PM, Alvaro Herrera 
wrote:

> Andreas Joseph Krogh wrote:
>
> > Which seems to indicate it has received a fair amount of testing and is
> quite
> > stable.
> > Hopefully it integrates into the 9.6 codebase without too much risk.
>
> Yes, yes, that's all very good, but we're nearing the closure of the 9.6
> development cycle and we only have one commitfest left.  If someone had
> lots of community brownie points because of doing lots of reviews of
> other people's patches, they might push their luck by posting this patch
> to the final commitfest.  But if that someone didn't, then it wouldn't
> be fair, and if I were the commitfest manager of that commitfest I would
> boot their patch to the 9.7-First commitfest.
>
> The current commitfest which I'm trying to close still has 24 patches in
> needs-review state and 11 patches ready-for-committer; the next one (not
> closed yet) has 40 patches that will need review.  That means a total of
> 75 patches, and those should all be processed ahead of this one.  The
> effort needed to process each of those patches is not trivial, and I'm
> sorry I have to say this but I don't see PostgresPro contributing enough
> reviews, even though I pinged a number of people there, so putting one
> more patch on the rest of the community's shoulders doesn't seem fair to
> me.
>

I'll talk about this.


>
> Everybody has their favorite patch that they want in the next release,
> but we only have so much manpower to review and integrate those patches.
> All review help is welcome.
>
> --
> Álvaro Herrerahttp://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


[HACKERS] [PATCH] Phrase search ported to 9.6

2016-02-01 Thread Dmitry Ivanov
Hi Hackers,

Although PostgreSQL is capable of performing some FTS (full text search) 
queries, there's still a room for improvement. Phrase search support could 
become a great addition to the existing set of features.


Introduction


It is no secret that one can make Google search for an exact phrase (instead 
of an unordered lexeme set) simply by enclosing it within double quotes. This 
is a really nice feature which helps to save the time that would otherwise be 
spent on annoying result filtering.

One weak spot of the current FTS implementation is that there is no way to 
specify the desired lexeme order (since it would not make any difference at 
all). In the end, the search engine will look for each lexeme individually, 
which means that a hypothetical end user would have to discard documents not 
including search phrase all by himself. This problem is solved by the patch 
below (should apply cleanly to 61ce1e8f1).


Problem description
===

The problem comes from the lack of lexeme ordering operator. Consider the 
following example:

select q @@ plainto_tsquery('fatal error') from 
unnest(array[to_tsvector('fatal error'), to_tsvector('error is not fatal')]) 
as q;
?column?
--
t
t
(2 rows)

Clearly the latter match is not the best result in case we wanted to find 
exactly the "fatal error" phrase. That's when the need for a lexeme ordering 
operator arises:

select q @@ to_tsquery('fatal ? error') from unnest(array[to_tsvector('fatal 
error'), to_tsvector('error is not fatal')]) as q;
?column?
--
t
f
(2 rows)


Implementation
==

The ? (FOLLOWED BY) binary operator takes form of "?" or "?[N]" where 0 <= N < 
~16K. If N is provided, the distance between left and right operands must be 
no greater that N. For example:

select to_tsvector('postgres has taken severe damage') @@ to_tsquery('postgres 
? (severe ? damage)');
 ?column?
--
 f
(1 row)

select to_tsvector('postgres has taken severe damage') @@ to_tsquery('postgres 
?[4] (severe ? damage)');
 ?column?
--
 t
(1 row)

New function phraseto_tsquery([ regconfig, ] text) takes advantage of the "?
[N]" operator in order to facilitate phrase search:

select to_tsvector('postgres has taken severe damage') @@ 
phraseto_tsquery('severely damaged');
 ?column?
--
 t
(1 row)


This patch was originally developed by Teodor Sigaev and Oleg Bartunov in 
2009, so all credit goes to them. Any feedback is welcome.


-- 
Dmitry Ivanov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Companydiff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml
index f70cfe7..9b0a703 100644
--- a/doc/src/sgml/datatype.sgml
+++ b/doc/src/sgml/datatype.sgml
@@ -3925,8 +3925,9 @@ SELECT to_tsvector('english', 'The Fat Rats');
 
  A tsquery value stores lexemes that are to be
  searched for, and combines them honoring the Boolean operators
-  (AND), | (OR), and
- ! (NOT).  Parentheses can be used to enforce grouping
+  (AND), | (OR),
+ ! (NOT) and ? (FOLLOWED BY) phrase search
+ operator.  Parentheses can be used to enforce grouping
  of the operators:
 
 
@@ -3947,8 +3948,8 @@ SELECT 'fat  rat  ! cat'::tsquery;
 
 
  In the absence of parentheses, ! (NOT) binds most tightly,
- and  (AND) binds more tightly than
- | (OR).
+ and  (AND) and ? (FOLLOWED BY)
+ both bind more tightly than | (OR).
 
 
 
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 139aa2b..7585f6c 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -9060,6 +9060,12 @@ CREATE TYPE rainbow AS ENUM ('red', 'orange', 'yellow', 'green', 'blue', 'purple
 !'cat'


+ ?? 
+tsquery followed by tsquery
+to_tsquery('fat') ?? to_tsquery('rat')
+'fat' ? 'rat'
+   
+   
  @ 
 tsquery contains another ?
 'cat'::tsquery @ 'cat  rat'::tsquery
@@ -9154,6 +9160,18 @@ CREATE TYPE rainbow AS ENUM ('red', 'orange', 'yellow', 'green', 'blue', 'purple

 
  
+  phraseto_tsquery
+ 
+ phraseto_tsquery( config regconfig ,  query text)
+
+tsquery
+produce tsquery ignoring punctuation
+phraseto_tsquery('english', 'The Fat Rats')
+'fat' ? 'rat'
+   
+   
+
+ 
   querytree
  
  querytree(query tsquery)
@@ -9177,6 +9195,15 @@ CREATE TYPE rainbow AS ENUM ('red', 'orange', 'yellow', 'green', 'blue', 'purple


 
+ setweight(tsquery, "char")
+
+tsquery
+add weight to each element of tsquery
+setweight('fat ? cat  rat:B'::tsquery, 'A')
+( 'fat':A ? 'cat':A )  'rat':AB
+   
+   
+
  
   strip
  
@@ -9269,6 +9296,27 @@ CREATE TYPE rainbow AS ENUM ('red', 'orange', 'yellow', 'green', 'blue', 'purple

 
  

Re: [HACKERS] [PATCH] Phrase search ported to 9.6

2016-02-01 Thread Andreas Joseph Krogh
På tirsdag 02. februar 2016 kl. 04:22:57, skrev Michael Paquier <
michael.paqu...@gmail.com >:
    On Mon, Feb 1, 2016 at 8:21 PM, Dmitry Ivanov > wrote: This patch was originally developed by 
Teodor Sigaev and Oleg Bartunov in
 2009, so all credit goes to them. Any feedback is welcome. 

This is not a small patch:
 28 files changed, 2441 insertions(+), 380 deletions(-)
And the last CF of 9.6 should not contain rather large patches.
-- Michael


 
 
OTOH; It would be extremely nice to get this into 9.6.
 
 
-- Andreas Joseph Krogh




Re: [HACKERS] [PATCH] Phrase search ported to 9.6

2016-02-01 Thread Michael Paquier
On Mon, Feb 1, 2016 at 8:21 PM, Dmitry Ivanov 
wrote:

> This patch was originally developed by Teodor Sigaev and Oleg Bartunov in
> 2009, so all credit goes to them. Any feedback is welcome.
>

This is not a small patch:
28 files changed, 2441 insertions(+), 380 deletions(-)
And the last CF of 9.6 should not contain rather large patches.
-- 
Michael