Re: [HACKERS] new function for tsquery creartion

2017-10-13 Thread Alexey Chernyshov
Hi all,
I am extending phrase operator  is such way that it will have <n,m>
syntax that means from n to m words, so I will use such syntax (<n,m>)
further. I found that a AROUND(N) b is exactly the same as a <-N,N> b
and it can be replaced while parsing. So, what do you think of such
idea? In this patch I have noticed some unobvious behavior.

# select to_tsvector('Hello, cat world!') @@ queryto_tsquery('cat
AROUND(1) cat') as match;
match 
---
 t

cat AROUND(1) cat is the same is "cat <1> cat || cat <0> cat" and:

# select to_tsvector('Hello, cat world!') @@ to_tsquery('cat <0> cat');
 ?column? 
---
 t

It seems to be a proper logic behavior but it is a possible pitfall,
maybe it should be documented?

But more important question is how AROUND() operator should handle stop
words? Now it works as:

# select queryto_tsquery('cat <2> (a AROUND(10) rat)');
 queryto_tsquery  
--
 'cat' <12> 'rat'
(1 row)

# select queryto_tsquery('cat <2> a AROUND(10) rat');
queryto_tsquery 

 'cat' AROUND(12) 'rat'
(1 row)

In my opinion it should be like:
cat <2> (a AROUND(10) rat) == cat <2,2> (a <-10,10> rat) == cat <-8,12>
rat 

cat <2> a AROUND(10) rat == cat <2,2> a <-10,10> rat = cat <-8, 12>
rat

Now <n,m> operator can be replaced with combination of phrase
operator , AROUND(), and logical operators, but with <n,m> operator
it will be much painless. Correct me, please, if I am wrong.

-- 
Alexey Chernyshov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Add citext_pattern_ops to citext contrib module

2017-09-14 Thread Alexey Chernyshov
On Tue, 12 Sep 2017 12:59:20 -0400
Tom Lane <t...@sss.pgh.pa.us> wrote:

> Quick comment on this patch: recently, we've decided that having
> patches replace the whole base script for an extension is too much of
> a maintenance problem, especially when there are several patches in
> the pipeline for the same contrib module.  The new style is to
> provide only a version update script (which you'd have to write
> anyway), and then rely on CREATE EXTENSION to apply the old base
> script plus the update(s). You can see some examples in the patch I
> just posted at
> 
> https://www.postgresql.org/message-id/24721.1505229...@sss.pgh.pa.us
> 
> Also, since that patch is probably going to get committed pretty
> soon, you could reformulate your patch as an add-on to its
> citext--1.4--1.5.sql script.  We don't really need to have a separate
> version of the extension for states that are intermediate between two
> PG major releases.  Only if your patch doesn't get in by v11 freeze
> would you need to make it a separate citext--1.5--1.6.sql script.
> 
>   regards, tom lane

Accented characters and different length strings tests added.
Since patch
(ttps://www.postgresql.org/message-id/24721.1505229...@sss.pgh.pa.us)
is committed, I changed the patch as you said. Thanks for your notes.
Do we need expected/citext.out? It seems that only
expected/citext_1.out has correct output.

-- 
Alexey Chernyshov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres CompanyFrom 35aa8d7a1890a242cdfb3bed6cabaa3b39766f1f Mon Sep 17 00:00:00 2001
From: Alexey Chernyshov <a.chernys...@postgrespro.ru>
Date: Tue, 18 Jul 2017 13:50:19 +0300
Subject: [PATCH] patch applied

---
 contrib/citext/citext--1.4--1.5.sql  |  74 +++
 contrib/citext/citext.c  | 120 
 contrib/citext/expected/citext.out   | 370 +++
 contrib/citext/expected/citext_1.out | 370 +++
 contrib/citext/sql/citext.sql|  78 
 5 files changed, 1012 insertions(+)

diff --git a/contrib/citext/citext--1.4--1.5.sql b/contrib/citext/citext--1.4--1.5.sql
index 97942cb..5ae522b 100644
--- a/contrib/citext/citext--1.4--1.5.sql
+++ b/contrib/citext/citext--1.4--1.5.sql
@@ -12,3 +12,77 @@ ALTER OPERATOR >= (citext, citext) SET (
 RESTRICT   = scalargesel,
 JOIN   = scalargejoinsel
 );
+
+CREATE FUNCTION citext_pattern_lt( citext, citext )
+RETURNS bool
+AS 'MODULE_PATHNAME'
+LANGUAGE C IMMUTABLE STRICT PARALLEL SAFE;
+
+CREATE FUNCTION citext_pattern_le( citext, citext )
+RETURNS bool
+AS 'MODULE_PATHNAME'
+LANGUAGE C IMMUTABLE STRICT PARALLEL SAFE;
+
+CREATE FUNCTION citext_pattern_gt( citext, citext )
+RETURNS bool
+AS 'MODULE_PATHNAME'
+LANGUAGE C IMMUTABLE STRICT PARALLEL SAFE;
+
+CREATE FUNCTION citext_pattern_ge( citext, citext )
+RETURNS bool
+AS 'MODULE_PATHNAME'
+LANGUAGE C IMMUTABLE STRICT PARALLEL SAFE;
+
+CREATE OPERATOR ~<~ (
+LEFTARG= CITEXT,
+RIGHTARG   = CITEXT,
+NEGATOR= ~>=~,
+COMMUTATOR = ~>~,
+PROCEDURE  = citext_pattern_lt,
+RESTRICT   = scalarltsel,
+JOIN   = scalarltjoinsel
+);
+
+CREATE OPERATOR ~<=~ (
+LEFTARG= CITEXT,
+RIGHTARG   = CITEXT,
+NEGATOR= ~>~,
+COMMUTATOR = ~>=~,
+PROCEDURE  = citext_pattern_le,
+RESTRICT   = scalarltsel,
+JOIN   = scalarltjoinsel
+);
+
+CREATE OPERATOR ~>=~ (
+LEFTARG= CITEXT,
+RIGHTARG   = CITEXT,
+NEGATOR= ~<~,
+COMMUTATOR = ~<=~,
+PROCEDURE  = citext_pattern_ge,
+RESTRICT   = scalargtsel,
+JOIN   = scalargtjoinsel
+);
+
+CREATE OPERATOR ~>~ (
+LEFTARG= CITEXT,
+RIGHTARG   = CITEXT,
+NEGATOR= ~<=~,
+COMMUTATOR = ~<~,
+PROCEDURE  = citext_pattern_gt,
+RESTRICT   = scalargtsel,
+JOIN   = scalargtjoinsel
+);
+
+CREATE FUNCTION citext_pattern_cmp(citext, citext)
+RETURNS int4
+AS 'MODULE_PATHNAME'
+LANGUAGE C STRICT IMMUTABLE PARALLEL SAFE;
+
+CREATE OPERATOR CLASS citext_pattern_ops
+FOR TYPE CITEXT USING btree AS
+OPERATOR1   ~<~  (citext, citext),
+OPERATOR2   ~<=~ (citext, citext),
+OPERATOR3   =(citext, citext),
+OPERATOR4   ~>=~ (citext, citext),
+OPERATOR5   ~>~  (citext, citext),
+FUNCTION1   citext_pattern_cmp(citext, citext);
diff --git a/contrib/citext/citext.c b/contrib/citext/citext.c
index 0ba4782..189105a 100644
--- a/contrib/citext/citext.c
+++ b/contrib/citext/citext.c
@@ -18,6 +18,7 @@ PG_MODULE_MAGIC;
  */
 
 static int32 citextcmp(text *left, text *right, Oid collid);
+static int32 internal_citext_pattern_cmp(text *left, text *right, Oid collid);
 
 /*
  *		=
@@ -59,6 +60,40 @@ citextcmp(text *left, text *right, Oid collid)
 }
 
 /*
+ * citext_pattern_cmp()
+ * Internal character-by-character comparison function for c

Re: [HACKERS] [PATCH] Pageinspect - add functions on GIN and GiST indexes from gevel

2017-09-13 Thread Alexey Chernyshov
On Sat, 9 Sep 2017 13:53:35 +0530
Ashutosh Sharma <ashu.coe...@gmail.com> wrote:

> 
> Finally, i got some time to look into this patch and surprisingly I
> didn't find any function returning information at page level instead
> all the SQL functions are returning information at index level.
> Therefore, i too feel that it should be either integrated with
> pgstattuple which could a better match as Tomas also mentioned or may
> be add a new contrib module itself. I think, adding a new contrib
> module looks like a better option. The reason being, it doesn't simply
> include the function for showing index level statistics (for e.g..
> index size, no of rows, values..e.t.c) like pgstattuple does but,
> also, displays information contained in a page for e.g. the object
> stored in the page,  it's tid e.t.c. So, more or less, I would say
> that, it's the mixture of pageinspect and pgstattuple module and
> therefore, i feel, adding it as a new extension would be a better
> choice. Thought?

Thank you for your interest, I will add a new contrib module named,
say, indexstat. I think we can add statistics on other indexes in the
future.

-- 
Alexey Chernyshov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] index-only count(*) for indexes supporting bitmap scans

2017-09-07 Thread Alexey Chernyshov
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   tested, passed
Documentation:tested, passed

One thing I have noticed is a trailing whitespace after "bogus one":

123 +* If we don't have to fetch the tuple, just return a
124 +* bogus one 
125 +*/

The new status of this patch is: Ready for Committer

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] index-only count(*) for indexes supporting bitmap scans

2017-09-04 Thread Alexey Chernyshov
The following review has been posted through the commitfest application:
make installcheck-world:  tested, failed
Implements feature:   not tested
Spec compliant:   not tested
Documentation:not tested

Hi Alexander,

make check-world fails on contrib/postgres_fdw because of Subquery Scan on ... 
Probably, query plan was changed.

The new status of this patch is: Waiting on Author

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [PATCH] Add citext_pattern_ops to citext contrib module

2017-07-18 Thread Alexey Chernyshov

Hi all,

The attached patch introduces citext_pattern_ops for citext extension 
type like text_pattern_ops for text type. Here are operators ~<~, ~<=~, 
~>~, ~>=~ combined into citext_pattern_ops operator class. These 
operators simply compare underlying citext values as C strings with 
memcmp() function. This operator class isn’t supported by B-Tree index 
yet, but it is a first step to do it.


Patch includes regression tests and is applicable to the latest commit 
(c85ec643ff2586e2d144374f51f93bfa215088a2).


The problem of citext support for LIKE operator with B-Tree index was 
mentioned in [1]. Briefly, the planner doesn’t use B-Tree index for 
queries text_col LIKE ‘abc%’. I’d like to investigate how to improve it 
and make another patch. I think the start point is 
match_special_index_operator() function which doesn’t support custom 
types. I would appreciate hearing your opinion on this.


1. https://www.postgresql.org/message-id/3924.1480351187%40sss.pgh.pa.us

--
Alexey Chernyshov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

>From 6ff180548209fb3abb66d70686df728b307635e3 Mon Sep 17 00:00:00 2001
From: Alexey Chernyshov <a.chernys...@postgrespro.ru>
Date: Fri, 23 Jun 2017 14:40:04 +0300
Subject: [PATCH 1/3] add citext opclass citext_pattern_ops

---
 contrib/citext/Makefile  |   2 +-
 contrib/citext/citext--1.4--1.5.sql  |   1 +
 contrib/citext/citext--1.4.sql   | 501 --
 contrib/citext/citext--1.5.sql   | 575 +++
 contrib/citext/citext.c  | 120 
 contrib/citext/citext.control|   2 +-
 contrib/citext/expected/citext.out   | 334 
 contrib/citext/expected/citext_1.out | 334 
 contrib/citext/sql/citext.sql|  72 +
 9 files changed, 1438 insertions(+), 503 deletions(-)
 create mode 100644 contrib/citext/citext--1.4--1.5.sql
 delete mode 100644 contrib/citext/citext--1.4.sql
 create mode 100644 contrib/citext/citext--1.5.sql

diff --git a/contrib/citext/Makefile b/contrib/citext/Makefile
index 563cd22..8474e86 100644
--- a/contrib/citext/Makefile
+++ b/contrib/citext/Makefile
@@ -3,7 +3,7 @@
 MODULES = citext
 
 EXTENSION = citext
-DATA = citext--1.4.sql citext--1.3--1.4.sql \
+DATA = citext--1.5.sql citext--1.4--1.5.sql citext--1.3--1.4.sql \
 	citext--1.2--1.3.sql citext--1.1--1.2.sql \
 	citext--1.0--1.1.sql citext--unpackaged--1.0.sql
 PGFILEDESC = "citext - case-insensitive character string data type"
diff --git a/contrib/citext/citext--1.4--1.5.sql b/contrib/citext/citext--1.4--1.5.sql
new file mode 100644
index 000..4c8abc0
--- /dev/null
+++ b/contrib/citext/citext--1.4--1.5.sql
@@ -0,0 +1 @@
+/* contrib/citext/citext--1.4--1.5.sql */
diff --git a/contrib/citext/citext--1.4.sql b/contrib/citext/citext--1.4.sql
deleted file mode 100644
index 7b06198..000
--- a/contrib/citext/citext--1.4.sql
+++ /dev/null
@@ -1,501 +0,0 @@
-/* contrib/citext/citext--1.4.sql */
-
--- complain if script is sourced in psql, rather than via CREATE EXTENSION
-\echo Use "CREATE EXTENSION citext" to load this file. \quit
-
---
---  PostgreSQL code for CITEXT.
---
--- Most I/O functions, and a few others, piggyback on the "text" type
--- functions via the implicit cast to text.
---
-
---
--- Shell type to keep things a bit quieter.
---
-
-CREATE TYPE citext;
-
---
---  Input and output functions.
---
-CREATE FUNCTION citextin(cstring)
-RETURNS citext
-AS 'textin'
-LANGUAGE internal IMMUTABLE STRICT PARALLEL SAFE;
-
-CREATE FUNCTION citextout(citext)
-RETURNS cstring
-AS 'textout'
-LANGUAGE internal IMMUTABLE STRICT PARALLEL SAFE;
-
-CREATE FUNCTION citextrecv(internal)
-RETURNS citext
-AS 'textrecv'
-LANGUAGE internal STABLE STRICT PARALLEL SAFE;
-
-CREATE FUNCTION citextsend(citext)
-RETURNS bytea
-AS 'textsend'
-LANGUAGE internal STABLE STRICT PARALLEL SAFE;
-
---
---  The type itself.
---
-
-CREATE TYPE citext (
-INPUT  = citextin,
-OUTPUT = citextout,
-RECEIVE= citextrecv,
-SEND   = citextsend,
-INTERNALLENGTH = VARIABLE,
-STORAGE= extended,
--- make it a non-preferred member of string type category
-CATEGORY   = 'S',
-PREFERRED  = false,
-COLLATABLE = true
-);
-
---
--- Type casting functions for those situations where the I/O casts don't
--- automatically kick in.
---
-
-CREATE FUNCTION citext(bpchar)
-RETURNS citext
-AS 'rtrim1'
-LANGUAGE internal IMMUTABLE STRICT PARALLEL SAFE;
-
-CREATE FUNCTION citext(boolean)
-RETURNS citext
-AS 'booltext'
-LANGUAGE internal IMMUTABLE STRICT PARALLEL SAFE;
-
-CREATE FUNCTION citext(inet)
-RETURNS citext
-AS 'network_show'
-LANGUAGE internal IMMUTABLE STRICT PARALLEL SAFE;
-
---
---  Implicit and assignment type casts.
---
-
-CREATE CAST (citext AS text)WITHOUT FUNCTION AS IMPLICIT;
-CREATE CAST (citext AS varchar) WITHOUT FUNCTION AS

Re: Fwd: [HACKERS] Vacuum: allow usage of more than 1GB of work mem

2017-07-12 Thread Alexey Chernyshov

Thank you for the patch and benchmark results, I have a couple remarks.
Firstly, padding in DeadTuplesSegment

typedef struct DeadTuplesSegment

{

ItemPointerData last_dead_tuple;/* Copy of the last dead tuple 
(unset


 * until the segment is fully

 * populated). Keep it first to 
simplify


 * binary searches */

unsigned short padding;/* Align dt_tids to 32-bits,

 * sizeof(ItemPointerData) is aligned to

 * short, so add a padding short, to 
make the


 * size of DeadTuplesSegment a multiple of

 * 32-bits and align integer components 
for


 * better performance during lookups 
into the


 * multiarray */

intnum_dead_tuples;/* # of entries in the segment */

intmax_dead_tuples;/* # of entries allocated in the 
segment */


ItemPointer dt_tids;/* Array of dead tuples */

}DeadTuplesSegment;

In the comments to ItemPointerData is written that it is 6 bytes long, 
but can be padded to 8 bytes by some compilers, so if we add padding in 
a current way, there is no guaranty that it will be done as it is 
expected. The other way to do it with pg_attribute_alligned. But in my 
opinion, there is no need to do it manually, because the compiler will 
do this optimization itself.


On 11.07.2017 19:51, Claudio Freire wrote:

-

+   /* Search for the segment likely to contain the item pointer */
+   iseg = vac_itemptr_binsrch(
+   (void *) itemptr,
+   (void *)
&(vacrelstats->dead_tuples.dt_segments->last_dead_tuple),
+   vacrelstats->dead_tuples.last_seg + 1,
+   sizeof(DeadTuplesSegment));
+

I think that we can change the above to;

+   /* Search for the segment likely to contain the item pointer */
+   iseg = vac_itemptr_binsrch(
+   (void *) itemptr,
+   (void *) &(seg->last_dead_tuple),
+   vacrelstats->dead_tuples.last_seg + 1,
+   sizeof(DeadTuplesSegment));

We set "seg = vacrelstats->dead_tuples.dt_segments" just before this.

Right
In my mind, if you change vacrelstats->dead_tuples.last_seg + 1 with 
GetNumDeadTuplesSegments(vacrelstats), it would be more meaningful.
Besides, you can change the vac_itemptr_binsrch within the segment with 
stdlib bsearch, like:


res = (ItemPointer) bsearch((void *) itemptr,

(void *) seg->dt_tids,

seg->num_dead_tuples,

sizeof(ItemPointerData),

vac_cmp_itemptr);

return (res != NULL);


Those are before the changes in the review. While I don't expect any
change, I'll re-run some of them just in case, and try to investigate
the slowdown. But that will take forever. Each run takes about a week
on my test rig, and I don't have enough hardware to parallelize the
tests. I will run a test on a snapshot of a particularly troublesome
production database we have, that should be interesting.

Very interesting, waiting for the results.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers