Re: [HACKERS] new function for tsquery creartion
Hi all, I am extending phrase operator is such way that it will have <n,m> syntax that means from n to m words, so I will use such syntax (<n,m>) further. I found that a AROUND(N) b is exactly the same as a <-N,N> b and it can be replaced while parsing. So, what do you think of such idea? In this patch I have noticed some unobvious behavior. # select to_tsvector('Hello, cat world!') @@ queryto_tsquery('cat AROUND(1) cat') as match; match --- t cat AROUND(1) cat is the same is "cat <1> cat || cat <0> cat" and: # select to_tsvector('Hello, cat world!') @@ to_tsquery('cat <0> cat'); ?column? --- t It seems to be a proper logic behavior but it is a possible pitfall, maybe it should be documented? But more important question is how AROUND() operator should handle stop words? Now it works as: # select queryto_tsquery('cat <2> (a AROUND(10) rat)'); queryto_tsquery -- 'cat' <12> 'rat' (1 row) # select queryto_tsquery('cat <2> a AROUND(10) rat'); queryto_tsquery 'cat' AROUND(12) 'rat' (1 row) In my opinion it should be like: cat <2> (a AROUND(10) rat) == cat <2,2> (a <-10,10> rat) == cat <-8,12> rat cat <2> a AROUND(10) rat == cat <2,2> a <-10,10> rat = cat <-8, 12> rat Now <n,m> operator can be replaced with combination of phrase operator , AROUND(), and logical operators, but with <n,m> operator it will be much painless. Correct me, please, if I am wrong. -- Alexey Chernyshov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Add citext_pattern_ops to citext contrib module
On Tue, 12 Sep 2017 12:59:20 -0400 Tom Lane <t...@sss.pgh.pa.us> wrote: > Quick comment on this patch: recently, we've decided that having > patches replace the whole base script for an extension is too much of > a maintenance problem, especially when there are several patches in > the pipeline for the same contrib module. The new style is to > provide only a version update script (which you'd have to write > anyway), and then rely on CREATE EXTENSION to apply the old base > script plus the update(s). You can see some examples in the patch I > just posted at > > https://www.postgresql.org/message-id/24721.1505229...@sss.pgh.pa.us > > Also, since that patch is probably going to get committed pretty > soon, you could reformulate your patch as an add-on to its > citext--1.4--1.5.sql script. We don't really need to have a separate > version of the extension for states that are intermediate between two > PG major releases. Only if your patch doesn't get in by v11 freeze > would you need to make it a separate citext--1.5--1.6.sql script. > > regards, tom lane Accented characters and different length strings tests added. Since patch (ttps://www.postgresql.org/message-id/24721.1505229...@sss.pgh.pa.us) is committed, I changed the patch as you said. Thanks for your notes. Do we need expected/citext.out? It seems that only expected/citext_1.out has correct output. -- Alexey Chernyshov Postgres Professional: http://www.postgrespro.com The Russian Postgres CompanyFrom 35aa8d7a1890a242cdfb3bed6cabaa3b39766f1f Mon Sep 17 00:00:00 2001 From: Alexey Chernyshov <a.chernys...@postgrespro.ru> Date: Tue, 18 Jul 2017 13:50:19 +0300 Subject: [PATCH] patch applied --- contrib/citext/citext--1.4--1.5.sql | 74 +++ contrib/citext/citext.c | 120 contrib/citext/expected/citext.out | 370 +++ contrib/citext/expected/citext_1.out | 370 +++ contrib/citext/sql/citext.sql| 78 5 files changed, 1012 insertions(+) diff --git a/contrib/citext/citext--1.4--1.5.sql b/contrib/citext/citext--1.4--1.5.sql index 97942cb..5ae522b 100644 --- a/contrib/citext/citext--1.4--1.5.sql +++ b/contrib/citext/citext--1.4--1.5.sql @@ -12,3 +12,77 @@ ALTER OPERATOR >= (citext, citext) SET ( RESTRICT = scalargesel, JOIN = scalargejoinsel ); + +CREATE FUNCTION citext_pattern_lt( citext, citext ) +RETURNS bool +AS 'MODULE_PATHNAME' +LANGUAGE C IMMUTABLE STRICT PARALLEL SAFE; + +CREATE FUNCTION citext_pattern_le( citext, citext ) +RETURNS bool +AS 'MODULE_PATHNAME' +LANGUAGE C IMMUTABLE STRICT PARALLEL SAFE; + +CREATE FUNCTION citext_pattern_gt( citext, citext ) +RETURNS bool +AS 'MODULE_PATHNAME' +LANGUAGE C IMMUTABLE STRICT PARALLEL SAFE; + +CREATE FUNCTION citext_pattern_ge( citext, citext ) +RETURNS bool +AS 'MODULE_PATHNAME' +LANGUAGE C IMMUTABLE STRICT PARALLEL SAFE; + +CREATE OPERATOR ~<~ ( +LEFTARG= CITEXT, +RIGHTARG = CITEXT, +NEGATOR= ~>=~, +COMMUTATOR = ~>~, +PROCEDURE = citext_pattern_lt, +RESTRICT = scalarltsel, +JOIN = scalarltjoinsel +); + +CREATE OPERATOR ~<=~ ( +LEFTARG= CITEXT, +RIGHTARG = CITEXT, +NEGATOR= ~>~, +COMMUTATOR = ~>=~, +PROCEDURE = citext_pattern_le, +RESTRICT = scalarltsel, +JOIN = scalarltjoinsel +); + +CREATE OPERATOR ~>=~ ( +LEFTARG= CITEXT, +RIGHTARG = CITEXT, +NEGATOR= ~<~, +COMMUTATOR = ~<=~, +PROCEDURE = citext_pattern_ge, +RESTRICT = scalargtsel, +JOIN = scalargtjoinsel +); + +CREATE OPERATOR ~>~ ( +LEFTARG= CITEXT, +RIGHTARG = CITEXT, +NEGATOR= ~<=~, +COMMUTATOR = ~<~, +PROCEDURE = citext_pattern_gt, +RESTRICT = scalargtsel, +JOIN = scalargtjoinsel +); + +CREATE FUNCTION citext_pattern_cmp(citext, citext) +RETURNS int4 +AS 'MODULE_PATHNAME' +LANGUAGE C STRICT IMMUTABLE PARALLEL SAFE; + +CREATE OPERATOR CLASS citext_pattern_ops +FOR TYPE CITEXT USING btree AS +OPERATOR1 ~<~ (citext, citext), +OPERATOR2 ~<=~ (citext, citext), +OPERATOR3 =(citext, citext), +OPERATOR4 ~>=~ (citext, citext), +OPERATOR5 ~>~ (citext, citext), +FUNCTION1 citext_pattern_cmp(citext, citext); diff --git a/contrib/citext/citext.c b/contrib/citext/citext.c index 0ba4782..189105a 100644 --- a/contrib/citext/citext.c +++ b/contrib/citext/citext.c @@ -18,6 +18,7 @@ PG_MODULE_MAGIC; */ static int32 citextcmp(text *left, text *right, Oid collid); +static int32 internal_citext_pattern_cmp(text *left, text *right, Oid collid); /* * = @@ -59,6 +60,40 @@ citextcmp(text *left, text *right, Oid collid) } /* + * citext_pattern_cmp() + * Internal character-by-character comparison function for c
Re: [HACKERS] [PATCH] Pageinspect - add functions on GIN and GiST indexes from gevel
On Sat, 9 Sep 2017 13:53:35 +0530 Ashutosh Sharma <ashu.coe...@gmail.com> wrote: > > Finally, i got some time to look into this patch and surprisingly I > didn't find any function returning information at page level instead > all the SQL functions are returning information at index level. > Therefore, i too feel that it should be either integrated with > pgstattuple which could a better match as Tomas also mentioned or may > be add a new contrib module itself. I think, adding a new contrib > module looks like a better option. The reason being, it doesn't simply > include the function for showing index level statistics (for e.g.. > index size, no of rows, values..e.t.c) like pgstattuple does but, > also, displays information contained in a page for e.g. the object > stored in the page, it's tid e.t.c. So, more or less, I would say > that, it's the mixture of pageinspect and pgstattuple module and > therefore, i feel, adding it as a new extension would be a better > choice. Thought? Thank you for your interest, I will add a new contrib module named, say, indexstat. I think we can add statistics on other indexes in the future. -- Alexey Chernyshov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] index-only count(*) for indexes supporting bitmap scans
The following review has been posted through the commitfest application: make installcheck-world: tested, passed Implements feature: tested, passed Spec compliant: tested, passed Documentation:tested, passed One thing I have noticed is a trailing whitespace after "bogus one": 123 +* If we don't have to fetch the tuple, just return a 124 +* bogus one 125 +*/ The new status of this patch is: Ready for Committer -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] index-only count(*) for indexes supporting bitmap scans
The following review has been posted through the commitfest application: make installcheck-world: tested, failed Implements feature: not tested Spec compliant: not tested Documentation:not tested Hi Alexander, make check-world fails on contrib/postgres_fdw because of Subquery Scan on ... Probably, query plan was changed. The new status of this patch is: Waiting on Author -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] [PATCH] Add citext_pattern_ops to citext contrib module
Hi all, The attached patch introduces citext_pattern_ops for citext extension type like text_pattern_ops for text type. Here are operators ~<~, ~<=~, ~>~, ~>=~ combined into citext_pattern_ops operator class. These operators simply compare underlying citext values as C strings with memcmp() function. This operator class isn’t supported by B-Tree index yet, but it is a first step to do it. Patch includes regression tests and is applicable to the latest commit (c85ec643ff2586e2d144374f51f93bfa215088a2). The problem of citext support for LIKE operator with B-Tree index was mentioned in [1]. Briefly, the planner doesn’t use B-Tree index for queries text_col LIKE ‘abc%’. I’d like to investigate how to improve it and make another patch. I think the start point is match_special_index_operator() function which doesn’t support custom types. I would appreciate hearing your opinion on this. 1. https://www.postgresql.org/message-id/3924.1480351187%40sss.pgh.pa.us -- Alexey Chernyshov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company >From 6ff180548209fb3abb66d70686df728b307635e3 Mon Sep 17 00:00:00 2001 From: Alexey Chernyshov <a.chernys...@postgrespro.ru> Date: Fri, 23 Jun 2017 14:40:04 +0300 Subject: [PATCH 1/3] add citext opclass citext_pattern_ops --- contrib/citext/Makefile | 2 +- contrib/citext/citext--1.4--1.5.sql | 1 + contrib/citext/citext--1.4.sql | 501 -- contrib/citext/citext--1.5.sql | 575 +++ contrib/citext/citext.c | 120 contrib/citext/citext.control| 2 +- contrib/citext/expected/citext.out | 334 contrib/citext/expected/citext_1.out | 334 contrib/citext/sql/citext.sql| 72 + 9 files changed, 1438 insertions(+), 503 deletions(-) create mode 100644 contrib/citext/citext--1.4--1.5.sql delete mode 100644 contrib/citext/citext--1.4.sql create mode 100644 contrib/citext/citext--1.5.sql diff --git a/contrib/citext/Makefile b/contrib/citext/Makefile index 563cd22..8474e86 100644 --- a/contrib/citext/Makefile +++ b/contrib/citext/Makefile @@ -3,7 +3,7 @@ MODULES = citext EXTENSION = citext -DATA = citext--1.4.sql citext--1.3--1.4.sql \ +DATA = citext--1.5.sql citext--1.4--1.5.sql citext--1.3--1.4.sql \ citext--1.2--1.3.sql citext--1.1--1.2.sql \ citext--1.0--1.1.sql citext--unpackaged--1.0.sql PGFILEDESC = "citext - case-insensitive character string data type" diff --git a/contrib/citext/citext--1.4--1.5.sql b/contrib/citext/citext--1.4--1.5.sql new file mode 100644 index 000..4c8abc0 --- /dev/null +++ b/contrib/citext/citext--1.4--1.5.sql @@ -0,0 +1 @@ +/* contrib/citext/citext--1.4--1.5.sql */ diff --git a/contrib/citext/citext--1.4.sql b/contrib/citext/citext--1.4.sql deleted file mode 100644 index 7b06198..000 --- a/contrib/citext/citext--1.4.sql +++ /dev/null @@ -1,501 +0,0 @@ -/* contrib/citext/citext--1.4.sql */ - --- complain if script is sourced in psql, rather than via CREATE EXTENSION -\echo Use "CREATE EXTENSION citext" to load this file. \quit - --- --- PostgreSQL code for CITEXT. --- --- Most I/O functions, and a few others, piggyback on the "text" type --- functions via the implicit cast to text. --- - --- --- Shell type to keep things a bit quieter. --- - -CREATE TYPE citext; - --- --- Input and output functions. --- -CREATE FUNCTION citextin(cstring) -RETURNS citext -AS 'textin' -LANGUAGE internal IMMUTABLE STRICT PARALLEL SAFE; - -CREATE FUNCTION citextout(citext) -RETURNS cstring -AS 'textout' -LANGUAGE internal IMMUTABLE STRICT PARALLEL SAFE; - -CREATE FUNCTION citextrecv(internal) -RETURNS citext -AS 'textrecv' -LANGUAGE internal STABLE STRICT PARALLEL SAFE; - -CREATE FUNCTION citextsend(citext) -RETURNS bytea -AS 'textsend' -LANGUAGE internal STABLE STRICT PARALLEL SAFE; - --- --- The type itself. --- - -CREATE TYPE citext ( -INPUT = citextin, -OUTPUT = citextout, -RECEIVE= citextrecv, -SEND = citextsend, -INTERNALLENGTH = VARIABLE, -STORAGE= extended, --- make it a non-preferred member of string type category -CATEGORY = 'S', -PREFERRED = false, -COLLATABLE = true -); - --- --- Type casting functions for those situations where the I/O casts don't --- automatically kick in. --- - -CREATE FUNCTION citext(bpchar) -RETURNS citext -AS 'rtrim1' -LANGUAGE internal IMMUTABLE STRICT PARALLEL SAFE; - -CREATE FUNCTION citext(boolean) -RETURNS citext -AS 'booltext' -LANGUAGE internal IMMUTABLE STRICT PARALLEL SAFE; - -CREATE FUNCTION citext(inet) -RETURNS citext -AS 'network_show' -LANGUAGE internal IMMUTABLE STRICT PARALLEL SAFE; - --- --- Implicit and assignment type casts. --- - -CREATE CAST (citext AS text)WITHOUT FUNCTION AS IMPLICIT; -CREATE CAST (citext AS varchar) WITHOUT FUNCTION AS
Re: Fwd: [HACKERS] Vacuum: allow usage of more than 1GB of work mem
Thank you for the patch and benchmark results, I have a couple remarks. Firstly, padding in DeadTuplesSegment typedef struct DeadTuplesSegment { ItemPointerData last_dead_tuple;/* Copy of the last dead tuple (unset * until the segment is fully * populated). Keep it first to simplify * binary searches */ unsigned short padding;/* Align dt_tids to 32-bits, * sizeof(ItemPointerData) is aligned to * short, so add a padding short, to make the * size of DeadTuplesSegment a multiple of * 32-bits and align integer components for * better performance during lookups into the * multiarray */ intnum_dead_tuples;/* # of entries in the segment */ intmax_dead_tuples;/* # of entries allocated in the segment */ ItemPointer dt_tids;/* Array of dead tuples */ }DeadTuplesSegment; In the comments to ItemPointerData is written that it is 6 bytes long, but can be padded to 8 bytes by some compilers, so if we add padding in a current way, there is no guaranty that it will be done as it is expected. The other way to do it with pg_attribute_alligned. But in my opinion, there is no need to do it manually, because the compiler will do this optimization itself. On 11.07.2017 19:51, Claudio Freire wrote: - + /* Search for the segment likely to contain the item pointer */ + iseg = vac_itemptr_binsrch( + (void *) itemptr, + (void *) &(vacrelstats->dead_tuples.dt_segments->last_dead_tuple), + vacrelstats->dead_tuples.last_seg + 1, + sizeof(DeadTuplesSegment)); + I think that we can change the above to; + /* Search for the segment likely to contain the item pointer */ + iseg = vac_itemptr_binsrch( + (void *) itemptr, + (void *) &(seg->last_dead_tuple), + vacrelstats->dead_tuples.last_seg + 1, + sizeof(DeadTuplesSegment)); We set "seg = vacrelstats->dead_tuples.dt_segments" just before this. Right In my mind, if you change vacrelstats->dead_tuples.last_seg + 1 with GetNumDeadTuplesSegments(vacrelstats), it would be more meaningful. Besides, you can change the vac_itemptr_binsrch within the segment with stdlib bsearch, like: res = (ItemPointer) bsearch((void *) itemptr, (void *) seg->dt_tids, seg->num_dead_tuples, sizeof(ItemPointerData), vac_cmp_itemptr); return (res != NULL); Those are before the changes in the review. While I don't expect any change, I'll re-run some of them just in case, and try to investigate the slowdown. But that will take forever. Each run takes about a week on my test rig, and I don't have enough hardware to parallelize the tests. I will run a test on a snapshot of a particularly troublesome production database we have, that should be interesting. Very interesting, waiting for the results. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers