Re: Very newbie question

2023-10-25 Thread Олег Самойлов
Okay, I see no one was able to solve this problem, but I could. Maybe this will be
useful for someone else too. Here is the solution.

Original query was:

> On 23 Oct 2023, at 18:13, Олег Самойлов wrote:
> 
> SELECT id/1000 as partition
>   FROM delivery
>   GROUP BY partition
>   HAVING max(created_at) < CURRENT_DATE - '3 month'::interval;

I was not able to accelerate it with any index; it ran for about 5 minutes. The new query is:

SELECT n FROM (
    SELECT generate_series(min(id)/1000, max(id)/1000) AS n FROM delivery
) AS part_numbers
WHERE (SELECT max(created_at) FROM delivery
       WHERE n*1000 <= id AND id < (n+1)*1000)
      < CURRENT_DATE - '3 month'::interval;

It returns the same result (the numbers of the partitions that need to be archived) and
is accelerated by two btree indexes: one on id and one on created_at. It runs very
quickly, in less than a second.
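
For reference, a minimal sketch of the two indexes meant here (index names are
illustrative; if id is the primary key, that index may already exist implicitly):

CREATE INDEX delivery_id_idx ON delivery (id);
CREATE INDEX delivery_created_at_idx ON delivery (created_at);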



Re: Very newbie question

2023-10-23 Thread Олег Самойлов
This is not correct. An index can accelerate, for instance, max(). Here there is also
no WHERE or ORDER BY, but an index is still useful:
select max(created_at) from delivery;

> On 23 Oct 2023, at 18:23, Toomas wrote:
> 
> 
> There is no reason to use index. The query has neither WHERE nor ORDER BY 
> clause.
> 
> Toomas
> 
>> On 23. Oct 2023, at 18:13, Олег Самойлов  wrote:
>> 
>> Beg pardon, but I have a very newbie question. I have a partitioned table, 
>> partitioned by a primary bigint key, with a partition size of 1000. I need to get 
>> the numbers of the partitions that need to be archived, i.e. whose rows are all 
>> older than 3 months. Here is the query:
>> 
>> SELECT id/1000 as partition
>>  FROM delivery
>>  GROUP BY partition
>>  HAVING max(created_at) < CURRENT_DATE - '3 month'::interval;
>> 
>> The 'id/1000 as partition' is the number of the partition; it will later 
>> be used inside the partition name.
>> The query runs for a long time using a sequential scan. Does anyone have ideas on 
>> how to rewrite the query so it will use an index?
>> 
> 





Re: Very newbie question

2023-10-23 Thread Олег Самойлов
Well, getting the list of partitions and then scanning all 100 partitions one by one is 
too simple. :) I am interested in whether there is a more elegant way. Any rewriting of 
the query and any index creation are permitted.

> On 23 Oct 2023, at 18:25, Francisco Olarte wrote:
> 
> On Mon, 23 Oct 2023 at 17:14, Олег Самойлов  wrote:
>> Beg pardon, but I have a very newbie question. I have a partitioned table, 
>> partitioned by a primary bigint key, with a partition size of 1000. I need to get 
>> the numbers of the partitions that need to be archived, i.e. whose rows are all 
>> older than 3 months. Here is the query:
>> 
>> SELECT id/1000 as partition
>>   FROM delivery
>>   GROUP BY partition
>>   HAVING max(created_at) < CURRENT_DATE - '3 month'::interval;
>> 
>> The 'id/1000 as partition' is the number of the partition; it will later 
>> be used inside the partition name.
>> The query runs for a long time using a sequential scan. Does anyone have ideas on 
>> how to rewrite the query so it will use an index?
> 
> You should send an explain of your query, and your table and index definition.
> 
> Unless you are tied to do this in one query, and assuming you have an
> index by "created_at", I normally do these kinds of things by:
> 1.- Get list of partitions, sort oldest first.
> 2.- do "select created_at from $partition order by created_at desc
> limit 1", which normally is just an index lookup, and compare
> client-side.
> You can do the date math in the database too. Also, the rhs of the
> comparison seems to be a date; if created_at is a timestamp you may be
> blocking the optimizer for some things.
> 
> Francisco Olarte.
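
For illustration, a rough server-side sketch of this per-partition approach, written
as a DO block instead of a client-side loop (the catalog query and the 3-month cutoff
are assumptions, not part of the original reply):

DO $$
DECLARE
    part regclass;
    last_ts timestamptz;
BEGIN
    -- walk all partitions of delivery
    FOR part IN SELECT inhrelid::regclass
                FROM pg_inherits
                WHERE inhparent = 'delivery'::regclass
    LOOP
        -- with an index on created_at this is just an index lookup per partition
        EXECUTE format('SELECT created_at FROM %s ORDER BY created_at DESC LIMIT 1', part)
        INTO last_ts;
        IF last_ts < CURRENT_DATE - interval '3 month' THEN
            RAISE NOTICE 'partition % can be archived', part;
        END IF;
    END LOOP;
END $$;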





Very newbie question

2023-10-23 Thread Олег Самойлов
Beg pardon, but I have a very newbie question. I have a partitioned table, 
partitioned by a primary bigint key, with a partition size of 1000. I need to get 
the numbers of the partitions that need to be archived, i.e. whose rows are all older 
than 3 months. Here is the query:

SELECT id/1000 as partition
   FROM delivery
   GROUP BY partition
   HAVING max(created_at) < CURRENT_DATE - '3 month'::interval;

The 'id/1000 as partition' is the number of the partition; it will later be 
used inside the partition name.
The query runs for a long time using a sequential scan. Does anyone have ideas on how 
to rewrite the query so it will use an index?



Re: Postgresql HA cluster

2023-10-12 Thread Олег Самойлов
For a PostgreSQL HA cluster the most popular solution is streaming replication. There
are options for how to implement this. The web-programmer approach is using
haproxy+consul+Patroni. The "old school" approach is Pacemaker: all in one bottle and
well tested. If you are interested in how to implement it with Pacemaker, you may look
at my project, a test bed that continuously tests different HA clusters with random
failures. I don't see such a thing for haproxy+consul+Patroni.
https://github.com/domclick/tuchanka

11.10.2023, 21:02, "Jason Grammenos" :

Hello,
 
I am new to PostgreSQL and having to migrate a Cassandra cluster to PostgreSQL.
Have a reasonable amount of experience with MySQL and use Master -> Master MySQL replication quite successfully.
 
I am trying to understand what configuration/option to pick for PostgreSQL that will provide the same or a similar level of operational ease as the Cassandra cluster. What I mean by that is: backups, patching (rebooting nodes to patch).
With Cassandra any single node can be patched and rebooted without incurring downtime. This is also true with MySQL master->master replication, which we have fronted by an HAproxy reverse proxy (we can fail over between the two MySQL nodes, 
 draining the connections as we do so, and then patch and reboot). 
 
I have found the following documentation:
https://www.postgresql.org/docs/current/different-replication-solutions.html
 
and have heard of “Traktor”, “pgEdge” and some other third-party tools.

My difficulty is that with very little knowledge of PostgreSQL I am having a hard time sorting through the various options and determining which are potential candidates for the kind of setup I am looking for (something approximating MySQL
 master->master replication).
 
Any advice or resources would be much appreciated.
Regards
Jason Grammenos
 

Jason Grammenos | Operations & Infrastructure Analyst  
Pronouns: he/him
P: 613.232.7797 x1131
Toll-free: 866.545.3745 x1131
jason.gramme...@agilitypr.com
agilitypr.com
Learn new PR tips from our free resources.
 




Re: Can not drop partition if exist foreign keys

2023-10-12 Thread Олег Самойлов
11.10.2023, 14:20, "Laurenz Albe" :
> On Wed, 2023-10-11 at 14:08 +0300, Олег Самойлов wrote:
>> There are two common practices for dropping a partition from a partitioned table:
>> just drop it, or detach and then drop it. But a simple drop doesn't work if a
>> foreign key exists. Example script attached.
> That is working as designed. You cannot detach a partition of a table if a
> foreign key points to it.

Nope, of course one can detach a partition of an empty table even if a foreign key
points to it. The problem is different. It is documented that there are two variants
of how to drop a partition: just drop it, or detach it and then drop it. Both work on
a simple partition, but only the second variant works in this case.




Can not drop partition if exist foreign keys

2023-10-11 Thread Олег Самойлов
Hi all.
There are two common practices for dropping a partition from a partitioned table: just 
drop it, or detach and then drop it. But a simple drop doesn't work if a foreign key 
exists. An example script is attached.


test.sql
Description: Binary data


$ psql -p 5416 -a -f test.sql
\setenv PSQL_EDITOR 'vim'
\setenv PSQL_EDITOR_LINENUMBER_ARG '+'
\set ON_ERROR_ROLLBACK 'interactive'
\set ON_ERROR_STOP 'on'
--\set SHOW_CONTEXT 'always'
\set PROMPT1 '%[%033[38;5;'`echo $PROMPT_COLOR`'m%]%x%n@%m:%>/%/\n%R%# 
%[%033[m%]'
\set PROMPT2 '%[%033[38;5;'`echo $PROMPT_COLOR`'m%]%R%# %[%033[m%]'
BEGIN;
BEGIN
CREATE TABLE parent (
id  int primary key
) PARTITION BY RANGE (id);
CREATE TABLE
CREATE TABLE parent_0 PARTITION OF parent
FOR VALUES FROM (0) TO (100);
CREATE TABLE
CREATE TABLE children (
id  int primary key references parent(id)
) PARTITION BY RANGE (id);
CREATE TABLE
CREATE TABLE children_0 PARTITION OF children
FOR VALUES FROM (0) TO (100);
CREATE TABLE
DROP TABLE children_0;
DROP TABLE
DROP TABLE parent_0;
psql:test.sql:15: ERROR:  cannot drop table parent_0 because other objects 
depend on it
DETAIL:  constraint children_id_fkey on table children depends on table parent_0
HINT:  Use DROP ... CASCADE to drop the dependent objects too.

This looks like a bug.
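
For reference, a minimal sketch of the detach-then-drop variant that, per the reply
elsewhere in this thread, does work here (same tables as in the script above; the
partition is empty):

ALTER TABLE parent DETACH PARTITION parent_0;
DROP TABLE parent_0;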

Re: BUG? Slave don't reconnect to the master

2020-09-29 Thread Олег Самойлов



> On 29 Sep 2020, at 17:16, Jehan-Guillaume de Rorthais  wrote:
> 
> Sorry, there was some misunderstanding of the source code on my side. The
> "keepalives" parameter is enabled by default on client side, but if you don't
> set keepalives_idle, keepalives_interval and keepalives_count, they fallback 
> to
> system default ones which are 7200, 75 and 9 (on Debian and CentOS). So more
> than 2 hours.

I can confirm. After roughly 2 hours the situation was somehow resolved and 
tests continued.



Re: BUG? Slave don't reconnect to the master

2020-09-29 Thread Олег Самойлов



> On 29 Sep 2020, at 12:31, Jehan-Guillaume de Rorthais  wrote:
> 
> On Thu, 24 Sep 2020 15:22:46 +0300
> Олег Самойлов  wrote:
> 
>> Hi, Jehan.
>> 
>>> On 9 Sep 2020, at 18:19, Jehan-Guillaume de Rorthais 
>>> wrote:
>>> 
>>> On Mon, 7 Sep 2020 23:46:17 +0300
>>> Олег Самойлов  wrote:
>>> 
>>>>> [...]  
>>>>>>>> 10:30:55.965 FATAL:  terminating walreceiver process dpue to
>>>>>>>> administrator cmd 10:30:55.966 LOG:  redo done at 0/1600C4B0
>>>>>>>> 10:30:55.966 LOG:  last completed transaction was at log time
>>>>>>>> 10:25:38.76429 10:30:55.968 LOG:  selected new timeline ID: 4
>>>>>>>> 10:30:56.001 LOG:  archive recovery complete
>>>>>>>> 10:30:56.005 LOG:  database system is ready to accept connections  
>>>>>>> 
>>>>>>>> The slave with didn't reconnected replication, tuchanka3c. Also I
>>>>>>>> separated logs copied from the old master by a blank line:
>>>>>>>> 
>>>>>>>> [...]
>>>>>>>> 
>>>>>>>> 10:20:25.168 LOG:  database system was interrupted; last known up at
>>>>>>>> 10:20:19 10:20:25.180 LOG:  entering standby mode
>>>>>>>> 10:20:25.181 LOG:  redo starts at 0/1198
>>>>>>>> 10:20:25.183 LOG:  consistent recovery state reached at 0/11000A68
>>>>>>>> 10:20:25.183 LOG:  database system is ready to accept read only
>>>>>>>> connections 10:20:25.193 LOG:  started streaming WAL from primary at
>>>>>>>> 0/1200 on tl 3 10:25:05.370 LOG:  could not send data to client:
>>>>>>>> Connection reset by peer 10:26:38.655 FATAL:  terminating walreceiver
>>>>>>>> due to timeout 10:26:38.655 LOG:  record with incorrect prev-link
>>>>>>>> 0/1200C4B0 at 0/1600C4D8  
>>>>>>> 
>>>>>>> This message appear before the effective promotion of tuchanka3b. Do you
>>>>>>> have logs about what happen *after* the promotion?  
>>>>>> 
>>>>>> This is end of the slave log. Nothing. Just absent replication.
>>>>> 
>>>>> This is unusual. Could you log some more details about replication
>>>>> tryouts to your PostgreSQL logs? Set log_replication_commands and lower
>>>>> log_min_messages to debug ?
>>>> 
>>>> Sure, this is PostgreSQL logs for the cluster tuchanka3.
>>>> Tuchanka3a is an old (failed) master.  
>>> 
>>> According to your logs:
>>> 
>>> 20:29:41 tuchanka3a: freeze
>>> 20:30:39 tuchanka3c: wal receiver timeout (default 60s timeout)
>>> 20:30:39 tuchanka3c: switched to archives, and error'ed (expected)
>>> 20:30:39 tuchanka3c: switched to stream again (expected)
>>>no more news from this new wal receiver 
>>> 20:34:21 tuchanka3b: promoted
>>> 
>>> I'm not sure where your floating IP is located at 20:30:39, but I suppose it
>>> is still on tuchanka3a as the wal receiver don't hit any connection error
>>> and tuchanka3b is not promoted yet.  
>> 
>> I think so.
>> 
>>> 
>>> So at this point, I suppose the wal receiver is stuck in libpqrcv_connect
>>> waiting for frozen tuchanka3a to answer, with no connection timeout. You
>>> might track tcp sockets on tuchanka3a to confirm this.  
>> 
>> I don't know how to do this.
> 
> Use ss, see its manual page. Hare is an example, using standard 5432 pgsql 
> port:
> 
>  ss -tapn 'dport = 5432 or sport = 5432'
> 
> Look for Local and Peer addresses and their status.
> 
>>> To avoid such a wait, try to add eg. connect_timeout=2 to your
>>> primary_conninfo parameter. See:
>>> https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-PARAMKEYWORDS
>>>   
>> 
>> Nope, this was not enough. But I went further and I added tcp keepalive
>> options too. So now paf file, for instance in tuchanka3c, is:
>> 
>> # recovery.conf for krogan3, pgsqlms pacemaker module
>> primary_conninfo = 'host=krogan3 user=replicant application_name=tuchanka3c
>> connect_timeout=5 keepalives=1 keepalives_idle=1 keepalives_interval=3
>> keepalives_count=3' recovery_target_timeline = 'latest' standby_mode = 'on'
>> 
>> And now the problem with PostgreSQL-STOP i

Re: BUG? Slave don't reconnect to the master

2020-09-24 Thread Олег Самойлов
Hi, Jehan.

> On 9 Sep 2020, at 18:19, Jehan-Guillaume de Rorthais  wrote:
> 
> On Mon, 7 Sep 2020 23:46:17 +0300
> Олег Самойлов  wrote:
> 
>>> [...]
>>>>>> 10:30:55.965 FATAL:  terminating walreceiver process dpue to
>>>>>> administrator cmd 10:30:55.966 LOG:  redo done at 0/1600C4B0
>>>>>> 10:30:55.966 LOG:  last completed transaction was at log time
>>>>>> 10:25:38.76429 10:30:55.968 LOG:  selected new timeline ID: 4
>>>>>> 10:30:56.001 LOG:  archive recovery complete
>>>>>> 10:30:56.005 LOG:  database system is ready to accept connections
>>>>> 
>>>>>> The slave with didn't reconnected replication, tuchanka3c. Also I
>>>>>> separated logs copied from the old master by a blank line:
>>>>>> 
>>>>>> [...]
>>>>>> 
>>>>>> 10:20:25.168 LOG:  database system was interrupted; last known up at
>>>>>> 10:20:19 10:20:25.180 LOG:  entering standby mode
>>>>>> 10:20:25.181 LOG:  redo starts at 0/1198
>>>>>> 10:20:25.183 LOG:  consistent recovery state reached at 0/11000A68
>>>>>> 10:20:25.183 LOG:  database system is ready to accept read only
>>>>>> connections 10:20:25.193 LOG:  started streaming WAL from primary at
>>>>>> 0/1200 on tl 3 10:25:05.370 LOG:  could not send data to client:
>>>>>> Connection reset by peer 10:26:38.655 FATAL:  terminating walreceiver
>>>>>> due to timeout 10:26:38.655 LOG:  record with incorrect prev-link
>>>>>> 0/1200C4B0 at 0/1600C4D8
>>>>> 
>>>>> This message appear before the effective promotion of tuchanka3b. Do you
>>>>> have logs about what happen *after* the promotion?
>>>> 
>>>> This is end of the slave log. Nothing. Just absent replication.  
>>> 
>>> This is unusual. Could you log some more details about replication
>>> tryouts to your PostgreSQL logs? Set log_replication_commands and lower
>>> log_min_messages to debug ?  
>> 
>> Sure, this is PostgreSQL logs for the cluster tuchanka3.
>> Tuchanka3a is an old (failed) master.
> 
> According to your logs:
> 
> 20:29:41 tuchanka3a: freeze
> 20:30:39 tuchanka3c: wal receiver timeout (default 60s timeout)
> 20:30:39 tuchanka3c: switched to archives, and error'ed (expected)
> 20:30:39 tuchanka3c: switched to stream again (expected)
> no more news from this new wal receiver 
> 20:34:21 tuchanka3b: promoted
> 
> I'm not sure where your floating IP is located at 20:30:39, but I suppose it
> is still on tuchanka3a as the wal receiver don't hit any connection error and
> tuchanka3b is not promoted yet.

I think so.

> 
> So at this point, I suppose the wal receiver is stuck in libpqrcv_connect
> waiting for frozen tuchanka3a to answer, with no connection timeout. You might
> track tcp sockets on tuchanka3a to confirm this.

I don't know how to do this.

> 
> To avoid such a wait, try to add eg. connect_timeout=2 to your 
> primary_conninfo
> parameter. See:
> https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-PARAMKEYWORDS

Nope, this was not enough. But I went further and added TCP keepalive options 
too. So now the paf file, for instance on tuchanka3c, is:

# recovery.conf for krogan3, pgsqlms pacemaker module
primary_conninfo = 'host=krogan3 user=replicant application_name=tuchanka3c 
connect_timeout=5 keepalives=1 keepalives_idle=1 keepalives_interval=3 
keepalives_count=3'
recovery_target_timeline = 'latest'
standby_mode = 'on'

And now the problem with PostgreSQL-STOP is solved. But I am surprised: why was this 
needed? I thought that wal_receiver_timeout should be enough for this case.

I need some more time to check this solution with other tests.



Re: Tuchanka

2020-09-07 Thread Олег Самойлов



> On 3 Sep 2020, at 17:25, Jehan-Guillaume de Rorthais  wrote:
> 
> On Wed, 2 Sep 2020 20:33:09 +0300
> Олег Самойлов  wrote:
> 
>> The software is rather outdated. It works with PostgreSQL 11 and CentOS 7.
>> The next step will be upgrading to CentOS 8 and PostgreSQL 12. Please tell
>> me, is it useful and worth to continue? Where is better announce it? May be
>> somewhere exists special mailing list for such things.
> 
> I answered on Pacemaker mailing list with a couple of links and questions
> to discuss before answering if it useful to continue.
> 
> Thanks!

Okay, let's move to the Pacemaker mailing list.



Re: BUG? Slave don't reconnect to the master

2020-09-07 Thread Олег Самойлов
Hi

> On 3 Sep 2020, at 14:39, Jehan-Guillaume de Rorthais  wrote:
> On Mon, 24 Aug 2020 18:45:42 +0300
> Олег Самойлов  wrote:
> 
>>> On 21 Aug 2020, at 17:26, Jehan-Guillaume de Rorthais wrote:
>>> 
>>> On Thu, 20 Aug 2020 15:16:10 +0300
>>> Based on setup per node, you can probably add
>>> 'synchronous_commit=remote_write' in the common conf.  
>> 
>> Nope. I set 'synchronous_commit=remote_write' only for 3 and 4 node clusters.
>> [...]
> 
> Then I suppose your previous message had an error as it shows three
> nodes tuchanka3a, tuchanka3b and tuchanka3c (no 4th node), all with 
> remote_write
> in krogan3.conf. But anyway.

I tested 4 different types of clusters. Clusters 1 and 2 have two nodes and thus don't 
reveal this bug. Clusters 3 and 4 have 3 and 4 nodes, and there this bug is observed. I 
used cluster 3 as the example.

> 
>>>> [...]
>>>> pacemaker config, specific for this cluster:
>>>> [...]  
>>> 
>>> why did you add "monitor interval=15"? No harm, but it is redundant with
>>> "monitor interval=16 role=Master" and "monitor interval=17 role=Slave".  
>> 
>> I can't remember clearly. :) Look what happens without it.
>> 
>> + pcs -f configured_cib.xml resource create krogan2DB ocf:heartbeat:pgsqlms
>> bindir=/usr/pgsql-11/bin pgdata=/var/lib/pgsql/krogan2
>> recovery_template=/var/lib/pgsql/krogan2.paf meta master notify=true
>> resource-stickiness=10
>> Warning: changing a monitor operation interval from 15 to 16 to make the
>> operation unique
>> Warning: changing a monitor operation interval from 16 to 17 to make the
>> operation unique
> 
> Something fishy here. This command lack op monitor settings. Pacemaker don't
> add any default monitor operation with default interval if you don't give one
> at resource creation.
> 
> If you create such a resource with no monitoring, the cluster will start/stop
> it when needed, but will NOT check for its health. See: 
> 
> https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/2.0/html/Pacemaker_Explained/s-resource-monitoring.html

Maybe. But keep in mind that I use `pcs`; I do not edit the XML file directly. And I 
use a rather old Pacemaker: the default package of CentOS 7 is 
pacemaker-1.1.21-4.el7.x86_64, while your documentation link is for Pacemaker 2.0.

But never mind, this does not concern the discussed bug. 

> 
>> So trivial monitor always exists by default with interval 15.
> 
> nope.

This is not true for CentOS 7. I removed my monitor options for this example.

pcs cluster cib original_cib.xml
cp original_cib.xml configured_cib.xml
pcs -f configured_cib.xml resource create krogan3DB ocf:heartbeat:pgsqlms 
bindir=/usr/pgsql-11/bin pgdata=/var/lib/pgsql/krogan3 
recovery_template=/var/lib/pgsql/krogan3.paf meta master notify=true 
resource-stickiness=10
pcs -f configured_cib.xml resource create krogan3IP ocf:heartbeat:IPaddr2 
nic=eth0 cidr_netmask=24 ip=192.168.89.35
pcs -f configured_cib.xml resource create krogan3s1IP ocf:heartbeat:IPaddr2 
nic=eth0 cidr_netmask=24 ip=192.168.89.36
pcs -f configured_cib.xml resource create krogan3s2IP ocf:heartbeat:IPaddr2 
nic=eth0 cidr_netmask=24 ip=192.168.89.37
pcs -f configured_cib.xml constraint colocation add krogan3IP with master 
krogan3DB-master INFINITY
pcs -f configured_cib.xml constraint order promote krogan3DB-master then start 
krogan3IP symmetrical=false
pcs -f configured_cib.xml constraint order demote krogan3DB-master then stop 
krogan3IP symmetrical=false kind=Optional
pcs -f configured_cib.xml constraint location krogan3s1IP rule score=-INFINITY 
master-krogan3DB lt integer 0
pcs -f configured_cib.xml constraint location krogan3s2IP rule score=-INFINITY 
master-krogan3DB lt integer 0
pcs -f configured_cib.xml constraint colocation add krogan3s1IP with slave 
krogan3DB-master INFINITY
pcs -f configured_cib.xml constraint colocation add krogan3s2IP with slave 
krogan3DB-master INFINITY
pcs -f configured_cib.xml constraint colocation add krogan3s1IP with 
krogan3s2IP -1000
pcs -f configured_cib.xml constraint order start krogan3DB-master then start 
krogan3s1IP
pcs -f configured_cib.xml constraint order start krogan3DB-master then start 
krogan3s2IP
pcs cluster cib-push configured_cib.xml --wait diff-against=original_cib.xml

13:44:27 j0 root@tuchanka3a:~
# pcs resource show krogan3DB-master
 Master: krogan3DB-master
  Meta Attrs: notify=true resource-stickiness=10
  Resource: krogan3DB (class=ocf provider=heartbeat type=pgsqlms)
   Attributes: bindir=/usr/pgsql-11/bin pgdata=/var/lib/pgsql/krogan3 
recovery_template=/var/lib/pgsql/krogan3.paf
   Operations: demote interval=0s timeout=120 (krogan3DB-demote-interval-0s)
   methods interval=0s timeout=5 (krogan3D

Tuchanka

2020-09-02 Thread Олег Самойлов
Hi all.

I have developed a test bed to test high availability clusters based on Pacemaker 
and PostgreSQL. The phrase "test bed" was given to me by a dictionary. For a 
Russian this is rather funny, so please tell me whether this is a suitable phrase 
for it. The test bed is deployed on VirtualBox virtual machines (VMs) on a MacBook 
Pro. In total there will be 12 VMs, which will occupy 36GiB of hard disk. They will 
form 4 high availability clusters (different variants). The clusters are created 
automatically and can be tested automatically. A special script imitates different 
faults in a loop, waits for restoration of the cluster, fixes the broken node, and 
runs the next test. The project is under the MIT license on GitHub and I have just 
finished translating the README into English.

https://github.com/domclick/tuchanka

This test bed can be used to test HA clusters. There is a list of already 
detected problems of Pacemaker and PostgreSQL in the README. And it can be used 
for presentations; that's why it is designed to run inside one MacBook Pro. I 
think this is much better than screenshots or video for showing how HA 
clusters survive different faults in real time.

The software is rather outdated. It works with PostgreSQL 11 and CentOS 7. The 
next step will be upgrading to CentOS 8 and PostgreSQL 12. Please tell me, is 
it useful and worth continuing? Where is it better to announce it? Maybe there is a 
special mailing list somewhere for such things.



Re: BUG? Slave don't reconnect to the master

2020-08-24 Thread Олег Самойлов



> On 21 Aug 2020, at 17:26, Jehan-Guillaume de Rorthais wrote:
> 
> On Thu, 20 Aug 2020 15:16:10 +0300
> Based on setup per node, you can probably add
> 'synchronous_commit=remote_write' in the common conf.

Nope. I set 'synchronous_commit=remote_write' only for the 3- and 4-node clusters. 
The two-node clusters don't have this setting (they use the default), because they 
don't use synchronous commit.

> 
>> [...]
>> pacemaker config, specific for this cluster:
>> [...]
> 
> why did you add "monitor interval=15"? No harm, but it is redundant with
> "monitor interval=16 role=Master" and "monitor interval=17 role=Slave".

I can't remember clearly. :) Look what happens without it.

+ pcs -f configured_cib.xml resource create krogan2DB ocf:heartbeat:pgsqlms 
bindir=/usr/pgsql-11/bin pgdata=/var/lib/pgsql/krogan2 
recovery_template=/var/lib/pgsql/krogan2.paf meta master notify=true 
resource-stickiness=10
Warning: changing a monitor operation interval from 15 to 16 to make the 
operation unique
Warning: changing a monitor operation interval from 16 to 17 to make the 
operation unique

So a trivial monitor always exists by default with interval 15.

My real command is:
pcs -f configured_cib.xml resource create krogan2DB ocf:heartbeat:pgsqlms 
bindir=/usr/pgsql-11/bin pgdata=/var/lib/pgsql/krogan2 
recovery_template=/var/lib/pgsql/krogan2.paf op monitor interval=15 timeout=10 
monitor interval=16 role=Master timeout=15 monitor interval=17 role=Slave 
timeout=10 meta master notify=true resource-stickiness=10

It looked like I needed to add all of this to change the "timeout" parameter for the 
monitor operations, and I needed the interval parameter to point to the specific 
monitor operation. It looked like the default timeout of 10 was not enough for the 
"master".

> By the way, nice catch to exclude RO IP when master score is
> negative.
> 
> You should probably consider putting your logs outside of your PGDATA, or
> even better: send your PostgreSQL/Pacemaker logs to a dedicated syslog node
> outside. Adding the hostname in the prefix of each log line might probably 
> help
> a lot.

I have put this into my ToDo notebook.

>> [...]
>> 10:24:55.906 LOG:  entering standby mode
>> 10:24:55.908 LOG:  redo starts at 0/1528
>> 10:24:55.909 LOG:  consistent recovery state reached at 0/15002300
>> 10:24:55.910 LOG:  database system is ready to accept read only connections
>> 10:24:55.928 LOG:  started streaming WAL from primary at 0/1600 on tl 3
>> 10:26:37.308 FATAL:  terminating walreceiver due to timeout
> 
> Timeout because of SIGSTOP on primary here.

Sure

>> 10:26:37.308 LOG:  invalid record length at 0/1600C4D8: wanted 24, got 0
>> 10:30:55.965 LOG:  received promote request
> 
> Promotion from Pacemaker here.

Yep

> What happened during more than 4 minutes between the timeout and the 
> promotion?

It's one of the problems which you may improve. :) The Pacemaker reaction is longest 
for the STOP signal test, usually near 5 minutes. Pacemaker tries different things 
(for instance "demote") and waits for different timeouts.

> 
>> 10:30:55.965 FATAL:  terminating walreceiver process due to administrator cmd
>> 10:30:55.966 LOG:  redo done at 0/1600C4B0
>> 10:30:55.966 LOG:  last completed transaction was at log time 10:25:38.76429
>> 10:30:55.968 LOG:  selected new timeline ID: 4
>> 10:30:56.001 LOG:  archive recovery complete
>> 10:30:56.005 LOG:  database system is ready to accept connections
> 
>> The slave with didn't reconnected replication, tuchanka3c. Also I separated
>> logs copied from the old master by a blank line:
>> 
>> [...]
>> 
>> 10:20:25.168 LOG:  database system was interrupted; last known up at 10:20:19
>> 10:20:25.180 LOG:  entering standby mode
>> 10:20:25.181 LOG:  redo starts at 0/1198
>> 10:20:25.183 LOG:  consistent recovery state reached at 0/11000A68
>> 10:20:25.183 LOG:  database system is ready to accept read only connections
>> 10:20:25.193 LOG:  started streaming WAL from primary at 0/1200 on tl 3
>> 10:25:05.370 LOG:  could not send data to client: Connection reset by peer
>> 10:26:38.655 FATAL:  terminating walreceiver due to timeout
>> 10:26:38.655 LOG:  record with incorrect prev-link 0/1200C4B0 at 0/1600C4D8
> 
> This message appear before the effective promotion of tuchanka3b. Do you have
> logs about what happen *after* the promotion?

This is the end of the slave log. Nothing more. Replication is simply absent.

> Reading at this error, it seems like record at 0/1600C4D8 references the
> previous one in WAL 0/1200. So the file referenced as 0/16 have either
> corrupted data or was 0/12 being recycled, but not zeroed correctly, as v11
> always do no matter what (no wal_init_zero there).

Okay, maybe in v12 it will be fixed.

> That's why I'm wondering how you built your standbys, from scratch?

By special scripts. :) This project is already on GitHub and I am waiting for the 
final decision of my boss to open it. And it will take some time to translate the 
README to English. After 

Re: is date_part immutable or not?

2020-08-21 Thread Олег Самойлов


> On 21 Aug 2020, at 16:14, Laurenz Albe wrote:
> Two approaches:
> 
> 1. Use "timestamp without time zone".

Yep, I redefined it to timestamp without time zone. Thus I lose the timezone 
information from the source, but happily there is no daylight saving time shift in my 
country now.

> 
> 2. Partition in some other way, for example BY RANGE (log_time).
>   Your list partitions don't make a lot of sense to me.

This copies the default name structure of the PostgreSQL log files.
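
For reference, a minimal sketch of the timestamp-without-time-zone redefinition
mentioned above, with the column list trimmed to the partition-relevant part (the
full column list is as in the original post):

CREATE SCHEMA IF NOT EXISTS pglog;
CREATE TABLE pglog.pglog (
    log_time  timestamp(3) without time zone,
    message   text
) PARTITION BY LIST (date_part('isodow', log_time));

-- one partition per ISO day of week, e.g. Monday:
CREATE TABLE pglog.pglog_1 PARTITION OF pglog.pglog FOR VALUES IN (1);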



Re: is date_part immutable or not?

2020-08-21 Thread Олег Самойлов
My mistake. 

Schema  | pg_catalog
Name| date_part
Result data type| double precision
Argument data types | text, timestamp with time zone
Type| func
Volatility  | stable
Parallel| safe
Owner   | postgres
Security| invoker
Access privileges   |
Language| internal
Source code | timestamptz_part
Description | extract field from timestamp with time zone

is stable, but

Schema  | pg_catalog
Name| date_part
Result data type| double precision
Argument data types | text, date
Type| func
Volatility  | immutable
Parallel| safe
Owner   | postgres
Security| invoker
Access privileges   |
Language| sql
Source code | select pg_catalog.date_part($1, cast($2 as timestamp without time zone))
Description | extract field from date

is immutable and
Schema  | pg_catalog
Name| date_part
Result data type| double precision
Argument data types | text, timestamp without time zone
Type| func
Volatility  | immutable
Parallel| safe
Owner   | postgres
Security| invoker
Access privileges   |
Language| internal
Source code | timestamp_part
Description | extract field from timestamp

is immutable too. But date_part('isodow', log_time::date) and date_part('isodow', 
log_time::timestamp) don't work either, because the cast from timestamp with time 
zone is itself not immutable.

How can this be worked around?

> On 21 Aug 2020, at 14:57, Олег Самойлов wrote:
> 
> PostgreSQL 12.4
> Just create partitioned table for PostgreSQL logs
> 
> CREATE TABLE pglog.pglog (
>   log_time timestamp(3) with time zone,
>   user_name text,
>   database_name text,
>   process_id integer,
>   connection_from text,
>   session_id text,
>   session_line_num bigint,
>   command_tag text,
>   session_start_time timestamp with time zone,
>   virtual_transaction_id text,
>   transaction_id bigint,
>   error_severity text,
>   sql_state_code text,
>   message text,
>   detail text,
>   hint text,
>   internal_query text,
>   internal_query_pos integer,
>   context text,
>   query text,
>   query_pos integer,
>   location text,
>   application_name text
> ) PARTITION BY LIST (date_part('isodow', log_time));
> 
> ERROR:  functions in partition key expression must be marked IMMUTABLE
> 
> But, date_part is immutable
> 
> Schema  | pg_catalog
> Name| date_part
> Result data type| double precision
> Argument data types | text, time with time zone
> Type| func
> Volatility  | immutable
> Parallel| safe
> Owner   | postgres
> Security| invoker
> Access privileges   |
> Language| internal
> Source code | timetz_part
> Description | extract field from time with time zone
> 
> What is wrong here?
> 
> 





is date_part immutable or not?

2020-08-21 Thread Олег Самойлов
PostgreSQL 12.4
Just create partitioned table for PostgreSQL logs

CREATE TABLE pglog.pglog (
   log_time timestamp(3) with time zone,
   user_name text,
   database_name text,
   process_id integer,
   connection_from text,
   session_id text,
   session_line_num bigint,
   command_tag text,
   session_start_time timestamp with time zone,
   virtual_transaction_id text,
   transaction_id bigint,
   error_severity text,
   sql_state_code text,
   message text,
   detail text,
   hint text,
   internal_query text,
   internal_query_pos integer,
   context text,
   query text,
   query_pos integer,
   location text,
   application_name text
) PARTITION BY LIST (date_part('isodow', log_time));

ERROR:  functions in partition key expression must be marked IMMUTABLE

But, date_part is immutable

Schema  | pg_catalog
Name| date_part
Result data type| double precision
Argument data types | text, time with time zone
Type| func
Volatility  | immutable
Parallel| safe
Owner   | postgres
Security| invoker
Access privileges   |
Language| internal
Source code | timetz_part
Description | extract field from time with time zone

What is wrong here?




Re: BUG? Slave don't reconnect to the master

2020-08-20 Thread Олег Самойлов



> On 19 Aug 2020, at 16:07, Jehan-Guillaume de Rorthais wrote:
> 
> On Tue, 18 Aug 2020 13:48:41 +0300
> Олег Самойлов  wrote:
> 
>> Hi all.
>> 
>> I found some strange behaviour of postgres, which I recognise as a bug. First
>> of all, let me explain situation.
>> 
>> I created a "test bed" (not sure how to call it right), to test high
>> availability clusters based on Pacemaker and PostgreSQL. The test bed consist
>> of 12 virtual machines (on VirtualBox) runing on a MacBook Pro and formed 4
>> HA clusters with different structure. And all 4 HA cluster constantly tested
>> in loop: simulated failures with different nature, waited for rising
>> fall-over, fixing, and so on. For simplicity I'll explain only one HA
>> cluster.
>> This is 3 virtual machines, with master on one, and sync and async
>> slaves on other. The PostgreSQL service is provided by float IPs pointed to
>> working master and slaves. Slaves are connected to the master float IP too.
>> When the pacemaker detects a failure, for instance, on the master, it promote
>> a master on other node with lowest latency WAL and switches float IPs, so the
>> third node keeping be a sync slave. My company decided to open this project
>> as an open source, now I am finishing formality.
> 
> As the maintainer of PAF[1], I'm looking forward to discover it :)
> Do not hesitate to ping me offlist as well in regard with Pacemaker and
> resource agents.

I am glad to see you here. Many thanks for the PAF module.

> 
>> Almost works fine, but sometimes, rather rare, I detected that a slave don't
>> reconnect to the new master after a failure. First case is PostgreSQL-STOP,
>> when I `kill` by STOP signal postgres on the master to simulate freeze. The
>> slave don't reconnect to the new master with errors in log:
>> 
>> 18:02:56.236 [3154] FATAL:  terminating walreceiver due to timeout
>> 18:02:56.237 [1421] LOG:  record with incorrect prev-link 0/1600DDE8 at
>> 0/1A00DE10
> 
> Do you have more logs from both side of the replication?

There was nothing special. Just errors from previous tests.

> How do you build your standbys?
Okay, I'll show you all the configs. Linux CentOS 7, PostgreSQL 11 from the 
official Postgres repository.
# rpm -q corosync
corosync-2.4.5-4.el7.x86_64
# rpm -q pacemaker
pacemaker-1.1.21-4.el7.x86_64
# rpm -q sbd
sbd-1.4.0-15.el7.x86_64

The cluster is designed for three datacenters with good connections between them, 
in one city, so it must survive the crash of one datacenter. So stonith is not used; 
instead I use quorum and an sbd-based watchdog. Node names: tuchanka3a, 
tuchanka3b, tuchanka3c. Float IP names: krogan3 for the master; krogan3s1, 
krogan3s2 for the slaves.

postgresql common conf (not default options):

ident_file = '/var/lib/pgsql/pg_ident.conf'
hba_file = '/var/lib/pgsql/pg_hba.conf'
listen_addresses = '*'
log_filename = 'postgresql.%F.log'
wal_keep_segments = 1
restart_after_crash = off   
wal_receiver_timeout=0 # for the PostgreSQL-STOP case; wal_receiver_timeout is 
60s by default, of course
shared_buffers = 32MB
max_wal_size=80MB

postgresql conf specific for nodes:
for tuchanka3a:

$ cat krogan3.conf
synchronous_commit = remote_write
synchronous_standby_names = 'ANY 1 (tuchanka3b,tuchanka3c)'

$ cat krogan3.paf
primary_conninfo = 'host=krogan3 user=replicant application_name=tuchanka3a 
sslmode=disable'
recovery_target_timeline = 'latest'
standby_mode = 'on'

for tucanka3b:
$ cat krogan3.conf
synchronous_commit = remote_write
synchronous_standby_names = 'ANY 1 (tuchanka3a,tuchanka3c)'

$ cat krogan3.paf
primary_conninfo = 'host=krogan3 user=replicant application_name=tuchanka3b 
sslmode=disable'
recovery_target_timeline = 'latest'
standby_mode = 'on'

for tuchanka3c:
$ cat krogan3.conf
synchronous_commit = remote_write
synchronous_standby_names = 'ANY 1 (tuchanka3a,tuchanka3b)'

$ cat krogan3.paf
primary_conninfo = 'host=krogan3 user=replicant application_name=tuchanka3c 
sslmode=disable'
recovery_target_timeline = 'latest'
standby_mode = 'on'

pacemaker config, specific for this cluster:

pcs cluster cib original_cib.xml
cp original_cib.xml configured_cib.xml
pcs -f configured_cib.xml resource create krogan3DB ocf:heartbeat:pgsqlms 
bindir=/usr/pgsql-11/bin pgdata=/var/lib/pgsql/krogan3 
recovery_template=/var/lib/pgsql/krogan3.paf op monitor interval=15 timeout=10 
monitor interval=16 role=Master timeout=20 monitor interval=17 role=Slave 
timeout=10 meta master notify=true resource-stickiness=10
pcs -f configured_cib.xml resource create krogan3IP ocf:heartbeat:IPaddr2 
nic=eth0 cidr_netmask=24 ip=192.168.89.35
pcs -f configured_cib.xml resource create krogan3s1IP ocf:heartbeat:IPaddr2 
nic=eth0 cidr_netmask=24 ip=192.168.89.36
pcs -f configured_cib.xml resource create krogan3s2IP ocf:heartbeat:IPadd

BUG? Slave don't reconnect to the master

2020-08-18 Thread Олег Самойлов
Hi all.

I found some strange behaviour of Postgres which I recognise as a bug. First 
of all, let me explain the situation.

I created a "test bed" (not sure whether that is the right term) to test high 
availability clusters based on Pacemaker and PostgreSQL. The test bed consists 
of 12 virtual machines (on VirtualBox) running on a MacBook Pro and forms 4 HA 
clusters with different structures. All 4 HA clusters are constantly tested in a 
loop: failures of different natures are simulated, the script waits for failover to 
complete, fixes the broken node, and so on. For simplicity I'll explain only one HA 
cluster. It is 3 virtual machines, with the master on one, and sync and async 
slaves on the others. The PostgreSQL service is provided by floating IPs pointed at 
the working master and slaves. Slaves are connected to the master's floating IP too. 
When Pacemaker detects a failure, for instance on the master, it promotes a master on 
the other node with the lowest WAL latency and switches the floating IPs, so the 
third node keeps being a sync slave. My company decided to open this project as open 
source; now I am finishing the formalities.

It almost works fine, but sometimes, rather rarely, I detected that a slave doesn't 
reconnect to the new master after a failure. The first case is PostgreSQL-STOP, 
when I `kill` postgres on the master with the STOP signal to simulate a freeze. The 
slave doesn't reconnect to the new master, with errors in the log:

18:02:56.236 [3154] FATAL:  terminating walreceiver due to timeout
18:02:56.237 [1421] LOG:  record with incorrect prev-link 0/1600DDE8 at 
0/1A00DE10

What is strange is that the error about incorrect WAL is raised after the termination 
of the connection. Well, this can be worked around by turning off the wal receiver 
timeout. Now PostgreSQL-STOP works fine, but the problem still exists with another 
test. ForkBomb simulates an out-of-memory situation. In this case a slave 
sometimes doesn't reconnect to the new master either, with errors in the log:

10:09:43.99 [1417] FATAL:  could not receive data from WAL stream: server 
closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
10:09:43.992 [1413] LOG:  invalid record length at 0/D8014278: wanted 24, got 0

The last error message (the last row in the log) varied between runs, by the way.

What I expect as the right behaviour: the PostgreSQL slave must reconnect to the 
master IP (the floating IP) after wal_retrieve_retry_interval.



a prefix searching

2020-04-21 Thread Олег Самойлов
I found some interesting information which I want to share. When I analysed the 
ChangeLog I found:

Add prefix-match operator text ^@ text, which is supported by SP-GiST (Ildus 
Kurbangaliev)
This is similar to using var LIKE 'word%' with a btree index, but it is more 
efficient.

It was interesting for me, because searching by prefix happens quite often and I 
knew nothing about such an operator. This operator is based on the internal function 
starts_with() and is supported by SP-GiST.

The information in the official documentation is really poor.

https://www.postgresql.org/docs/12/functions-string.html
The section about string operators says nothing about "^@", only about the 
internal function starts_with(), which cannot be accelerated by an index.

https://www.postgresql.org/docs/12/functions-matching.html
But the section about pattern matching, in the subsection about "LIKE", mentions 
both the "^@" operator and the internal starts_with(). This is a surprise, because 
"^@" has nothing to do with LIKE or pattern matching. Also nothing is said about it 
being useful only with an SP-GiST index.
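
For reference, a minimal sketch of the SP-GiST usage hinted at above (it mirrors the
pg_proc-based tests below; the table name is illustrative and the expected plan is an
assumption):

=> create table test_spgist (oid oid primary key, proname text not null);
=> insert into test_spgist select oid, proname from pg_proc;
=> create index on test_spgist using spgist (proname);
=> analyze test_spgist;
=> explain select * from test_spgist where proname ^@ 'bool';
-- expected: an Index Scan using the SP-GiST index instead of a Seq Scan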

So I decided to investigate by myself. I used data from pg_proc for my tests.

First, check a btree index:

=> create table test_btree (oid oid primary key,proname text not null);
CREATE TABLE
=> insert into test_btree select oid,proname from pg_proc;
INSERT 0 2960
=> create index on test_btree (proname);
CREATE INDEX
=> analyze test_btree;
ANALYZE
=> explain select * from test_btree where proname ^@ 'bool';
 QUERY PLAN
-
 Seq Scan on test_btree  (cost=0.00..55.00 rows=27 width=17)
   Filter: (proname ^@ 'bool'::text)
(2 rows)
=> explain select * from test_btree where starts_with(proname,'bool');
  QUERY PLAN
--
 Seq Scan on test_btree  (cost=0.00..55.00 rows=987 width=17)
   Filter: starts_with(proname, 'bool'::text)
(2 rows)
=> explain select * from test_btree where proname like 'bool%';
 QUERY PLAN
-
 Seq Scan on test_btree  (cost=0.00..55.00 rows=27 width=17)
   Filter: (proname ~~ 'bool%'::text)
(2 rows)

=> explain select * from test_btree where proname similar to 'bool%';
 QUERY PLAN
-
 Seq Scan on test_btree  (cost=0.00..55.00 rows=27 width=17)
   Filter: (proname ~ '^(?:bool.*)$'::text)
(2 rows)
=> explain select * from test_btree where proname ~ '^bool.*';
 QUERY PLAN
-
 Seq Scan on test_btree  (cost=0.00..55.00 rows=27 width=17)
   Filter: (proname ~ '^bool.*'::text)
(2 rows)

This is a surprise, but no form of prefix search works with a common btree index 
based on the local collation. But why not? Prefix searching is expected to be used 
with the local collation, like all other text searching.

Next is a lifehack: use the "C" collation. I read about this in the very old 
official PostgreSQL documentation.

=> create table test_btree_c (oid oid primary key,proname text not null);
CREATE TABLE
=> insert into test_btree_c select oid,proname from pg_proc;
INSERT 0 2960
=> create index on test_btree_c (proname collate "C");
CREATE INDEX
=> analyze test_btree_c;
ANALYZE
=> explain select * from test_btree_c where proname collate "C" ^@ 'bool';
  QUERY PLAN
---
 Seq Scan on test_btree_c  (cost=0.00..55.00 rows=27 width=17)
   Filter: ((proname)::text ^@ 'bool'::text)
(2 rows)
=> explain select * from test_btree_c where starts_with(proname collate 
"C",'bool');
   QUERY PLAN

 Seq Scan on test_btree_c  (cost=0.00..55.00 rows=987 width=17)
   Filter: starts_with((proname)::text, 'bool'::text)
(2 rows)
=> explain select * from test_btree_c where proname collate "C" like 'bool%';
  QUERY PLAN
---
 Index Scan using test_btree_c_proname_idx on test_btree_c  (cost=0.28..3.26 
rows=27 width=17)
   Index Cond: (((proname)::text >= 'bool'::text) AND ((proname)::text < 
'boom'::text))
   Filter: ((proname)::text ~~ 'bool%'::text)
(3 rows)
=> explain select * from test_btree_c where proname collate "C" similar to 
'bool%';
  QUERY PLAN
---
 Index Scan using test_btree_c_proname_idx on test_btree_c  (cost=0.28..3.26 
rows=27 width=17)
   Index Cond: (((proname)::text >= 'bool'::text) AND ((proname)::text < 
'boom'::text))
   Filter: ((proname)::text ~ '^(?:bool.*)$'::text)
(3 rows)
=> explain select * from 

Re: pg_dump and public schema

2020-03-05 Thread Олег Самойлов
Thanks. I expected that a database restored from its dump would be exactly the 
same as it was before. But something in PostgreSQL changes, and not always for 
the better.

> On 4 Mar 2020, at 19:19, Adrian Klaver wrote:
> 
> I believe this is the latest information on public schema handling:
> 
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=5955d934194c3888f30318209ade71b53d29777f
> 
> -- 
> Adrian Klaver
> adrian.kla...@aklaver.com





pg_dump and public schema

2020-03-04 Thread Олег Самойлов
Hi all. PostgreSQL 12.1. Strange behaviour with pg_dump and an absent public 
schema.

I dropped the public schema and I work under a "username" schema.

=> \dn
List of schemas
 Name  | Owner
---+---
 olleg | olleg
(1 row)

Dump now

pg_dump -U postgres -C olleg >dump.sql

--
-- PostgreSQL database dump
--

-- Dumped from database version 12.1
-- Dumped by pg_dump version 12.1

SET statement_timeout = 0;
SET lock_timeout = 0;
SET idle_in_transaction_session_timeout = 0;
SET client_encoding = 'UTF8';
SET standard_conforming_strings = on;
SELECT pg_catalog.set_config('search_path', '', false);
SET check_function_bodies = false;
SET xmloption = content;
SET client_min_messages = warning;
SET row_security = off;

--
-- Name: olleg; Type: DATABASE; Schema: -; Owner: olleg
--

CREATE DATABASE olleg WITH TEMPLATE = template0 ENCODING = 'UTF8' LC_COLLATE = 
'ru_RU.UTF-8' LC_CTYPE = 'ru_RU.UTF-8';


ALTER DATABASE olleg OWNER TO olleg;

\connect olleg

SET statement_timeout = 0;
SET lock_timeout = 0;
SET idle_in_transaction_session_timeout = 0;
SET client_encoding = 'UTF8';
SET standard_conforming_strings = on;
SELECT pg_catalog.set_config('search_path', '', false);
SET check_function_bodies = false;
SET xmloption = content;
SET client_min_messages = warning;
SET row_security = off;

--
-- Name: olleg; Type: SCHEMA; Schema: -; Owner: olleg
--

CREATE SCHEMA olleg;


ALTER SCHEMA olleg OWNER TO olleg;

--
-- PostgreSQL database dump complete
--

recreate DB from the dump:

psql postgres postgres -f dump.sql

And now I see the public schema, which should be absent.

psql olleg olleg

=> \dn
  List of schemas
  Name  |  Owner
+--
 olleg  | olleg
 public | postgres
(2 rows)



Re: initialize and use variable in query

2020-02-05 Thread Олег Самойлов
Just refactor the DO block into a function that returns a row set and put the SELECT inside.

> On 29 Dec 2018, at 18:40, Glenn Schultz wrote:
> 
> All,
> 
> I need to initialize a variable and then use it in query.  Ultimately this 
> will part of a recursive CTE but for now I just need to work this out.  I 
> followed the docs and thought I needed something like this.  But does not 
> work-maybe I have misunderstood.  Is this possible?
> 
>   SET max_parallel_workers_per_gather = 8;
>   SET random_page_cost = 1;
>   SET enable_partitionwise_aggregate = on;
>   
>   Do $$
>   Declare startdate date;
>   BEGIN
>   startdate := (select max(fctrdt) from fnmloan);
>   END $$;
>   
>   select 
>   fnmloan.loanseqnum
>   ,fnmloan.currrpb
>   from
>   fnmloan
>   
>   join
>   fnmloan_data
>   on
>   fnmloan_data.loanseqnum = fnmloan.loanseqnum
>   
>   where
>   fnmloan.fctrdt = (select * from startdate)
> 
>   limit 10
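
A minimal sketch of that refactoring, using the names from the quoted query (the
function name and the column types are assumptions, since they are not given in the
original):

CREATE OR REPLACE FUNCTION fnmloan_sample()
RETURNS TABLE (loanseqnum text, currrpb numeric)
LANGUAGE plpgsql STABLE AS $$
DECLARE
    startdate date;
BEGIN
    -- the variable initialisation that the DO block attempted
    SELECT max(fctrdt) INTO startdate FROM fnmloan;
    RETURN QUERY
        SELECT f.loanseqnum, f.currrpb
        FROM fnmloan f
        JOIN fnmloan_data d ON d.loanseqnum = f.loanseqnum
        WHERE f.fctrdt = startdate
        LIMIT 10;
END $$;

-- usage:
SELECT * FROM fnmloan_sample();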





PgQ and NOTIFY/LISTEN

2020-01-22 Thread Олег Самойлов
Hi all.

I looked for a forum or a mailing list for PgQ, but didn't find one. Maybe someone 
here knows of one?

Does PgQ use NOTIFY/LISTEN for the PgQ internal daemon or consumers? And if not, 
why? Maybe it has its own internal simulation of the NOTIFY/LISTEN functionality?



Re: tcp keep alive don't work when the backend is busy

2019-12-11 Thread Олег Самойлов


> On 10 Dec 2019, at 18:36, Justin wrote:
> 
> Hi Oner
> 
> It appears that you are looking for a way to detect and kill off idle connections 
> or processes that are running for a long time. Correct?

Nope, not idle. Only to stop an active connection if the client has already died.





PostgreSQL vs PostgresQL

2019-12-10 Thread Олег Самойлов
My vote is to change the official name from PostgreSQL to PostgresQL. The reason is: 
many young people look at PostgreSQL and think that the short name is Postgre. This 
sounds awful, especially in Russian, and I am tired of correcting neighbours. With 
PostgresQL they will think that the short name is Postgres, and this is good. Also 
there will be the opinion that PostgresQL = Postgres Query Language, which is also 
true, since Postgres still lacks many standard SQL features.



PGUSER and initdb

2019-12-10 Thread Олег Самойлов
PGUSER doesn't work with initdb and pg_ctl initdb. Let me explain where this can 
be important. For instance, on macOS the most convenient way to manage 
PostgreSQL is Homebrew. 

To install PostgreSQL:

brew install postgresql

To upgrade:

brew upgrade postgresql

this upgrades the software; to upgrade the DB:

brew postgresql-upgrade-database

The latter command does the following: installs the previous version of PostgreSQL, 
moves the postgresql dir to postgresql.old, runs pg_upgrade, and so on. Convenient.

But this command doesn't have an option to define the name of the superuser. The only 
way to set the superuser name is the PGUSER environment variable. But 
PGUSER is understood only by pg_upgrade, not by initdb.



Re: pg_restore with connection limit 0

2019-11-27 Thread Олег Самойлов


> On 27 Nov 2019, at 18:14, Adrian Klaver wrote:

> Why not use pg_hba.conf to allow only connection from superuser for duration?

We considered this, but we don't have direct access to the Linux server, only an SQL 
connection. The application runs in Kubernetes, etc.

The simplest approach to block concurrent access to the database, maybe, is to run 
all of pg_restore inside a single transaction. But this is impossible with the 
--create option, and the --create option is the only way to restore database 
settings. And it is impossible for us anyway, because we need pg_restore in 3 stages.

It is inconvenient that pg_restore has functionality for 3 stages, but doesn't have 
functionality to block the database for the duration of these 3 stages.



pg_restore with connection limit 0

2019-11-27 Thread Олег Самойлов
Hi all.

I have a task to refresh a test database from a production database (with 
masking) on the fly. To do the masking we use pg_restore --create with a three-stage 
restoration. And one of the problems is daemons writing concurrently during 
the restoration of the database. I need to block them for the time of the 
restoration. ALLOW_CONNECTIONS doesn't work in this case, because setting it to 
false blocks the superuser too. We use CONNECTION LIMIT 0, because this blocks 
almost everyone except a superuser. But to prevent a race condition we must set this 
in the same query as CREATE DATABASE; the PostgreSQL syntax allows this. Now this is 
a rather complex procedure:

We get `pg_restore --create --list --section=pre-data` and `pg_restore ---stage 
--list --section=pre-data`, and diff them to get the difference.
We run `pg_restore --create --use-list=diff --section=pre-data` to get the SQL.
We edit the SQL: remove ALTER DATABASE ... CONNECTION LIMIT ... if it exists, and add 
CONNECTION LIMIT 0 to the CREATE DATABASE.
We send the SQL to PostgreSQL.
And so on.
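
For illustration, a minimal sketch of the CREATE DATABASE form meant above (the
database name is illustrative):

CREATE DATABASE testdb WITH TEMPLATE = template0 CONNECTION LIMIT 0;
-- ... restore as superuser while ordinary daemons cannot connect ...
ALTER DATABASE testdb CONNECTION LIMIT -1;  -- restore the default afterwards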

It would be convenient to add CONNECTION LIMIT 0 to the CREATE DATABASE simply via an 
option of pg_restore.
An even more straightforward approach would be not to use CONNECTION LIMIT 0 in this 
case, but to change ALLOW_CONNECTIONS to accept the values: false, true, superuser 
("superuser" meaning accept connections from a superuser only).



logging proxy

2019-10-31 Thread Олег Самойлов
Does anyone know of a PostgreSQL proxy which can log queries with username, IP, and 
affected rows, for security reasons? PostgreSQL itself can log almost everything, 
except affected rows.



Re: stable for each row before insert trigger

2019-10-18 Thread Олег Самойлов
Luca, I also read this section before asking the question.

> On 18 Oct 2019, at 10:15, Tom Lane wrote:
> 
> Олег Самойлов  writes:
>> According to documentation the words "VOLATILE, STABLE, IMMUTABLE" is 
>> somehow useful with trigger functions, for instance mentioned that the AFTER 
>> INSERT trigger should be VOLATILE. The question is how this words affect a 
>> for each row before insert trigger? Can be some optimisation here?
> 
> Where did you read that?  There's no optimization that considers the
> volatility of trigger functions --- they'll be called exactly when
> specified, no more or less.

Good to know. :) But there is some optimisation for triggers, which is 
mentioned in the documentation but not clearly defined.

https://www.postgresql.org/docs/current/sql-createfunction.html

> STABLE indicates  (It is inappropriate for AFTER triggers that wish to 
> query rows modified by the current command.)

So, STABLE is inappropriate for such a trigger, but is it appropriate for a BEFORE 
trigger?

Luca correctly pointed to:
https://www.postgresql.org/docs/current/trigger-datachanges.html

> If your trigger function is written in any of the standard procedural 
> languages, then the above statements apply only if the function is declared 
> VOLATILE. Functions that are declared STABLE or IMMUTABLE will not see 
> changes made by the calling command in any case.

So it would be good to put inside the right section,

https://www.postgresql.org/docs/current/xfunc-volatility.html

an exact definition of how "VOLATILE, STABLE, IMMUTABLE" affect a trigger function.

For instance, I expect that a FOR EACH ROW BEFORE trigger marked as STABLE 
will be faster than VOLATILE, without important negative side effects. I 
observed a 3% benefit. An IMMUTABLE trigger is slightly slower than STABLE, but I am 
not sure; the difference is too small.
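
For illustration, a minimal sketch of the kind of trigger meant here (table, column,
and function names are illustrative):

CREATE FUNCTION fill_created_at() RETURNS trigger
LANGUAGE plpgsql STABLE AS $$
BEGIN
    -- only reads and modifies NEW, no writes to other tables
    NEW.created_at := now();
    RETURN NEW;
END $$;

CREATE TRIGGER t_fill_created_at
    BEFORE INSERT ON some_table
    FOR EACH ROW EXECUTE FUNCTION fill_created_at();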






Re: stable for each row before insert trigger

2019-10-18 Thread Олег Самойлов
Eh, a stupid answer. Of course I read the documentation. But what about you? Do 
you know what a trigger function is? (for each row before insert)

A trigger function doesn't have parameters; instead it gets special variables, OLD 
and NEW for instance, or TG_ARGV[].

If a trigger function depends only on NEW, is it IMMUTABLE? (With the same NEW it 
must return the same changed NEW.) If a trigger function makes SELECTs and changes 
only NEW, can it be declared as STABLE? And so on.

> On 18 Oct 2019, at 2:41, Adrian Klaver wrote:
> 
> On 10/17/19 4:31 PM, Олег Самойлов wrote:
>> Hi all.
>> According to documentation the words "VOLATILE, STABLE, IMMUTABLE" is 
>> somehow useful with trigger functions, for instance mentioned that the AFTER 
>> INSERT trigger should be VOLATILE. The question is how this words affect a 
>> for each row before insert trigger? Can be some optimisation here?
> 
> https://www.postgresql.org/docs/11/xfunc-volatility.html
> 
> -- 
> Adrian Klaver
> adrian.kla...@aklaver.com





stable for each row before insert trigger

2019-10-17 Thread Олег Самойлов
Hi all.

According to the documentation, the words "VOLATILE, STABLE, IMMUTABLE" are somehow 
relevant to trigger functions; for instance, it is mentioned that an AFTER INSERT 
trigger should be VOLATILE. The question is how these words affect a FOR EACH ROW 
BEFORE INSERT trigger. Can there be some optimisation here?



wal receiver stops for 2 hour

2019-08-12 Thread Олег Самойлов
Hi all.

PostgreSQL 11.4 on CentOS 7.

I created a test bed on VirtualBox and test an HA cluster with random failures in a 
loop. Sometimes, in the case of the longest recovery from a failure, I get strange 
behaviour. The walreceiver is stopped by timeout, but is not restarted for 2 
hours. Maybe I can agree with the stop by timeout, but why does PostgreSQL wait for 
2 hours to start it again?

2019-08-12 16:34:31.118 MSK [1455] FATAL:  terminating walreceiver due to 
timeout
2019-08-12 16:34:31.119 MSK [1451] LOG:  record with incorrect prev-link 
DC7A2D84/100 at 0/D078A38
2019-08-12 18:34:50.222 MSK [14634] FATAL:  could not connect to the primary 
server: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
2019-08-12 18:34:50.234 MSK [8462] LOG:  fetching timeline history file for 
timeline 4 from primary server
2019-08-12 18:34:50.235 MSK [8462] LOG:  started streaming WAL from primary at 
0/D00 on timeline 3
2019-08-12 18:34:50.237 MSK [8462] LOG:  replication terminated by primary 
server
2019-08-12 18:34:50.237 MSK [8462] DETAIL:  End of WAL reached on timeline 3 at 
0/D078A38.
2019-08-12 18:34:50.238 MSK [1451] LOG:  new target timeline is 4
2019-08-12 18:34:50.239 MSK [8462] LOG:  restarted WAL streaming at 0/D00 
on timeline 4

Maybe the reason is the restart_after_crash=off option (recommended for HA clusters).

The postgresql config is default, except:

ident_file = '/var/lib/pgsql/pg_ident.conf'
hba_file = '/var/lib/pgsql/pg_hba.conf'
listen_addresses = '*'
log_filename = 'postgresql.%F.log'  # log file name pattern,
wal_keep_segments = 1
restart_after_crash = off

shared_buffers = 256MB
# may be this is a good compromise
synchronous_commit = remote_write
# other DC is the first, our is the last (this is different for each node)
synchronous_standby_names = 'FIRST 1 (tuchanka2a,tuchanka2c,tuchanka2b)'

And for slaves additionally:

primary_conninfo = 'host=krogan2 user=replicant application_name=tuchanka2d 
sslmode=disable'
recovery_target_timeline = 'latest'
standby_mode = 'on'



Re: PostgreSQL lock file

2019-08-12 Thread Олег Самойлов


> Best practice for this sort of thing is considered to be to run the two
> Postgres instances under different user IDs.  That avoids them thinking
> that the other one's lock files are relevant, and it provides an extra
> layer of security against possible conflicts between the two instances.
> 
>   regards, tom lane

Thanks, an interesting point. I already changed the start script to check the 
instance with pg_ctl status and delete the lock file if the status is non-zero. 
But your recommendation looks better.



PostgreSQL lock file

2019-08-09 Thread Олег Самойлов
Hi all.

I have a task to create an HA PostgreSQL cross-datacenter cluster. I created a 
test bed on VirtualBox and I simulate different failures and disasters in loops. 
And I got a funny bug in PostgreSQL (11.4). One of the clusters has two instances 
of PostgreSQL, on ports 5433 and 5434. After simulating an unexpected power-off of 
the node, the lock files of PostgreSQL still exist in the /tmp directory. On 
recovery the first instance on port 5433 starts without problems, but the second 
instance sometimes does not. The reason is that the PID in /tmp/.s.PGSQL.5434.lock 
sometimes points to one of the processes of the PostgreSQL on port 5433, for 
instance the logger.

 1408 ?S  0:00 /usr/pgsql-11/bin/postgres -D /var/lib/pgsql/krogan0a
 1415 ?Ss 0:00  \_ postgres: logger
 1416 ?Ss 0:02  \_ postgres: startup   recovering 
00030007
 1417 ?Ss 0:00  \_ postgres: checkpointer
 1418 ?Ss 0:00  \_ postgres: background writer
 1419 ?Ss 0:00  \_ postgres: stats collector
 1420 ?Ss 0:10  \_ postgres: walreceiver   streaming 0/7269290

# cat /tmp/.s.PGSQL.5434.lock
1415
/var/lib/pgsql/krogan0b
1565355860
5434
/tmp

# sudo -u postgres /usr/pgsql-11/bin/pg_ctl -D /var/lib/pgsql/krogan0b start
waiting for server to start2019-08-09 16:52:25.685 MSK [6673] LOG:  
listening on IPv4 address "0.0.0.0", port 5434
2019-08-09 16:52:25.685 MSK [6673] LOG:  listening on IPv6 address "::", port 
5434
2019-08-09 16:52:25.686 MSK [6673] LOG:  listening on Unix socket 
"/var/run/postgresql/.s.PGSQL.5434"
2019-08-09 16:52:25.686 MSK [6673] FATAL:  lock file "/tmp/.s.PGSQL.5434.lock" 
already exists
2019-08-09 16:52:25.686 MSK [6673] HINT:  Is another postmaster (PID 1415) 
using socket file "/tmp/.s.PGSQL.5434"?
2019-08-09 16:52:25.686 MSK [6673] LOG:  database system is shut down
 stopped waiting
pg_ctl: could not start server
Examine the log output.

So it would be good if postgres checked not only the PID, but also whether the 
other postgres is running on the same port or a different one. And is there some 
workaround?



Re: shared_buffers on Big RAM systems

2019-04-11 Thread Олег Самойлов
I tested this. Shared buffers work better than the OS-level filesystem cache. The 
more shared_buffers (but less than the database size), the better. With 
huge_pages it is even better. But you must reserve enough free memory for the OS 
and for PostgreSQL itself.

> On 13 Dec 2018, at 18:17, Ron wrote:
> 
> Hi,
> 
> https://www.postgresql.org/docs/9.6/runtime-config-resource.html
> 
> The docs say, "If you have a dedicated database server with 1GB or more of 
> RAM, a reasonable starting value for shared_buffers is 25%".
> 
> But that's pretty archaic in 2018.  What if the dedicated database server has 
> 128GB RAM?
> 
> Thanks.
> 
> -- 
> Angular momentum makes the world go 'round.
> 





Re: Invoking user of the function with SECURITY DEFINER

2019-04-10 Thread Олег Самойлов
Looks like a bug.

> On 25 Nov 2018, at 14:50, Madan Kumar wrote:
> 
> Got it..
> In my case i was getting session_user in declare section and trying to 
> validate later which always resulted in the function owner.
>  DECLARE
>   user text := SESSION_USER;
> 
> So using it within the BEGIN; ...; END; clause worked for me.
> Thanks.
> 
> 
> Warm Regards,
> Madan Kumar K
>  





Re: New timeline when starting with a restored data dir

2019-04-10 Thread Олег Самойлов
It’s normal behaviour to create a new timeline when a PostgreSQL slave becomes 
a new master. This is what timelines were designed for.

> On 19 Nov 2018, at 0:50, Yuri Kanivetsky wrote:
> 
> Hi,
> 
> I'm trying to set up Barman, and I've noticed that PostgreSQL 10
> creates a new timeline when starting with a restored data dir.
> Particularly, when WAL files are delivered via restore_command. When
> WAL files are copied to pg_wal dir, no new timeline is created. Is it
> to be expected? Can you explain?
> 
> I decided to ask here first, since you might know better how
> PostgreSQL operates.
> 
> Thanks in advance.
> 





Re: Performance of ByteA: ascii vs binary

2019-04-10 Thread Олег Самойлов
Just a theoretical assumption. PostgreSQL may optimise the internal format of the 
data and can sometimes compress it. ASCII data compresses better than random 
binary data. Also, PostgreSQL sometimes decides to keep a column out of line (in 
the TOAST table) if the column is still too big after compression. I don’t 
know what exactly happens in your case, but this can be a reason.
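
A possible way to check whether compression is involved (a sketch, assuming a 
hypothetical table t with a bytea column b):

-- octet_length() is the raw size of the value, pg_column_size() the size as stored;
-- a noticeably smaller stored size suggests the value was compressed
SELECT octet_length(b) AS raw_bytes,
       pg_column_size(b) AS stored_bytes
FROM t
LIMIT 10;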

> On 18 Mar 2019, at 17:33, Thomas Güttler wrote:
> 
> I did some benchmarking and in my setup there was major
> performance difference.
> 
> I tested a ByteA column.
> 
> If I used ascii data the tests took 52 seconds.
> If I used random binary data the test took 250 seconds.
> 
> binary data is (roughly) five times slower than ascii data?
> 
> Is this a know fact, or is there something wrong with my benchmark?
> 
> I used Python and psycopg2.
> 
> Regards,
>  Thomas Güttler
> 
> 
> -- 
> Thomas Guettler http://www.thomas-guettler.de/
> I am looking for feedback: https://github.com/guettli/programming-guidelines
> 





Re: PK and FK using Hash index

2019-04-10 Thread Олег Самойлов


> On 22 Mar 2019, at 22:38, PegoraroF10 wrote:
> 
> On version 10 and 11 hash index was redesigned and now seems to be better
> than btree. 
> Their speed, size and memory consuption are better, so ... 
> Why is not possible to have all PK and FK just using hash indices ? The only
> thing I need on a PK and FK is search and join, so hash index responds very
> well.
> 
> I know a hash doesn´t accept unique index, so I cannot create a primary key
> on it. 
> But I think it would be great, isn´t it ?
> 
> There is something wrong with my thought ? 
> If not, there is a way to have PK/FH structure using Hash index ?
> 
> 
> 
> --
> Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html
> 

The hash index is not as good as expected. Some strange behaviour is observed even 
in version 10:
https://www.postgresql.org/message-id/A841C4BC-A878-497E-AD9B-4DE0830DCC68%40ya.ru
Also, btree keeps the values inside the index, so the uniqueness check can be done 
with an index-only scan. The hash index cannot.
Also, the hash index doesn’t support multiple columns, without an obvious reason 
to me. This could be done simply: just compute a hash for every column and XOR them.

I think the hash index is perfect for a natural primary key (text-based, for 
instance), but only in theory. The PostgreSQL implementation still has too many 
problems.
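
For illustration, a sketch of what is possible today (hypothetical table name): 
the primary key itself has to be backed by a btree, but an additional hash index 
can still serve equality lookups and joins:

CREATE TABLE account (
    login text PRIMARY KEY   -- the PK constraint is always backed by a btree
);
CREATE INDEX account_login_hash ON account USING hash (login);

-- equality predicates like this one can use the hash index
SELECT * FROM account WHERE login = 'someone';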



lost "left join"

2019-01-16 Thread Олег Самойлов
Hi, all.

I got some mysterious behaviour from PostgreSQL, perhaps this is a bug.

The tables model a simple graph where parents and children are related as «many 
to many». And there is a field «valid» to check validity at the current moment.

set search_path = 'left_join';
begin;
drop schema if exists left_join cascade;
create schema left_join;
create table node (
   node_id integer primary key generated always as identity,
   valid daterange not null
);
create table link (
   parent integer not null references node(node_id),
   child integer not null references node(node_id),
   valid daterange not null,
   primary key (parent,child)
);
insert into node (node_id,valid) values 
(default,daterange('2019-01-01','2020-01-01'));
commit;

Now I want to get one node and all its children, if they exist:

=> explain select * from node as parent left join link on 
parent.node_id=link.parent left join node as child on link.child=child.node_id 
where parent.node_id=1;
QUERY PLAN
--
 Nested Loop Left Join  (cost=4.50..26.76 rows=1 width=112)
   ->  Nested Loop Left Join  (cost=4.35..21.91 rows=1 width=76)
 Join Filter: (parent.node_id = link.parent)
 ->  Index Scan using node_pkey on node parent  (cost=0.15..8.17 rows=1 
width=36)
   Index Cond: (node_id = 1)
 ->  Bitmap Heap Scan on link  (cost=4.20..13.67 rows=6 width=40)
   Recheck Cond: (parent = 1)
   ->  Bitmap Index Scan on link_pkey  (cost=0.00..4.20 rows=6 
width=0)
 Index Cond: (parent = 1)
   ->  Index Scan using node_pkey on node child  (cost=0.15..4.84 rows=1 
width=36)
 Index Cond: (link.child = node_id)
(11 rows)

All is fine, there is a «Left Join» in the plan. And the result is exactly what I 
want:

=> select * from node as parent left join link on parent.node_id=link.parent 
left join node as child on link.child=child.node_id where parent.node_id=1;
 node_id |  valid  | parent | child | valid | node_id | valid
-+-++---+---+-+---
   1 | [2019-01-01,2020-01-01) ||   |   | |
(1 row)

But things get strange when I add validation by date.

=> explain select * from node as parent left join link on 
parent.node_id=link.parent left join node as child on link.child=child.node_id 
where parent.node_id=1 and current_date <@ parent.valid and current_date <@ 
link.valid and current_date <@ child.valid;
QUERY PLAN
--
 Nested Loop  (cost=4.50..32.35 rows=1 width=112)
   ->  Nested Loop  (cost=4.35..21.88 rows=1 width=76)
 ->  Index Scan using node_pkey on node parent  (cost=0.15..8.18 rows=1 
width=36)
   Index Cond: (node_id = 1)
   Filter: (CURRENT_DATE <@ valid)
 ->  Bitmap Heap Scan on link  (cost=4.20..13.70 rows=1 width=40)
   Recheck Cond: (parent = 1)
   Filter: (CURRENT_DATE <@ valid)
   ->  Bitmap Index Scan on link_pkey  (cost=0.00..4.20 rows=6 
width=0)
 Index Cond: (parent = 1)
   ->  Index Scan using node_pkey on node child  (cost=0.15..8.18 rows=1 
width=36)
 Index Cond: (node_id = link.child)
 Filter: (CURRENT_DATE <@ valid)
(13 rows)

The «Left Join»s are lost. And so is the row in the result:

=> select * from node as parent left join link on parent.node_id=link.parent 
left join node as child on link.child=child.node_id where parent.node_id=1 and 
current_date <@ parent.valid and current_date <@ link.valid and current_date <@ 
child.valid;
 node_id | valid | parent | child | valid | node_id | valid
-+---++---+---+-+---
(0 rows) 
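
For comparison, a rewrite that keeps the date predicates for the nullable sides 
inside the ON conditions (a sketch, assuming the intent is to still return a 
parent that has no currently valid links or children):

select * from node as parent
  left join link
    on parent.node_id = link.parent
   and current_date <@ link.valid
  left join node as child
    on link.child = child.node_id
   and current_date <@ child.valid
 where parent.node_id = 1
   and current_date <@ parent.valid;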




Re: GIN Index for low cardinality

2018-11-14 Thread Олег Самойлов
Eh, I checked, you are right. Something is terribly wrong with the hash index in 
PostgreSQL. But there is another hash index, gin(jsonb_path_ops), perhaps more 
correctly called a gin+hash index. It looks like the best choice for this purpose.
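
A minimal sketch of the kind of index meant here (hypothetical table and column 
names, assuming the low-cardinality value is stored in a jsonb column):

create table t (doc jsonb);
create index t_doc_gin on t using gin (doc jsonb_path_ops);

-- containment queries can use it
select count(*) from t where doc @> '{"status": "active"}';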

> On 26 Oct 2018, at 19:27, Jeff Janes wrote:
> 
> On Thu, Oct 25, 2018 at 9:36 AM Олег Самойлов wrote:
> 
>> On 17 Oct 2018, at 13:46, Ravi Krishna wrote:
>> 
>> In  
>> https://www.cybertec-postgresql.com/en/ideas-for-scaling-postgresql-to-multi-terabyte-and-beyond/
>>  
>> 
>> it is mentioned:
>> 
>> "GIN, the most know non-default index type perhaps, has been actually around 
>> for ages (full-text search) and in short is perfect for indexing columns 
>> where there are lot of repeating values – think all kinds of statuses or 
>> good old Mr/Mrs/Miss. GIN only stores every unique column value only once as 
>> for the default B-tree you’ll have e.g. 1 millon leaf nodes with the integer 
>> “1” in it."
>> 
>> 
>> Does it mean that GIN is a very good choice for low cardinality columns.  
> 
> Not necessary. There is other index which also don’t keep column value in an 
> every leaf. Hash, for instance. 
> 
> For smallish values (which low cardinality columns tend to be) the per-tuple 
> overhead and the pointer itself is probably much larger than the value, so 
> hash won't save you much if any space.  The GIN index removes not just the 
> value, but the per-tuple overhead.  And also compresses the point list to 
> further save space.  
> 
> Here is a real-world example from one of my databases where each value is 
> about 17 characters long, and is present about 20 times:
> 
> gin: 411 MB
> btree: 2167 MB 
> hash: 2159 MB
> 
> Cheers,
> 
> Jeff



Re: WTF with hash index?

2018-11-14 Thread Олег Самойлов
Ah, thanks. I am not a developer of PostgreSQL, I am a developer in PostgreSQL. 
:) And when I see two hash indexes on the same data and one of them is 43 times 
bigger than the other, it looks like something is terribly wrong. Just a free idea 
on how to considerably improve your product.

> On 13 Nov 2018, at 22:37, Alvaro Herrera wrote:
> 
> On 2018-Nov-13, Олег Самойлов wrote:
> 
>> Very much better. What about to copy paste algorithm from
>> gin(jsonb_path_ops) to the hash index?
> 
> You're welcome to submit patches.
> 
> -- 
> Álvaro Herrerahttps://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: WTF with hash index?

2018-11-13 Thread Олег Самойлов
I am just experimenting to find which type and which index are most suitable for 
an enumeration in PostgreSQL, and this effect looked very strange to me. There 
is another hash index in PostgreSQL: gin(jsonb_path_ops) for the jsonb type. It 
also uses a hash internally, but it is much better.
An example, based on the previous example:

create table jender (jdoc jsonb);

insert into jender (jdoc) select ('{"gender": "'||gender||'"}')::jsonb from 
gender;

create index jender_hash on jender using gin (jdoc jsonb_path_ops);

=> \d+
   List of relations
 Schema |  Name  | Type  | Owner | Size  | Description
++---+---+---+-
 public | gender | table | olleg | 35 MB |
 public | jender | table | olleg | 54 MB |
(2 rows)

=> \di+
   List of relations
 Schema | Name | Type  | Owner | Table  |  Size   | Description
+--+---+---++-+-
 public | gender_btree | index | olleg | gender | 21 MB   |
 public | gender_hash  | index | olleg | gender | 47 MB   |
 public | jender_hash  | index | olleg | jender | 1104 kB |
(3 rows)

Very much better. What about copying the algorithm from gin(jsonb_path_ops) 
into the hash index?

WTF with hash index?

2018-11-13 Thread Олег Самойлов
CentOS 7

$ rpm -q postgresql10
postgresql10-10.6-1PGDG.rhel7.x86_64

SQL script for psql:

\set table_size 100
begin;
create table gender (gender varchar);

insert into gender (gender) select case when random<0.50 then 'female' when 
random<0.99 then 'male' else 'other' end from (select random() as random, 
generate_series(1,:table_size)) as subselect;

create index gender_btree on gender using btree (gender);
create index gender_hash on gender using hash (gender);
commit;
vacuum full analyze;

Vacuum full is not necessary here, just a little voodoo programming. I expected 
that the hash index would be much smaller and quicker than the btree index, 
because it doesn’t keep the values inside itself, only hashes. But:

=> \d+
   List of relations
 Schema |  Name  | Type  | Owner | Size  | Description
++---+---+---+-
 public | gender | table | olleg | 35 MB |
(1 row)

=> \di+
  List of relations
 Schema | Name | Type  | Owner | Table  | Size  | Description
+--+---+---++---+-
 public | gender_btree | index | olleg | gender | 21 MB |
 public | gender_hash  | index | olleg | gender | 47 MB |
(2 rows)

The hash index is not only bigger than the btree index, it is also bigger than 
the table itself. What is wrong with the hash index?

Re: Is the centos repository for postgresql 10 is broken now?

2018-10-31 Thread Олег Самойлов
Looks like it is fixed now.




Is the centos repository for postgresql 10 is broken now?

2018-10-30 Thread Олег Самойлов
In this form 
https://www.postgresql.org/download/linux/redhat/ 

I set version 10, platform CentOS 7, arch x86_64, and got

yum install 
https://download.postgresql.org/pub/repos/yum/10/redhat/rhel-7-x86_64/pgdg-centos10-10-2.noarch.rpm

I did this. And on yum update I get:

[root@tuchanka1a yum.repos.d]# yum update
Загружены модули: fastestmirror
Loading mirror speeds from cached hostfile
Including mirror: mirror.yandex.ru
 * base: mirror.yandex.ru
Including mirror: mirror.yandex.ru
 * extras: mirror.yandex.ru
Including mirror: mirror.yandex.ru
 * updates: mirror.yandex.ru
base| 3.6 kB 00:00
extras  | 3.4 kB 00:00
pgdg10  | 4.1 kB 00:00
updates | 3.4 kB 00:00
zabbix  | 2.9 kB 00:00
zabbix-non-supported|  951 B 00:00
pgdg10/7/x86_64/primary_db FAILED
http://download.postgresql.org/pub/repos/yum/10/redhat/rhel-7-x86_64/repodata/d1a707282760737faba66b9507aca5d8616de42b883c7692ad09716ed9d5321c-primary.sqlite.bz2:
 [Errno 14] HTTP Error 404 - Not Found
Пробуем другое зеркало.
To address this issue please refer to the below wiki article

https://wiki.centos.org/yum-errors

If above article doesn't help to resolve this issue please use 
https://bugs.centos.org/.

pgdg10/7/x86_64/primary_db FAILED
http://download.postgresql.org/pub/repos/yum/10/redhat/rhel-7-x86_64/repodata/d1a707282760737faba66b9507aca5d8616de42b883c7692ad09716ed9d5321c-primary.sqlite.bz2:
 [Errno 14] HTTP Error 404 - Not Found
Пробуем другое зеркало.
http://download.postgresql.org/pub/repos/yum/10/redhat/rhel-7-x86_64/repodata/d1a707282760737faba66b9507aca5d8616de42b883c7692ad09716ed9d5321c-primary.sqlite.bz2:
 [Errno 14] HTTP Error 404 - Not Found
Пробуем другое зеркало.


 One of the configured repositories failed (PostgreSQL 10 7 - x86_64),
 and yum doesn't have enough cached data to continue. At this point the only
 safe thing yum can do is fail. There are a few ways to work "fix" this:

 1. Contact the upstream for the repository and get them to fix the problem.

 2. Reconfigure the baseurl/etc. for the repository, to point to a working
upstream. This is most often useful if you are using a newer
distribution release than is supported by the repository (and the
packages for the previous distribution release still work).

 3. Run the command with the repository temporarily disabled
yum --disablerepo=pgdg10 ...

 4. Disable the repository permanently, so yum won't use it by default. Yum
will then just ignore the repository until you permanently enable it
again or use --enablerepo for temporary usage:

yum-config-manager --disable pgdg10
or
subscription-manager repos --disable=pgdg10

 5. Configure the failing repository to be skipped, if it is unavailable.
Note that yum will try to contact the repo. when it runs most commands,
so will have to try and fail each time (and thus. yum will be be much
slower). If it is a very temporary problem though, this is often a nice
compromise:

yum-config-manager --save --setopt=pgdg10.skip_if_unavailable=true

failure: 
repodata/d1a707282760737faba66b9507aca5d8616de42b883c7692ad09716ed9d5321c-primary.sqlite.bz2
 from pgdg10: [Errno 256] No more mirrors to try.
http://download.postgresql.org/pub/repos/yum/10/redhat/rhel-7-x86_64/repodata/d1a707282760737faba66b9507aca5d8616de42b883c7692ad09716ed9d5321c-primary.sqlite.bz2:
 [Errno 14] HTTP Error 404 - Not Found
[root@tuchanka1a yum.repos.d]#

I checked: the file
http://download.postgresql.org/pub/repos/yum/10/redhat/rhel-7-x86_64/repodata/d1a707282760737faba66b9507aca5d8616de42b883c7692ad09716ed9d5321c-primary.sqlite.bz2
is indeed absent, but
d4593f2aa6de4dcd80ca81de9e59429d00da390b6638d0f34de77cc3035bcc5e-primary.sqlite.bz2
exists.

Re: Strange behavior of the random() function

2018-10-26 Thread Олег Самойлов


> On 26 Sep 2018, at 6:35, Tom Lane wrote:
> 
> r.zhar...@postgrespro.ru writes:
>> Can anybody explain me that strange behavior?
> 
> It's a squishiness in the SQL language, if you ask me.  Consider this
> simplified query:
> 
> select random() from generate_series(1, 3) order by random();
> 
> Would you expect the output of this query to appear ordered, or not?
> There's an argument to be made that the two instances of random() ought
> to be executed independently for each row, so that the output would
> (probably) seem unordered.  But practical usage generally expects that
> we unify the two textually-identical expressions, so that the behavior
> is the same as
> 
> select random() from generate_series(1, 3) order by 1;
> 
>>  select random() as "rnd", random() as "rnd1", random() as "rnd2"
>>  from generate_series( 1, 3 )
>>  order by random();
>> All values in any row are the same.
> 
> Here, we've unified *all* the appearances of the ORDER BY expression in
> the SELECT-list.  Maybe that's wrong, but it's hard to make a principled
> argument why we shouldn't do it.

Because random() is a volatile function, and only stable expressions can be 
«unified» in such a way. That a «volatile» function behaves like a «stable» one 
is obviously a bug.
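
A possible workaround, if independent values are wanted, is to compute random() 
in a subquery, so that the outer ORDER BY random() is a separate call (a sketch; 
OFFSET 0 keeps the planner from flattening the subquery):

select r as rnd
from (select random() as r from generate_series(1, 3) offset 0) as s
order by random();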


Re: Compile and build portable postgresql for mac

2018-10-25 Thread Олег Самойлов

> On 25 Oct 2018, at 3:21, Pratik Parikh wrote:
> 
> Hi all,
> 
>  How to compile and build portable postgresql for mac.  Can someone point 
> me how the builds are done for general distribution.  A script would be 
> appreciated if anyone has it.  
> 
> Regards,
> Pratik

You can try
https://postgresapp.com
I don’t have much experience with this app. It looked like it works, but psql 
showed bugs in the text terminal (iTerm and the official Terminal).

Also you can try postgres from brew. It will be compiled on your Mac in this case.

Re: BUG: Incorrect working with POSIX locale if database in UTF-8 encoding

2018-10-21 Thread Олег Самойлов
I think the correct behaviour would be to take the whole locale from 
postgresql.conf (like the backend processes do) or from the environment. It is 
debatable which place the locale should come from, but it should obviously come 
from only one. Do not take LC_CTYPE from one place (postgresql.conf) while taking 
LC_MESSAGES from another (the environment). The bug is here.

> On 18 Oct 2018, at 19:29, Tom Lane wrote:
> 
> Олег Самойлов  writes:
>> [ postmaster's localized messages are printed as garbage if LANG is C or 
>> unset ]
> 
> I'm not quite convinced that this is a bug.  The reason it's misbehaving
> is that in the postmaster process (and, probably, non-backend children)
> LC_MESSAGES gets set to whatever you said in postgresql.conf, but LC_CTYPE
> is never changed away from what it was in the postmaster's environment.
> So if the prevailing environment setting is C/POSIX, gettext() throws up
> its hands and substitutes "?" for non-ASCII characters, because it has
> no idea which encoding to render them in.
> 
> This is sort of intentional, in that the environment LC_CTYPE ought to
> reflect the "console encoding" that you're operating in; if you run your
> terminal in say KOI8R, then you set LC_CTYPE=ru_RU.koi8r and messages
> should get printed in the encoding the terminal is expecting.
> 
> We could maybe make a case for forcing gettext to use the encoding
> implied by LC_MESSAGES if LC_CTYPE is C/POSIX, but I'm not really
> convinced that there's anything principled about that.
> 
> On the other hand, the current behavior in this situation surely
> isn't useful to anybody.  Arguably, gettext() is being pretty
> unhelpful here, but I doubt we could get them to change.
> 
> Peter, any thoughts?
> 
>   regards, tom lane




Re: BUG: Incorrect working with POSIX locale if database in UTF-8 encoding

2018-10-17 Thread Олег Самойлов
I don’t agree. Pgsqlms just runs pg_ctl with the POSIX locale, and that is not a 
bug. But unreadable messages when the locale charset doesn’t match the database 
charset is a bug of pg_ctl. When messages come from within a connection everything 
is fine, so why not do the same for the messages on start/stop?

> On 17 Oct 2018, at 15:13, Adrian Klaver wrote:
> 
> On 10/17/18 2:29 AM, Олег Самойлов wrote:
>> I think this is a bug, because database encoding logging must work even in 
>> this case. The main problem is with pacemaker module pgsqlms, which launch 
>> pg_ctl in empty environment and thus with broken logging.
> 
> I suggest filing an issue here:
> 
> https://github.com/ClusterLabs/PAF/issues
> 
> 
> 
> 
> -- 
> Adrian Klaver
> adrian.kla...@aklaver.com




BUG: Incorrect working with POSIX locale if database in UTF-8 encoding

2018-10-17 Thread Олег Самойлов
I think this is a bug, because logging in the database encoding must work even in 
this case. The main problem is with the pacemaker module pgsqlms, which launches 
pg_ctl in an empty environment and thus with broken logging.
Let me explain with examples.
Empty databases:

-bash-4.2$ psql -p 5433 -l
  List of databases
   Name|  Owner   | Encoding |   Collate   |Ctype|   Access 
privileges
---+--+--+-+-+---
 postgres  | postgres | UTF8 | ru_RU.UTF-8 | ru_RU.UTF-8 |
 template0 | postgres | UTF8 | ru_RU.UTF-8 | ru_RU.UTF-8 | =c/postgres  
+
   |  |  | | | 
postgres=CTc/postgres
 template1 | postgres | UTF8 | ru_RU.UTF-8 | ru_RU.UTF-8 | =c/postgres  
+
   |  |  | | | 
postgres=CTc/postgres
(3 rows)

In an empty environment:

-bash-4.2$ locale
LANG=
LC_CTYPE="POSIX"
LC_NUMERIC="POSIX"
LC_TIME="POSIX"
LC_COLLATE="POSIX"
LC_MONETARY="POSIX"
LC_MESSAGES="POSIX"
LC_PAPER="POSIX"
LC_NAME="POSIX"
LC_ADDRESS="POSIX"
LC_TELEPHONE="POSIX"
LC_MEASUREMENT="POSIX"
LC_IDENTIFICATION="POSIX"
LC_ALL=

-bash-4.2$ /usr/pgsql-10/bin/pg_ctl -D /var/lib/pgsql/krogan0a start
waiting for server to start2018-10-17 12:04:20.762 MSK [788] ?:  
??? ?? ??? ?? ?? IPv4 "0.0.0.0" ??  5433
2018-10-17 12:04:20.763 MSK [788] ?:  ??? ?? ??? ?? ?? 
IPv6 "::" ??  5433
2018-10-17 12:04:20.770 MSK [788] ?:  ??? ?? ??? ?? 
Unix-? "/var/run/postgresql/.s.PGSQL.5433"
2018-10-17 12:04:20.784 MSK [788] ?:  ??? ?? ??? ?? 
Unix-? "/tmp/.s.PGSQL.5433"
2018-10-17 12:04:20.811 MSK [788] ?:   ?? ?  
 ? ??
2018-10-17 12:04:20.811 MSK [788] ?:  ? ?? ? ? 
?? ? ??? "log".
 done

And the log file is also unreadable:

2018-10-17 11:56:57.328 MSK [579] ?:  ??? ??  ?: 
2018-10-17 11:55:41 MSK
2018-10-17 11:56:57.339 MSK [577] ?:  ??? ?? ?? ? 
???
2018-10-17 12:04:14.754 MSK [577] ?:  ??? ?? ?? ??? 
??
2018-10-17 12:04:14.808 MSK [577] ?:  ??   
??
2018-10-17 12:04:14.817 MSK [577] ?:  ??? ???: logical 
replication launcher (PID 586) ?? ? ? ?? 1
2018-10-17 12:04:14.817 MSK [580] ?:  ??
2018-10-17 12:04:16.930 MSK [577] ?:  ??? ?? ?
2018-10-17 12:04:20.821 MSK [790] ?:  ??? ??  ?: 
2018-10-17 12:04:15 MSK
2018-10-17 12:04:20.858 MSK [788] ?:  ??? ?? ?? ? 
???

But within a connection to the database everything is fine:

-bash-4.2$ psql -p 5433
psql (10.5)
Type "help" for help.

postgres=# selectt;
ОШИБКА:  ошибка синтаксиса (примерное положение: "selectt")
LINE 1: selectt;
^

And in the log file:

2018-10-17 12:05:08.048 MSK [801] ОШИБКА:  ошибка синтаксиса (примерное 
положение: "selectt") (символ 1)
2018-10-17 12:05:08.048 MSK [801] ОПЕРАТОР:  selectt;

And on stop:

-bash-4.2$ /usr/pgsql-10/bin/pg_ctl -D /var/lib/pgsql/krogan0a stop
waiting for server to shut down. done
server stopped

But the log file:

2018-10-17 12:09:10.376 MSK [788] ?:  ??? ?? ?? ??? 
??
2018-10-17 12:09:10.384 MSK [788] ?:  ??   
??
2018-10-17 12:09:10.390 MSK [788] ?:  ??? ???: logical 
replication launcher (PID 797) ?? ? ? ?? 1
2018-10-17 12:09:10.390 MSK [791] ?:  ??
2018-10-17 12:09:12.365 MSK [788] ?:  ??? ?? ?

The workaround is to set a UTF-8 locale or LANGUAGE:

-bash-4.2$ LANG=ru_RU.UTF-8 /usr/pgsql-10/bin/pg_ctl -D /var/lib/pgsql/krogan0a 
start
ожидание запуска сервера2018-10-17 12:11:05.508 MSK [951] СООБЩЕНИЕ:  для 
приёма подключений по адресу IPv4 "0.0.0.0" открыт порт 5433
2018-10-17 12:11:05.508 MSK [951] СООБЩЕНИЕ:  для приёма подключений по адресу 
IPv6 "::" открыт порт 5433
2018-10-17 12:11:05.515 MSK [951] СООБЩЕНИЕ:  для приёма подключений открыт 
Unix-сокет "/var/run/postgresql/.s.PGSQL.5433"
2018-10-17 12:11:05.529 MSK [951] СООБЩЕНИЕ:  для приёма подключений открыт 
Unix-сокет "/tmp/.s.PGSQL.5433"
2018-10-17 12:11:05.553 MSK [951] СООБЩЕНИЕ:  передача вывода в протокол 
процессу сбора протоколов
2018-10-17 12:11:05.553 MSK [951] ПОДСКАЗКА:  В дальнейшем протоколы будут 
выводиться в каталог "log".
 готово
сервер запущен

log file:

2018-10-17 12:11:05.562 MSK [953] СООБЩЕНИЕ:  система БД была выключена: 
2018-10-17 12:09:10 MSK
2018-10-17 12:11:05.576 MSK [951] СООБЩЕНИЕ:  система БД готова принимать 
подключения

Or:

-bash-4.2$ LANG=en_US.UTF-8 /usr/pgsql-10/bin/pg_ctl -D /var/lib/pgsql/krogan0a 
start
waiting for server to start2018-10-17 12:12:35.182 MSK [1030] СООБЩЕНИЕ:  
для приёма 

Incorrect description of the WITH CHECK in the row security can lead to the security issue

2018-07-19 Thread Олег Самойлов
Hi all. This is a documentation issue; I already sent it to pgsql-docs, but there 
was no reply.

https://www.postgresql.org/message-id/152637961531.27212.188002690528452...@wrigleys.postgresql.org

I'll try here. 

PostgreSQL 10 (in 11 it is the same):
https://www.postgresql.org/docs/10/static/ddl-rowsecurity.html

cite
To use a different policy for rows that are being added to the table
compared to those rows that are visible, the WITH CHECK clause can be used.
This policy would allow all users to view all rows in the users table, but
only modify their own:

CREATE POLICY user_policy ON users
USING (true)
WITH CHECK (user_name = current_user);
end cite

This is a wrong description. Everyone can steal another user's row with such a
policy. Let's demonstrate.

You are now connected to database "olleg" as user "olleg".
olleg(at)[local]:9700/olleg
=> create table users (user_name text primary key, description text);
CREATE TABLE
olleg(at)[local]:9700/olleg
=> ALTER TABLE users ENABLE ROW LEVEL SECURITY;
ALTER TABLE
olleg(at)[local]:9700/olleg
=> grant all on users to public;
GRANT
=> CREATE POLICY user_policy ON users
-> USING (true)
-> WITH CHECK (user_name = current_user);
CREATE POLICY
olleg(at)[local]:9700/olleg
=> insert into users (user_name) values ('olleg');
INSERT 0 1
olleg(at)[local]:9700/olleg
=> set role postgres;
SET
olleg(at)[local]:9700/olleg
=# create user test with password 'test' login;
CREATE ROLE
=# \c olleg test localhost 9700
SSL connection (protocol: TLSv1.2, cipher: ECDHE-RSA-AES256-GCM-SHA384,
bits: 256, compression: off)
You are now connected to database "olleg" as user "test" on host "localhost"
at port "9700".
test(at)localhost:9700/olleg
=> select * from users;
 user_name | description 
---+-
 olleg | 
(1 row)

test(at)localhost:9700/olleg
=> update users set user_name='test', description='a rude hack';
UPDATE 1
test(at)localhost:9700/olleg
=> select * from users;
 user_name | description 
---+-
 test  | a rude hack
(1 row)

The right statement, which does not allow one user to modify the rows of another, 
would be:

CREATE POLICY user_policy ON users
USING (user_name = current_user)
WITH CHECK (user_name = current_user);