Re: [rt-users] RT 4.4.1 and transaction isolation level on Postgres
On Wed, Jan 04, 2017 at 04:50:11AM -0800, Alex Vandiver wrote: >... > MySQL suffers from the exact same problem -- but, as it happens, > both more silently and more catastrophically. See > https://github.com/bestpractical/rt/commit/e36364c5 Eh. I'm glad I did transition from Mysql to Postgres years ago :). > > I can change isolation level in postgresql.conf to 'repeatable read' > > and things are different. > > I advise against doing that. Upon inspection, RT is not prepared to > deal with the "could not serialize access due to concurrent update" > errors that arise from updates to rows in multiple transactions in > Postgres' repeatable-read isolation. OK, thanks! > Repeatable-read is only possible in MySQL because it has a fascinating > definition of "repeatable": > ... > > Should I change the default isolation level on Postgres for RT to > > 'repeatable read'? > > No. You should try the 4.4/previewscrips-race branch, which I've just > pushed: > > https://github.com/bestpractical/rt/compare/4.4-trunk...4.4/previewscrips-race > > The gory details are contained in the commits therein. You have my respect! Wonderful job. I tried your changes on my test instance first. The problem is solved I think. I installed changes to production instance too. Thanks for your time and effort! -- Zito
Re: [rt-users] RT 4.4.1 and transaction isolation level on Postgres
On Tue, 3 Jan 2017 17:06:47 +0100 Václav Ovsík wrote: > How about the Mysql don't have this problem - is this caused by > the different default transaction isolation level or not? MySQL suffers from the exact same problem -- but, as it happens, both more silently and more catastrophically. See https://github.com/bestpractical/rt/commit/e36364c5 > I can change isolation level in postgresql.conf to 'repeatable read' > and things are different. I advise against doing that. Upon inspection, RT is not prepared to deal with the "could not serialize access due to concurrent update" errors that arise from updates to rows in multiple transactions in Postgres' repeatable-read isolation. Repeatable-read is only possible in MySQL because it has a fascinating definition of "repeatable": - Process 1 mysql> set transaction isolation level repeatable read; Query OK, 0 rows affected (0.00 sec) mysql> start transaction with consistent snapshot; Query OK, 0 rows affected (0.00 sec) mysql> select id, Subject from Tickets where id = 1; ++-+ | id | Subject | ++-+ | 1 | foo | ++-+ 1 row in set (0.00 sec) - Process 2 mysql> set transaction isolation level repeatable read; Query OK, 0 rows affected (0.00 sec) mysql> start transaction with consistent snapshot; Query OK, 0 rows affected (0.00 sec) mysql> update Tickets set Subject = 'bar' where id = 1; Query OK, 1 row affected (0.00 sec) Rows matched: 1 Changed: 1 Warnings: 0 mysql> commit; Query OK, 0 rows affected (0.00 sec) - Process 1 mysql> select id, Subject from Tickets where id = 1; ++-+ | id | Subject | ++-+ | 1 | foo | ++-+ 1 row in set (0.00 sec) mysql> select id, Subject from Tickets where id = 1 FOR UPDATE; ++-+ | id | Subject | ++-+ | 1 | bar | ++-+ 1 row in set (0.00 sec) Contrast this with PostgreSQL, whose definition of repeatable read acknowledges that fully consistent updates are not possible in all cases: - Process 1 rt4=# start transaction; START TRANSACTION rt4=# set transaction isolation level repeatable read; SET rt4=# select id, Subject from Tickets where id = 1; id | subject +- 1 | foo (1 row) - Process 2 rt4=# start transaction; START TRANSACTION rt4=# set transaction isolation level repeatable read; SET rt4=# update Tickets set Subject = 'bar' where id = 1; UPDATE 1 rt4=# commit; COMMIT - Process 1 rt4=# select id, Subject from Tickets where id = 1; id | subject +- 1 | foo (1 row) rt4=# select id, Subject from Tickets where id = 1 FOR UPDATE; ERROR: could not serialize access due to concurrent update ( Yes, MySQL requires SET TRANSACTION ISOLATION _outside_ the transaction, and PostgreSQL requires it to be _inside_. See https://dev.mysql.com/doc/refman/5.7/en/set-transaction.html https://www.postgresql.org/docs/9.1/static/sql-set-transaction.html ) > Should I change the default isolation level on Postgres for RT to > 'repeatable read'? No. You should try the 4.4/previewscrips-race branch, which I've just pushed: https://github.com/bestpractical/rt/compare/4.4-trunk...4.4/previewscrips-race The gory details are contained in the commits therein. - Alex
Re: [rt-users] RT 4.4.1 and transaction isolation level on Postgres
On Mon, Jan 02, 2017 at 08:10:15PM -0800, Alex Vandiver wrote: > On Mon, 2 Jan 2017 17:12:29 +0100 > Václav Ovsík wrote: > > Can anybody confirm on different system? > > Thanks for the detailed replication instructions. I can replicate, and > have tracked down a minimal replication case. I'll drop my findings > and suggestion on your ticket. > ... Great, thank you very much for your time and effort! How about the Mysql don't have this problem - is this caused by the different default transaction isolation level or not? I can change isolation level in postgresql.conf to 'repeatable read' and things are different. Should I change the default isolation level on Postgres for RT to 'repeatable read'? -- Zito
Re: [rt-users] RT 4.4.1 and transaction isolation level on Postgres
On Mon, 2 Jan 2017 17:12:29 +0100 Václav Ovsík wrote: > Can anybody confirm on different system? Thanks for the detailed replication instructions. I can replicate, and have tracked down a minimal replication case. I'll drop my findings and suggestion on your ticket. The short form is that this is due to 4.4's new PreviewScrips functionality, which simulates (then rolls back) all of the changes, which is racing with the actual change. The bad news is that it's the real change, not the dry run, which gets killed in the deadlock detector, meaning that the owner does go unchanged. At least this is bubbled up to the user in the ticket display page, but this is still quite unfortunate. Amusingly, there are some strong parallels to the canonical Therac-25 case[1] -- the race here requires that one trigger a PreviewScrips very quickly before submitting, which is rare except with users quite accustomed to the UI. In the Therac-25 case, only skilled users could navigate to the bottom of the form within 8 seconds and thus deliver lethal doses of radiation to their patients. ...RT is designed to be robust, but there's a reason it doesn't rate itself as meant to be used for safety- or life-critical applications. - Alex [1] https://en.wikipedia.org/wiki/Therac-25
[rt-users] RT 4.4.1 and transaction isolation level on Postgres
Hi, to reproduce the deadlock problem with RT 4.4.1 on Postgres I tried to install RT 4.4.1 as clean new instalation: 1. On Debian 8.6 stable + Postgres 9.4 2. On Debian sid (unstable) + Postgres 9.6 (there was a need to change one sql command, option NOCREATEUSER -> NOCREATEROLE) To my surprise I created ticket in queue General (default) and the bug appeared immediately when I tried to change owner of a ticket and record comment together. Can anybody confirm on different system? Steps: 1. Create Postgres cluster (server instance): pg_createcluster 9.6 main --locale cs_CZ.UTF-8 2. Change access to rt4 database, so I don't need to set password for rt_user in /etc/postgresql/9.6/main/pg_hba.conf diff --git a/postgresql/9.6/main/pg_hba.conf b/postgresql/9.6/main/pg_hba.conf index b708a885..85857f64 100644 --- a/postgresql/9.6/main/pg_hba.conf +++ b/postgresql/9.6/main/pg_hba.conf @@ -86,6 +86,8 @@ local all postgres peer # TYPE DATABASEUSERADDRESS METHOD +local rt4 rt_user trust + # "local" is for Unix domain socket connections only local all all peer # IPv4 local connections: 3. Start Postgres & check pg_ctlcluster 9.6 main start pg_lsclusters 4. Install RT 4.4.1 from RT sources... cd rt-4.4.1-sources ./configure --prefix=/opt/rt --with-db-type=Pg sudo make install Change db host, so the connection will go through unix domain socket cd /opt/rt cat <<'EOT' >etc/RT_SiteConfig.pm Set($WebPort, ); Set($DatabaseHost, undef); Set($DatabaseRTHost, undef); 1; EOT Postgres user needs to read schame etc chmod a+r etc/{acl,initialdata,schema}* Postgres 9.6 no longer support NOCREATEUSER, but NOTCREATEROLE (https://www.postgresql.org/docs/9.6/static/release-9-6.html) sed -i -e 's/NOCREATEUSER/NOCREATEROLE/;' etc/acl.Pg Initialize database (if something went wrong, remove not finished database using 'sudo -u postgres dropdb rt4' and try better again) sudo -u postgres /usr/bin/perl -I/opt/rt/local/lib -I/opt/rt/lib sbin/rt-setup-database --action init 5. Start RT built int HTTP server by hand: sudo -u www-data /opt/rt/sbin/rt-server --port 6. Go to browser and type location localhost: - login as root / password - click on "New ticket in" General, enter something into subject and message and click "Create" (Owner is nobody, requestor is root) - Go Actions / Comment, type something into message and change owner to root. Click "Update ticket" - If owner was changed, try repeat previous step switching owner between nobody & root, until error. My session: bobek:/opt/rt# sudo -u www-data /opt/rt/sbin/rt-server --port [21132] [Mon Jan 2 15:58:50 2017] [info]: Successful login for root from 127.0.0.1 (/opt/rt/sbin/../lib/RT/Interface/Web.pm:831) [21133] [Mon Jan 2 16:00:21 2017] [info]: #1/31 - Scrip 7 On Create Autoreply To Requestors (/opt/rt/sbin/../lib/RT/Action/SendEmail.pm:285) [21133] [Mon Jan 2 16:00:21 2017] [info]: sent To: root@localhost (/opt/rt/sbin/../lib/RT/Action/SendEmail.pm:316) [21133] [Mon Jan 2 16:00:21 2017] [info]: #1/31 - Scrip 8 On Create Notify Owner and AdminCcs (/opt/rt/sbin/../lib/RT/Action/SendEmail.pm:285) [21133] [Mon Jan 2 16:00:21 2017] [info]: No recipients found. Not sending. (/opt/rt/sbin/../lib/RT/Interface/Email.pm:806) [21133] [Mon Jan 2 16:00:21 2017] [info]: #1/31 - Scrip 9 On Create Notify Ccs (/opt/rt/sbin/../lib/RT/Action/SendEmail.pm:285) [21133] [Mon Jan 2 16:00:21 2017] [info]: No recipients found. Not sending. (/opt/rt/sbin/../lib/RT/Interface/Email.pm:806) [21133] [Mon Jan 2 16:00:21 2017] [info]: #1/31 - Scrip 10 On Create Notify Other Recipients (/opt/rt/sbin/../lib/RT/Action/SendEmail.pm:285) [21133] [Mon Jan 2 16:00:21 2017] [info]: No recipients found. Not sending. (/opt/rt/sbin/../lib/RT/Interface/Email.pm:806) [21133] [Mon Jan 2 16:00:21 2017] [info]: Ticket 1 created in queue 'General' by root (/opt/rt/sbin/../lib/RT/Ticket.pm:567) [21132] [Mon Jan 2 16:02:37 2017] [info]: not sending to root@localhost, creator of the transaction, due to NotifyActor setting (/opt/rt/sbin/../lib/RT/Action/SendEmail.pm:901) [21133] [Mon Jan 2 16:02:48 2017] [info]: not sending to root@localhost, creator of the transaction, due to NotifyActor setting (/opt/rt/sbin/../lib/RT/Action/SendEmail.pm:901) [21133] [Mon Jan 2 16:02:48 2017] [info]: #1/39 - Scrip 11 On Owner Change Notify Owner (/opt/rt/sbin/../lib/RT/Action/SendEmail.pm:285) [21133] [Mon Jan 2 16:02:48 2017] [info]: No recipients found. Not sending. (/opt/rt/sbin/../lib/RT/Interface/Email.pm:806) [21133] [Mon Jan 2 16:03:19 2017] [info]: #1/45 - Scrip 1 On Comment Notify