Re: [rt-users] RT 4.4.1 and transaction isolation level on Postgres

2017-01-04 Thread Václav Ovsík
On Wed, Jan 04, 2017 at 04:50:11AM -0800, Alex Vandiver wrote:
>... 
> MySQL suffers from the exact same problem -- but, as it happens,
> both more silently and more catastrophically.  See
> https://github.com/bestpractical/rt/commit/e36364c5

Eh. I'm glad I did transition from Mysql to Postgres years ago :).

> > I can change isolation level in postgresql.conf to 'repeatable read'
> > and things are different.
> 
> I advise against doing that.  Upon inspection, RT is not prepared to
> deal with the "could not serialize access due to concurrent update"
> errors that arise from updates to rows in multiple transactions in
> Postgres' repeatable-read isolation.

OK, thanks!

> Repeatable-read is only possible in MySQL because it has a fascinating
> definition of "repeatable":
> ...
> > Should I change the default isolation level on Postgres for RT to
> > 'repeatable read'?
> 
> No.  You should try the 4.4/previewscrips-race branch, which I've just
> pushed:
> 
> https://github.com/bestpractical/rt/compare/4.4-trunk...4.4/previewscrips-race
> 
> The gory details are contained in the commits therein.

You have my respect! Wonderful job. I tried your changes on my test
instance first. The problem is solved I think.
I installed changes to production instance too.
Thanks for your time and effort!
-- 
Zito


Re: [rt-users] RT 4.4.1 and transaction isolation level on Postgres

2017-01-04 Thread Alex Vandiver
On Tue, 3 Jan 2017 17:06:47 +0100
Václav Ovsík  wrote:
> How about the Mysql don't have this problem - is this caused by
> the different default transaction isolation level or not?

MySQL suffers from the exact same problem -- but, as it happens,
both more silently and more catastrophically.  See
https://github.com/bestpractical/rt/commit/e36364c5

> I can change isolation level in postgresql.conf to 'repeatable read'
> and things are different.

I advise against doing that.  Upon inspection, RT is not prepared to
deal with the "could not serialize access due to concurrent update"
errors that arise from updates to rows in multiple transactions in
Postgres' repeatable-read isolation.

Repeatable-read is only possible in MySQL because it has a fascinating
definition of "repeatable":

- Process 1 
mysql> set transaction isolation level repeatable read;
Query OK, 0 rows affected (0.00 sec)

mysql> start transaction with consistent snapshot;
Query OK, 0 rows affected (0.00 sec)

mysql> select id, Subject from Tickets where id = 1;
++-+
| id | Subject |
++-+
|  1 | foo |
++-+
1 row in set (0.00 sec)

- Process 2 

mysql> set transaction isolation level repeatable read;
Query OK, 0 rows affected (0.00 sec)

mysql> start transaction with consistent snapshot;
Query OK, 0 rows affected (0.00 sec)

mysql> update Tickets set Subject = 'bar' where id = 1;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> commit;
Query OK, 0 rows affected (0.00 sec)

- Process 1 

mysql> select id, Subject from Tickets where id = 1;
++-+
| id | Subject |
++-+
|  1 | foo |
++-+
1 row in set (0.00 sec)

mysql> select id, Subject from Tickets where id = 1 FOR UPDATE;
++-+
| id | Subject |
++-+
|  1 | bar |
++-+
1 row in set (0.00 sec)



Contrast this with PostgreSQL, whose definition of repeatable read
acknowledges that fully consistent updates are not possible in all
cases:

- Process 1 
rt4=# start transaction;
START TRANSACTION
rt4=# set transaction isolation level repeatable read;
SET
rt4=# select id, Subject from Tickets where id = 1;
 id | subject 
+-
  1 | foo
(1 row)

- Process 2 
rt4=# start transaction;
START TRANSACTION
rt4=# set transaction isolation level repeatable read;
SET
rt4=# update Tickets set Subject = 'bar' where id = 1;
UPDATE 1
rt4=# commit;
COMMIT

- Process 1 
rt4=# select id, Subject from Tickets where id = 1;
 id | subject 
+-
  1 | foo
(1 row)

rt4=# select id, Subject from Tickets where id = 1 FOR UPDATE;
ERROR:  could not serialize access due to concurrent update



 ( Yes, MySQL requires SET TRANSACTION ISOLATION _outside_ the
   transaction, and PostgreSQL requires it to be _inside_.  See 
   https://dev.mysql.com/doc/refman/5.7/en/set-transaction.html
   https://www.postgresql.org/docs/9.1/static/sql-set-transaction.html )


> Should I change the default isolation level on Postgres for RT to
> 'repeatable read'?

No.  You should try the 4.4/previewscrips-race branch, which I've just
pushed:

https://github.com/bestpractical/rt/compare/4.4-trunk...4.4/previewscrips-race

The gory details are contained in the commits therein.
 - Alex


Re: [rt-users] RT 4.4.1 and transaction isolation level on Postgres

2017-01-03 Thread Václav Ovsík
On Mon, Jan 02, 2017 at 08:10:15PM -0800, Alex Vandiver wrote:
> On Mon, 2 Jan 2017 17:12:29 +0100
> Václav Ovsík  wrote:
> > Can anybody confirm on different system?
> 
> Thanks for the detailed replication instructions.  I can replicate, and
> have tracked down a minimal replication case.  I'll drop my findings
> and suggestion on your ticket.
> ...

Great, thank you very much for your time and effort!

How about the Mysql don't have this problem - is this caused by
the different default transaction isolation level or not?

I can change isolation level in postgresql.conf to 'repeatable read'
and things are different.

Should I change the default isolation level on Postgres for RT to
'repeatable read'?

-- 
Zito


Re: [rt-users] RT 4.4.1 and transaction isolation level on Postgres

2017-01-02 Thread Alex Vandiver
On Mon, 2 Jan 2017 17:12:29 +0100
Václav Ovsík  wrote:
> Can anybody confirm on different system?

Thanks for the detailed replication instructions.  I can replicate, and
have tracked down a minimal replication case.  I'll drop my findings
and suggestion on your ticket.

The short form is that this is due to 4.4's new PreviewScrips
functionality, which simulates (then rolls back) all of the changes,
which is racing with the actual change.  The bad news is that it's the
real change, not the dry run, which gets killed in the deadlock
detector, meaning that the owner does go unchanged.  At least this is
bubbled up to the user in the ticket display page, but this is still
quite unfortunate.

Amusingly, there are some strong parallels to the canonical Therac-25
case[1] -- the race here requires that one trigger a PreviewScrips very
quickly before submitting, which is rare except with users quite
accustomed to the UI.  In the Therac-25 case, only skilled users
could navigate to the bottom of the form within 8 seconds and thus
deliver lethal doses of radiation to their patients.

...RT is designed to be robust, but there's a reason it doesn't rate
itself as meant to be used for safety- or life-critical applications.

 - Alex


[1] https://en.wikipedia.org/wiki/Therac-25


[rt-users] RT 4.4.1 and transaction isolation level on Postgres

2017-01-02 Thread Václav Ovsík
Hi,
to reproduce the deadlock problem with RT 4.4.1 on Postgres I tried to install
RT 4.4.1 as clean new instalation:
1. On Debian 8.6 stable + Postgres 9.4
2. On Debian sid (unstable) + Postgres 9.6 (there was a need to
   change one sql command, option NOCREATEUSER -> NOCREATEROLE)

To my surprise I created ticket in queue General (default) and the bug
appeared immediately when I tried to change owner of a ticket and record
comment together.

Can anybody confirm on different system?

Steps:
1. Create Postgres cluster (server instance):

pg_createcluster 9.6 main --locale cs_CZ.UTF-8

2. Change access to rt4 database, so I don't need to set password for rt_user
   in /etc/postgresql/9.6/main/pg_hba.conf

diff --git a/postgresql/9.6/main/pg_hba.conf 
b/postgresql/9.6/main/pg_hba.conf
index b708a885..85857f64 100644
--- a/postgresql/9.6/main/pg_hba.conf
+++ b/postgresql/9.6/main/pg_hba.conf
@@ -86,6 +86,8 @@ local   all postgres  
  peer
 
 # TYPE  DATABASEUSERADDRESS METHOD
 
+local  rt4 rt_user trust
+
 # "local" is for Unix domain socket connections only
 local   all all peer
 # IPv4 local connections:

3. Start Postgres & check

pg_ctlcluster 9.6 main start
pg_lsclusters 

4. Install RT 4.4.1 from RT sources...

cd rt-4.4.1-sources
./configure --prefix=/opt/rt  --with-db-type=Pg
sudo make install

   Change db host, so the connection will go through unix domain socket

cd /opt/rt
cat <<'EOT' >etc/RT_SiteConfig.pm
Set($WebPort, );
Set($DatabaseHost, undef);
Set($DatabaseRTHost, undef);
1;
EOT

   Postgres user needs to read schame etc

chmod a+r etc/{acl,initialdata,schema}*

   Postgres 9.6 no longer support NOCREATEUSER, but NOTCREATEROLE
   (https://www.postgresql.org/docs/9.6/static/release-9-6.html)

sed -i -e 's/NOCREATEUSER/NOCREATEROLE/;' etc/acl.Pg

   Initialize database (if something went wrong, remove not finished database
   using 'sudo -u postgres dropdb rt4' and try better again)

sudo -u postgres /usr/bin/perl -I/opt/rt/local/lib -I/opt/rt/lib 
sbin/rt-setup-database --action init 

5. Start RT built int HTTP server by hand:

sudo -u www-data /opt/rt/sbin/rt-server --port 


6. Go to browser and type location localhost:

- login as root / password
- click on "New ticket in" General, enter something into subject and message
  and click "Create"
  (Owner is nobody, requestor is root)
- Go Actions / Comment, type something into message and change owner
  to root. Click "Update ticket"
- If owner was changed, try repeat previous step switching owner
  between nobody & root, until error.

My session:

bobek:/opt/rt# sudo -u www-data /opt/rt/sbin/rt-server --port 
[21132] [Mon Jan  2 15:58:50 2017] [info]: Successful login for root from 
127.0.0.1 (/opt/rt/sbin/../lib/RT/Interface/Web.pm:831)
[21133] [Mon Jan  2 16:00:21 2017] [info]: 
 #1/31 - Scrip 7 On Create 
Autoreply To Requestors (/opt/rt/sbin/../lib/RT/Action/SendEmail.pm:285)
[21133] [Mon Jan  2 16:00:21 2017] [info]: 
 sent  To: root@localhost 
(/opt/rt/sbin/../lib/RT/Action/SendEmail.pm:316)
[21133] [Mon Jan  2 16:00:21 2017] [info]: 
 #1/31 - Scrip 8 On Create 
Notify Owner and AdminCcs (/opt/rt/sbin/../lib/RT/Action/SendEmail.pm:285)
[21133] [Mon Jan  2 16:00:21 2017] [info]: 
 No recipients found. Not 
sending. (/opt/rt/sbin/../lib/RT/Interface/Email.pm:806)
[21133] [Mon Jan  2 16:00:21 2017] [info]: 
 #1/31 - Scrip 9 On Create 
Notify Ccs (/opt/rt/sbin/../lib/RT/Action/SendEmail.pm:285)
[21133] [Mon Jan  2 16:00:21 2017] [info]: 
 No recipients found. Not 
sending. (/opt/rt/sbin/../lib/RT/Interface/Email.pm:806)
[21133] [Mon Jan  2 16:00:21 2017] [info]: 
 #1/31 - Scrip 10 On Create 
Notify Other Recipients (/opt/rt/sbin/../lib/RT/Action/SendEmail.pm:285)
[21133] [Mon Jan  2 16:00:21 2017] [info]: 
 No recipients found. Not 
sending. (/opt/rt/sbin/../lib/RT/Interface/Email.pm:806)
[21133] [Mon Jan  2 16:00:21 2017] [info]: Ticket 1 created in queue 'General' 
by root (/opt/rt/sbin/../lib/RT/Ticket.pm:567)
[21132] [Mon Jan  2 16:02:37 2017] [info]: 
 not sending to 
root@localhost, creator of the transaction, due to NotifyActor setting 
(/opt/rt/sbin/../lib/RT/Action/SendEmail.pm:901)
[21133] [Mon Jan  2 16:02:48 2017] [info]: