Re: Mark a transaction uncommittable

2023-07-03 Thread Daniel Gustafsson
> On 5 Jun 2023, at 09:22, Gurjeet Singh  wrote:

> Please see attached the patch that introduces this new feature.

This patch fails the guc_check test due to something missing in the sample
configuration.  Please send a rebased version with this test fixed.

[08:31:26.680] --- stdout ---
[08:31:26.680] # executing test in 
/tmp/cirrus-ci-build/build/testrun/test_misc/003_check_guc group test_misc test 
003_check_guc
[08:31:26.680] not ok 1 - no parameters missing from postgresql.conf.sample
[08:31:26.680] ok 2 - no parameters missing from guc_tables.c
[08:31:26.680] ok 3 - no parameters marked as NOT_IN_SAMPLE in 
postgresql.conf.sample
[08:31:26.680] 1..3
[08:31:26.680] # test failed

--
Daniel Gustafsson





Re: Mark a transaction uncommittable

2023-06-06 Thread Cary Huang
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   not tested
Documentation:not tested

Hello

It is one of those features that a handful of people would find useful in 
specific user cases. I think it is a nice to have feature that safeguards your 
production database against unwanted commits when troubleshooting production 
problems. Your patch applies fine on master and I am able to run a couple tests 
on it and it seems to do as described. I noticed that the patch has a 
per-session variable "default_transaction_committable" that could make all 
transaction committable or uncommittable on the session even without specifying 
"begin transaction not committable;" I am wondering if we should have a 
configurable default at all as I think it should always defaults to true and 
unchangable. If an user wants a uncommittable transaction, he/she will need to 
explicitly specify that during "begin". Having another option to change default 
behavior for all transactions may be a little unsafe, it is possible someone 
could purposely change this default to false on a production session that needs 
transactions to absolutely commit, causing damages there. 

thank you
Cary Huang
--
Highgo Software Canada
www.highgo.ca

Re: Mark a transaction uncommittable

2023-06-05 Thread Gurjeet Singh
On Sat, Apr 22, 2023 at 4:33 PM Julien Rouhaud  wrote:
>
> Hi,
>
> On Sat, Apr 22, 2023 at 12:53:23PM -0400, Isaac Morland wrote:
> >
> > I have an application for this: creating various dev/test versions of data
> > from production.
> >
> > Start by restoring a copy of production from backup. Then successively
> > create several altered versions of the data and save them to a place where
> > developers can pick them up. For example, you might have one version which
> > has all data old than 1 year deleted, and another where 99% of the
> > students/customers/whatever are deleted. Anonymization could also be
> > applied. This would give you realistic (because it ultimately originates
> > from production) test data.
> >
> > This could be done by starting a non-committable transaction, making the
> > adjustments, then doing a pg_dump in the same transaction (using --snapshot
> > to allow it to see that transaction). Then rollback, and repeat for the
> > other versions. This saves repeatedly restoring the (probably very large)
> > production data each time.
>
> There already are tools to handle those use cases.  Looks for instance at
> https://github.com/mla/pg_sample to backup a consistent subset of the data, or
> https://github.com/rjuju/pg_anonymize to transparently pg_dump (or
> interactively query) anonymized data.
>
> Both tool also works when connected on a physical standby, while trying to
> update data before dumping them wouldn't.

Like everything in life, there are pros and cons to every approach.

pg_anonymize is an extension that may not be installed on the database
you're working with. And pg_sample (and similar utilities) may not
have a way to extract or sanitize the exact data you want.

With this feature built into Postgres, you'd not need any external
utilities or extensions. The benefits of features built into Postgres
are that the users can come up with ways of leveraging such a feature
in future in a way that we can't envision today.

Best regards,
Gurjeet
http://Gurje.et




Re: Mark a transaction uncommittable

2023-06-05 Thread Gurjeet Singh
On Mon, Jun 5, 2023 at 12:32 AM Laurenz Albe  wrote:
>
> On Mon, 2023-06-05 at 00:22 -0700, Gurjeet Singh wrote:
> > On Sat, Apr 22, 2023 at 8:01 AM Gurjeet Singh  wrote:
> > >
> > > This is a proposal for a new transaction characteristic. I haven't
> > > written any code, yet, and am interested in hearing if others may find
> > > this feature useful.
> >
> > Please see attached the patch that introduces this new feature.
>
> Can you explain why *you* would find this feature useful?

The idea came to me while I was reading a blog post, where the author
had to go to great lengths to explain to the reader why the queries
would be disastrous, if run on production database, and that they
should run those queries inside a transaction, and they _must_
rollback the transaction.

Having written my fair share of tutorials, and blogs, I know how
helpful it would be to start a transaction that the reader can't
accidentally commit.

As others have noted in this thread, this feature can be useful in
other situations, as well, like when you're trying to export a
sanitized copy of a production database. Especially in such a
situation you do not want those sanitization operations to ever be
committed on the source database.

Best regards,
Gurjeet
http://Gurje.et




Re: Mark a transaction uncommittable

2023-06-05 Thread Laurenz Albe
On Mon, 2023-06-05 at 00:22 -0700, Gurjeet Singh wrote:
> On Sat, Apr 22, 2023 at 8:01 AM Gurjeet Singh  wrote:
> > 
> > This is a proposal for a new transaction characteristic. I haven't
> > written any code, yet, and am interested in hearing if others may find
> > this feature useful.
> 
> Please see attached the patch that introduces this new feature.

Can you explain why *you* would find this feature useful?

Yours,
Laurenz Albe




Re: Mark a transaction uncommittable

2023-06-05 Thread Gurjeet Singh
On Sat, Apr 22, 2023 at 8:01 AM Gurjeet Singh  wrote:
>
> This is a proposal for a new transaction characteristic. I haven't
> written any code, yet, and am interested in hearing if others may find
> this feature useful.

Please see attached the patch that introduces this new feature. The
patch includes all the code changes that I could foresee; that is, it
includes server changes as well as changes for psql's auto-completion.
The patch does not include doc changes, or any tests. Please see a
sample session at the end of the email, though.

The patch introduces a new keyword, COMMITTABLE, and uses that to
introduce the new transaction attribute via BEGIN [TRANSACTION] and
SET TRANSACTION commands. The new code follows the semantics of the
[NOT] DEFERRABLE attribute of a transaction almost exactly.

> Many a times we start a transaction that we never intend to commit;
> for example, for testing, or for EXPLAIN ANALYZE, or after detecting
> unexpected results but still interested in executing more commands
> without risking commit,  etc.
>
> A user would like to declare their intent to eventually abort the
> transaction as soon as possible, so that the transaction does not
> accidentally get committed.
>
> This feature would allow the user to mark a transaction such that it
> can never be committed. We must allow such marker to be placed when
> the transaction is being started, or while it's in progress.

The user can mark the transaction as uncommittable either when
starting the transaction, or while it is still in progress.

> Once marked uncommittable, do not allow the marker to be removed.
> Hence, once deemed uncommittable, the transaction cannot be committed,
> even intentionally. This protects against cases where one script
> includes another (e.g. psql's \i command), and the included script may
> have statements that turn this marker back on.

Although the patch implements this desired behavior from the initial
proposal, I'm no longer convinced of the need to prevent user from
re-enabling committability of the transaction.

> Any command that ends a transaction (END, COMMIT, ROLLBACK) must
> result in a rollback.
>
> All of these properties seem useful for savepoints, too. But I want to
> focus on just the top-level transactions, first.

The patch does not change any behaviour related to savepoints. Having
made it work for the top-level transaction, and having seen the
savepoint/subtransaction code as I came across it as I developed this
patch, I feel that it will be very tricky to implement this behavior
safely for savepoints. Moreover, having thought more about the
possible use cases, I don't think implementing uncommittability of
savepoints will be of much use in the real world.

> I feel like the BEGIN and SET TRANSACTION commands would be the right
> places to introduce this feature.
>
> BEGIN [ work | transaction ] [ [ NOT ] COMMITTABLE ];
> SET TRANSACTION [ [ NOT ] COMMITTABLE ];

I tried to avoid adding a new keyword (COMMITTABLE) to the grammar,
but could not think of a better alternative. E.g. DISALLOW COMMIT
sounds like a good alternative, but DISALLOW is currently not a
keyword, so this form doesn't buy us anything.

> I'm not yet sure if the COMMIT AND CHAIN command should carry this
> characteristic to the next transaction.

After a little consideration, in the spirit of POLA, I have not done
anything special to change the default behaviour of COMMIT/ROLLBACK
AND CHAIN.

Any feedback is welcome. Please see below an example session
demonstrating this feature.

postgres=# begin transaction committable;
BEGIN

postgres=# commit;
COMMIT

postgres=# begin transaction not committable;
BEGIN

postgres=# commit;
WARNING:  transaction is not committable
ROLLBACK

postgres=# begin transaction not committable;
BEGIN

postgres=# set transaction_committable = true;
-- for clarity, we may want to emit additional "WARNING:  cannot make
transaction committable", although the patch currently doesn't do so.
ERROR:  invalid value for parameter "transaction_committable": 1

postgres=# commit;
ROLLBACK

postgres=# begin transaction not committable;
BEGIN

postgres=# set transaction committable ;
-- for clarity, we may want to emit additional "WARNING:  cannot make
transaction committable", although the patch currently doesn't do so.
ERROR:  invalid value for parameter "transaction_committable": 1

postgres=# set transaction committable ;
ERROR:  current transaction is aborted, commands ignored until end of
transaction block

postgres=# commit;
ROLLBACK

Best regards,
Gurjeet
http://Gurje.et


v1-0001-Allow-user-to-mark-a-transaction-uncommittable.patch
Description: Binary data


Re: Mark a transaction uncommittable

2023-04-22 Thread Julien Rouhaud
Hi,

On Sat, Apr 22, 2023 at 12:53:23PM -0400, Isaac Morland wrote:
>
> I have an application for this: creating various dev/test versions of data
> from production.
>
> Start by restoring a copy of production from backup. Then successively
> create several altered versions of the data and save them to a place where
> developers can pick them up. For example, you might have one version which
> has all data old than 1 year deleted, and another where 99% of the
> students/customers/whatever are deleted. Anonymization could also be
> applied. This would give you realistic (because it ultimately originates
> from production) test data.
>
> This could be done by starting a non-committable transaction, making the
> adjustments, then doing a pg_dump in the same transaction (using --snapshot
> to allow it to see that transaction). Then rollback, and repeat for the
> other versions. This saves repeatedly restoring the (probably very large)
> production data each time.

There already are tools to handle those use cases.  Looks for instance at
https://github.com/mla/pg_sample to backup a consistent subset of the data, or
https://github.com/rjuju/pg_anonymize to transparently pg_dump (or
interactively query) anonymized data.

Both tool also works when connected on a physical standby, while trying to
update data before dumping them wouldn't.




Re: Mark a transaction uncommittable

2023-04-22 Thread Isaac Morland
On Sat, 22 Apr 2023 at 11:01, Gurjeet Singh  wrote:

> This is a proposal for a new transaction characteristic. I haven't
> written any code, yet, and am interested in hearing if others may find
> this feature useful.
>
> Many a times we start a transaction that we never intend to commit;
> for example, for testing, or for EXPLAIN ANALYZE, or after detecting
> unexpected results but still interested in executing more commands
> without risking commit,  etc.
>
> A user would like to declare their intent to eventually abort the
> transaction as soon as possible, so that the transaction does not
> accidentally get committed.
>

I have an application for this: creating various dev/test versions of data
from production.

Start by restoring a copy of production from backup. Then successively
create several altered versions of the data and save them to a place where
developers can pick them up. For example, you might have one version which
has all data old than 1 year deleted, and another where 99% of the
students/customers/whatever are deleted. Anonymization could also be
applied. This would give you realistic (because it ultimately originates
from production) test data.

This could be done by starting a non-committable transaction, making the
adjustments, then doing a pg_dump in the same transaction (using --snapshot
to allow it to see that transaction). Then rollback, and repeat for the
other versions. This saves repeatedly restoring the (probably very large)
production data each time.

What I’m not sure about is how long it takes to rollback a transaction. I'm
assuming that it’s very quick compared to restoring from backup.

It would be nice if pg_basebackup could also have the --snapshot option.


Mark a transaction uncommittable

2023-04-22 Thread Gurjeet Singh
This is a proposal for a new transaction characteristic. I haven't
written any code, yet, and am interested in hearing if others may find
this feature useful.

Many a times we start a transaction that we never intend to commit;
for example, for testing, or for EXPLAIN ANALYZE, or after detecting
unexpected results but still interested in executing more commands
without risking commit,  etc.

A user would like to declare their intent to eventually abort the
transaction as soon as possible, so that the transaction does not
accidentally get committed.

This feature would allow the user to mark a transaction such that it
can never be committed. We must allow such marker to be placed when
the transaction is being started, or while it's in progress.

Once marked uncommittable, do not allow the marker to be removed.
Hence, once deemed uncommittable, the transaction cannot be committed,
even intentionally. This protects against cases where one script
includes another (e.g. psql's \i command), and the included script may
have statements that turn this marker back on.

Any command that ends a transaction (END, COMMIT, ROLLBACK) must
result in a rollback.

All of these properties seem useful for savepoints, too. But I want to
focus on just the top-level transactions, first.

I feel like the BEGIN and SET TRANSACTION commands would be the right
places to introduce this feature.

BEGIN [ work | transaction ] [ [ NOT ] COMMITTABLE ];
SET TRANSACTION [ [ NOT ] COMMITTABLE ];

I'm not yet sure if the COMMIT AND CHAIN command should carry this
characteristic to the next transaction.

Thoughts?

Best regards,
Gurjeet https://Gurje.et
http://aws.amazon.com