Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

2017-08-27 Thread Fabien COELHO


Hello,

Here is the third version of the patch for pgbench, incorporating Fabien Coelho's 
comments. As in the previous one, transactions with serialization and 
deadlock failures are rolled back and retried until they end successfully or 
their number of tries reaches the maximum.


Here is some partial review.

Patch applies cleanly.

It compiles with warnings, please fix them:

  pgbench.c:2624:28: warning: ‘failure_status’ may be used uninitialized in 
this function
  pgbench.c:2697:34: warning: ‘command’ may be used uninitialized in this 
function

I do not think that the error handling feature needs preeminence in the
final report, compared to scale, number of clients and so on. The number
of tries should be reported further down.

I would spell "number of tries" instead of "tries number" which seems to
suggest that each try is attributed a number. "sql" -> "SQL".

For the per-statement latency final report, I do not think it is worth 
distinguishing the kind of retry at this level, because ISTM that 
serialization & deadlock failures are unlikely to appear simultaneously. I would 
just report total failures and total tries in this report. We only have 2 
errors now, but if more are added I'm pretty sure that we would not want 
more columns... Moreover, the 25-character alignment is ugly; 
better to use a much smaller alignment.


I'm okay with having details shown in the "log to file" group report.
The documentation does not seem consistent. It discusses "the very last fields"
and seems to suggest that there are two, but the example trace below just
adds one field.

If you want a paragraph you should add <para>; skipping a line does not
work (around "All values are computed for ...").

I do not understand the second note of the --max-tries documentation.
It seems to suggest that some scripts may not end their own transactions...
which should be an error in my opinion? Some explanations would be welcome.

I'm not sure that "Retries" deserves a type of its own for two counters.
The "retries" in RetriesState may be redundant with these.
The failures are counted with simple counters while retries have a type,
which is not consistent. I suggest just using simple counters everywhere.

I'm ok with having the detail report tell about failures & retries only
when some occurred.

typo: sucessufully -> successfully

If a native English speaker could provide an opinion on that, and more
generally review the whole documentation, it would be great.

I think that the rand functions should really take a random_state pointer
argument, not a Thread or Client.
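
To be concrete, here is a minimal sketch of what I mean (the struct and
field names are mine, not the patch's), relying on pg_erand48() from
src/port and the int64 typedef from postgres_fe.h:

  typedef struct RandomState
  {
      unsigned short xseed[3];    /* state for pg_erand48() */
  } RandomState;

  /* uniform random int64 in [min, max], driven by an explicit random state */
  static int64
  getrand(RandomState *random_state, int64 min, int64 max)
  {
      return min + (int64) ((max - min + 1) * pg_erand48(random_state->xseed));
  }

Both TState and CState could then simply embed a RandomState and pass it
explicitly, which also makes saving and restoring it on retries trivial.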

I'm at odds that FailureStatus does not have a clean NO_FAILURE state,
and that it is merged with misc failures.
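
Something along these lines is what I would expect (a sketch only, the
names are illustrative):

  typedef enum FailureStatus
  {
      NO_FAILURE = 0,             /* nothing went wrong */
      SERIALIZATION_FAILURE,      /* SQLSTATE 40001 */
      DEADLOCK_FAILURE,           /* SQLSTATE 40P01 */
      MISC_FAILURE                /* any other error, never retried */
  } FailureStatus;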

I'm not sure that initRetries, mergeRetries, getAllRetries really
deserve a function.

I do not think that there should be two accum functions. Just extend
the existing one, and adding zero to zero is not a problem.
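
For instance, a single accumulator with plain counters could look like the
sketch below (field names are illustrative; it reuses pgbench's existing
SimpleStats and addToSimpleStats, and bool/int64 as in pgbench.c):

  typedef struct StatsData
  {
      int64       cnt;            /* number of transactions */
      int64       skipped;        /* skipped under --rate --latency-limit */
      int64       retries;        /* serialization + deadlock retries */
      int64       failures;       /* transactions that failed even after retrying */
      SimpleStats latency;
      SimpleStats lag;
  } StatsData;

  static void
  accumStats(StatsData *stats, bool skipped, int64 retries, int64 failures,
             double lat, double lag)
  {
      stats->cnt++;
      stats->retries += retries;      /* adding zero costs nothing when retries are off */
      stats->failures += failures;

      if (skipped)
      {
          stats->skipped++;
          return;
      }
      addToSimpleStats(&stats->latency, lat);
      addToSimpleStats(&stats->lag, lag);
  }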

I guess that in the end pgbench & psql variables will have to be merged
if pgbench expression engine is to be used by psql as well, but this is
not linked to this patch.

The TAP tests seem over-complicated and heavy, with two pgbench runs in
parallel... I'm not sure we really want all that complexity for this
somewhat small feature. Moreover, pgbench can run several scripts, so I'm not
sure why two pgbench invocations would be needed. Could something much
simpler and lighter be proposed instead to test the feature?

The added code does not conform to the PostgreSQL C style. For instance, the
if brace should be aligned with the if. Please conform to the project style.
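
For reference, the expected layout is braces on their own lines, aligned
with the controlling statement (the identifiers below are just for
illustration):

  if (failure_status != NO_FAILURE)
  {
      command->failures++;
      st->state = CSTATE_RETRY;
  }
  else
      st->state = CSTATE_END_COMMAND;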

The is_transaction_block_end check seems simplistic. ISTM that it would not
work with compound commands. This limitation should be clearly documented somewhere.
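
As a very rough idea of a lighter check, something like the sketch below
could avoid the space normalization; the function name is mine, it is
deliberately crude, and it still ignores compound "\;" commands and
SAVEPOINT, so it is only a starting point (pg_strncasecmp comes from
src/port, bool from postgres_fe.h as in pgbench.c):

  #include <ctype.h>

  /* does this statement end an explicit transaction block? */
  static bool
  statementEndsTransactionBlock(const char *sql)
  {
      while (isspace((unsigned char) *sql))
          sql++;

      /* note: "ROLLBACK TO SAVEPOINT" would need to be excluded explicitly */
      return pg_strncasecmp(sql, "commit", 6) == 0 ||
             pg_strncasecmp(sql, "end", 3) == 0 ||
             pg_strncasecmp(sql, "rollback", 8) == 0;
  }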

Also find attached two scripts I used for some testing:

  psql < dl_init.sql
  pgbench -f dl_trans.sql -c 8 -T 10 -P 1

--
Fabien.

dl_init.sql
Description: application/sql


dl_trans.sql
Description: application/sql



Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

2017-08-16 Thread Marina Polyakova

Hi,


Hello!


Just had a need for this feature, and took this to a short test
drive. So some comments:
- it'd be useful to display a retry percentage of all transactions,
  similar to what's displayed for failed transactions.



- it'd be useful to also conveniently display the number of retried
  transactions, rather than the total number of retries.


Ok!


- it appears that we now unconditionally do not disregard a connection
  after a serialization / deadlock failure. Good. But that's useful far
  beyond just deadlocks / serialization errors, and should probably be 
exposed.


I agree that it will be useful. But how do you propose to print the 
results if there are many types of errors? I'm afraid that the progress 
report could become very long, although it is expected to be rather 
short [1]. The per-statement report can also become very long...



Nice feature!


Thanks and thank you for your comments :)

[1] 
https://www.postgresql.org/message-id/alpine.DEB.2.20.1707121142300.12795%40lancre


--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

2017-08-16 Thread Alexander Korotkov
On Fri, Aug 11, 2017 at 10:50 PM, Andres Freund  wrote:

> On 2017-07-21 19:32:02 +0300, Marina Polyakova wrote:
> > Here is the third version of the patch for pgbench, incorporating Fabien
> > Coelho's comments. As in the previous one, transactions with serialization
> > and deadlock failures are rolled back and retried until they end
> > successfully or their number of tries reaches the maximum.
>
> Just had a need for this feature, and took this to a short test
> drive. So some comments:
> - it'd be useful to display a retry percentage of all transactions,
>   similar to what's displayed for failed transactions.
> - it appears that we now unconditionally do not disregard a connection
>   after a serialization / deadlock failure. Good. But that's useful far
>   beyond just deadlocks / serialization errors, and should probably be
> exposed.
>

Yes, it would be nice not to disregard a connection after other errors
too.  However, I'm not sure whether we should retry the *same* transaction on
errors beyond deadlocks / serialization errors.  For example, in case of a
division-by-zero or unique-violation error it would be more natural to give
up on the current transaction and continue with the next one.
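
For what it's worth, the distinction is easy to make from the error's
SQLSTATE; a self-contained sketch (the helper name is mine, not from the
patch):

  #include <stdbool.h>
  #include <string.h>
  #include "libpq-fe.h"

  /*
   * Only serialization failures (40001) and deadlocks (40P01) are worth
   * retrying with the same transaction.  For other errors, e.g. 23505
   * unique_violation or 22012 division_by_zero, it seems more natural to
   * give up on the current transaction and continue with the next one.
   */
  static bool
  errorIsRetryable(const PGresult *res)
  {
      const char *sqlstate = PQresultErrorField(res, PG_DIAG_SQLSTATE);

      return sqlstate != NULL &&
             (strcmp(sqlstate, "40001") == 0 ||
              strcmp(sqlstate, "40P01") == 0);
  }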

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

2017-08-11 Thread Andres Freund
Hi,

On 2017-07-21 19:32:02 +0300, Marina Polyakova wrote:
> Here is the third version of the patch for pgbench thanks to Fabien Coelho
> comments. As in the previous one, transactions with serialization and
> deadlock failures are rolled back and retried until they end successfully or
> their number of tries reaches maximum.

Just had a need for this feature, and took this to a short test
drive. So some comments:
- it'd be useful to display a retry percentage of all transactions,
  similar to what's displayed for failed transactions.
- it appears that we now unconditionally do not disregard a connection
  after a serialization / deadlock failure. Good. But that's useful far
  beyond just deadlocks / serialization errors, and should probably be exposed.
- it'd be useful to also conveniently display the number of retried
  transactions, rather than the total number of retries.

Nice feature!

- Andres




Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

2017-07-21 Thread Marina Polyakova

Hello again!

Here is the third version of the patch for pgbench, incorporating Fabien 
Coelho's comments. As in the previous one, transactions with serialization 
and deadlock failures are rolled back and retried until they end 
successfully or their number of tries reaches the maximum.


Differences from the previous version:
* Some code cleanup :) In particular, the Variables structure for 
managing client variables, and only one new TAP test file (as 
recommended here [1] and here [2]).
* There's no error if the last transaction in the script is not 
completed. But transactions started in previous scripts and/or not 
ended in the current script are not rolled back and retried after a 
failure. Such a script try is reported as failed because it contains a 
failure that was not rolled back and retried.
* Usually the retries and/or failures are printed only if they are 
non-zero. In the transaction/aggregation logs the failures are always 
printed, and the retries are printed if max_tries is greater than 1. This 
is done to keep the log format consistent throughout the execution of the 
program.


Patch is attached. Any suggestions are welcome!

[1] 
https://www.postgresql.org/message-id/alpine.DEB.2.20.1707121338090.12795%40lancre
[2] 
https://www.postgresql.org/message-id/alpine.DEB.2.20.1707121142300.12795%40lancre


--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

From 0ee372e93b8d7017563d2f2f55357c39c08a Mon Sep 17 00:00:00 2001
From: Marina Polyakova 
Date: Fri, 21 Jul 2017 17:57:58 +0300
Subject: [PATCH v3] Pgbench Retry transactions with serialization or deadlock
 errors

Now transactions with serialization or deadlock failures can be rolled back and
retried again and again until they end successfully or their number of tries
reaches the maximum. You can set the maximum number of tries by using the
appropriate benchmarking option (--max-tries). The default value is 1. If
there are retries and/or failures, their statistics are printed in the progress
report, in the transaction / aggregation logs and at the end with the other
results (overall and for each script). A transaction failure is reported here
only if the last try of this transaction fails. Also, retries and/or failures
are printed per command with average latencies if you use the appropriate
benchmarking option (--report-per-command, -r) and the total number of retries
and/or failures is not zero.

Note that transactions started in previous scripts and/or not ended in the
current script are not rolled back and retried after a failure. Such a script
try is reported as failed because it contains a failure that was not rolled
back and retried.
---
 doc/src/sgml/ref/pgbench.sgml  | 240 +-
 src/bin/pgbench/pgbench.c  | 872 +
 .../t/002_serialization_and_deadlock_failures.pl   | 459 +++
 3 files changed, 1412 insertions(+), 159 deletions(-)
 create mode 100644 src/bin/pgbench/t/002_serialization_and_deadlock_failures.pl

diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 64b043b..3bbeec5 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -49,6 +49,7 @@
 
 
 transaction type: builtin: TPC-B (sort of)
+transaction maximum tries number: 1
 scaling factor: 10
 query mode: simple
 number of clients: 10
@@ -59,7 +60,7 @@ tps = 85.184871 (including connections establishing)
 tps = 85.296346 (excluding connections establishing)
 
 
-  The first six lines report some of the most important parameter
+  The first seven lines report some of the most important parameter
   settings.  The next line reports the number of transactions completed
   and intended (the latter being just the product of number of clients
   and number of transactions per client); these will be equal unless the run
@@ -436,22 +437,33 @@ pgbench  options  dbname
 Show progress report every sec seconds.  The report
 includes the time since the beginning of the run, the tps since the
 last report, and the transaction latency average and standard
-deviation since the last report.  Under throttling (-R),
-the latency is computed with respect to the transaction scheduled
-start time, not the actual transaction beginning time, thus it also
-includes the average schedule lag time.
+deviation since the last report.  If since the last report there are
+transactions that ended with serialization/deadlock failures they are
+also reported here as failed (see
+ for more information).  Under
+throttling (-R), the latency is computed with respect to the
+transaction scheduled start time, not the actual transaction beginning
+time, thus it also includes the average schedule lag time.  If since the
+last report there're transactions that have been rolled back and retried
+  

Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

2017-07-14 Thread Marina Polyakova

Well, the short version may be to only do a full transaction retry and
to document that for now savepoints are not handled, and to leave that
for future work if the need arises.


I agree with you.


For progress the output must be short and readable, and probably we do
not care about whether retries came from this or that, so I would leave
that out.

For log and aggregated log possibly that would make more sense, but it
must stay easy to parse.


Ok!

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

2017-07-14 Thread Marina Polyakova

Ok, fine. My point was just to check before proceeding.


And I'm very grateful for that :)

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

2017-07-14 Thread Fabien COELHO


And I'm not sure that we should do all the stuff for savepoint rollbacks 
because:

- as I see it now, it only makes sense for deadlock failures;
- if there's a failure, which savepoint should we roll back to and restart 
the execution from?


ISTM that this is the point of having a savepoint in the first place: the 
ability to restart the transaction at that point if something failed?


Maybe go to the last one; if that is not successful, go to the previous 
one, etc. Retrying the entire transaction may take less time...


Well, I do not know that. My 0.02 € is that if there was a savepoint, then 
it is naturally the restarting point of a transaction which hit some 
recoverable error.


Well, the short version may be to only do a full transaction retry, to 
document that for now savepoints are not handled, and to leave that for 
future work if the need arises.



Maybe something like:
  ...
  number of failures: 12 (0.004%)
  number of retries: 64 (deadlocks: 29, serialization: 35)


Ok! How do you like the idea of using the same format (the total number of 
transactions with failures and the number of retries for each failure type) 
in other places (log, aggregation log, progress) if the values are not 
"default" (= no failures and no retries)?


For progress the output must be short and readable, and probably we do not 
care about whether retries came from this or that, so I would leave that 
out.


For log and aggregated log possibly that would make more sense, but it 
must stay easy to parse.


--
Fabien.


Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

2017-07-14 Thread Fabien COELHO



If the answer is no, then implement something in pgbench directly.


The structure of variables is different, the container structure of the 
variables is different, so I think that the answer is no.


Ok, fine. My point was just to check before proceeding.

--
Fabien.




Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

2017-07-14 Thread Marina Polyakova

Given that the number of variables of a pgbench script is expected to
be pretty small, I'm not sure that the sorting stuff is worth the
effort.


I think it is good insurance in case there are many variables...


My suggestion is really to look at both implementations and to answer
the question "should pgbench share its variable implementation with
psql?".

If the answer is yes, then the relevant part of the implementation
should be moved to fe_utils, and that's it.

If the answer is no, then implement something in pgbench directly.


The structure of variables is different, the container structure of the 
variables is different, so I think that the answer is no.


--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

2017-07-14 Thread Marina Polyakova

Not necessarily? It depends on where the locks triggering the issue
are set, if they are all set after the savepoint it could work on a
second attempt.


Don't you mean the deadlock failures, where a rollback to a savepoint can
really help?


Yes, I mean deadlock failures can roll back to a savepoint and work on
a second attempt.

And could you please give an example where a rollback to a savepoint 
can help to end its subtransaction successfully after a serialization 
failure?


I do not know whether this is possible with serialization 
failures.
It might be if the stuff before and after the savepoint is somehow 
unrelated...


If you mean, for example, the updates of different tables - a rollback 
to savepoint doesn't help.


And I'm not sure that we should do all the stuff for savepoint 
rollbacks because:

- as I see it now, it only makes sense for deadlock failures;
- if there's a failure, which savepoint should we roll back to and 
restart the execution from? Maybe go to the last one; if that is not 
successful, go to the previous one, etc.

Retrying the entire transaction may take less time...

[...] I mean that the sum of transactions with serialization failure 
and transactions with deadlock failure can be greater than the total 
number of transactions with failures.


Hmmm. Ok.

A "failure" is a transaction (in the sense of pgbench) that could not
made it to the end, even after retries. If there is a rollback and the
a retry which works, it is not a failure.

Now deadlock or serialization errors, which trigger retries, are worth
counting as well, although they are not "failures". So my format
proposal was over optimistic, and the number of deadlocks and
serializations should better be on a retry count line.

Maybe something like:
  ...
  number of failures: 12 (0.004%)
  number of retries: 64 (deadlocks: 29, serialization: 35)


Ok! How do you like the idea of using the same format (the total number of 
transactions with failures and the number of retries for each failure 
type) in other places (log, aggregation log, progress) if the values are 
not "default" (= no failures and no retries)?


--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

2017-07-14 Thread Fabien COELHO


Note that there is something for psql (src/bin/psql/variable.c) which 
may or may not be shared. It should be checked before recoding 
eventually the same thing.


Thank you very much for pointing out this file! As I checked, it is another 
structure: there it's a simple list, while in pgbench we need to know 
whether the list is sorted and the number of elements in the list. What do 
you think: is it a good idea to name the variables structure in pgbench in 
the same way (VariableSpace), or should it be named differently to avoid 
confusion (Variables, for example)?


Given that the number of variables of a pgbench script is expected to be 
pretty small, I'm not sure that the sorting stuff is worth the effort.


My suggestion is really to look at both implementations and to answer the 
question "should pgbench share its variable implementation with psql?".


If the answer is yes, then the relevant part of the implementation should 
be moved to fe_utils, and that's it.


If the answer is no, then implement something in pgbench directly.

--
Fabien.




Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

2017-07-14 Thread Fabien COELHO


Hello Marina,


Not necessarily? It depends on where the locks triggering the issue
are set, if they are all set after the savepoint it could work on a
second attempt.


Don't you mean the deadlock failures, where a rollback to a savepoint can
really help?


Yes, I mean deadlock failures can roll back to a savepoint and work on a 
second attempt.


And could you please give an example where a rollback to a savepoint can 
help to end its subtransaction successfully after a serialization 
failure?


I do not know whether this is possible with serialization failures.
It might be if the stuff before and after the savepoint is somehow 
unrelated...


[...] I mean that the sum of transactions with serialization failure and 
transactions with deadlock failure can be greater than the total number 
of transactions with failures.


Hmmm. Ok.

A "failure" is a transaction (in the sense of pgbench) that could not made 
it to the end, even after retries. If there is a rollback and the a retry 
which works, it is not a failure.


Now deadlock or serialization errors, which trigger retries, are worth 
counting as well, although they are not "failures". So my format proposal 
was over optimistic, and the number of deadlocks and serializations should 
better be on a retry count line.


Maybe something like:
  ...
  number of failures: 12 (0.004%)
  number of retries: 64 (deadlocks: 29, serialization: 35)
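
In pgbench terms this could be as small as the following sketch called from
the final report code (the counter names are invented here; nothing is
printed when the feature was not exercised):

  static void
  printRetryReport(int64 xacts, int64 failures, int64 retries,
                   int64 deadlock_retries, int64 serialization_retries)
  {
      if (failures > 0)
          printf("number of failures: " INT64_FORMAT " (%.3f%%)\n",
                 failures, 100.0 * failures / xacts);
      if (retries > 0)
          printf("number of retries: " INT64_FORMAT
                 " (deadlocks: " INT64_FORMAT ", serialization: " INT64_FORMAT ")\n",
                 retries, deadlock_retries, serialization_retries);
  }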

--
Fabien.




Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

2017-07-14 Thread Marina Polyakova

I was not convinced by the overall memory management around variables
to begin with, and it is even less so with their new copy management.
Maybe having a clean "Variables" data structure could help improve the
situation.


Note that there is something for psql (src/bin/psql/variable.c) which
may or may not be shared. It should be checked before recoding
eventually the same thing.


Thank you very much for pointing out this file! As I checked, it is another 
structure: there it's a simple list, while in pgbench we need to know 
whether the list is sorted and the number of elements in the list. What do 
you think: is it a good idea to name the variables structure in pgbench in 
the same way (VariableSpace), or should it be named differently to avoid 
confusion (Variables, for example)?


--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

2017-07-14 Thread Marina Polyakova

On 13-07-2017 19:32, Fabien COELHO wrote:

Hello,


Hi!

[...] I didn't make rollbacks to savepoints after the failure because 
they cannot help for serialization failures at all: after a rollback to a 
savepoint, a new attempt will always be unsuccessful.


Not necessarily? It depends on where the locks triggering the issue
are set, if they are all set after the savepoint it could work on a
second attempt.


Don't you mean the deadlock failures, where a rollback to a savepoint can 
really help? And could you please give an example where a rollback to a 
savepoint can help to end its subtransaction successfully after a 
serialization failure?


"SimpleStats attempts": I disagree with using this floating poiunt 
oriented structures to count integers. I would suggest "int64 tries" 
instead, which should be enough for the purpose.


I'm not sure that it is enough. Firstly, there may be several transactions 
in a script, so to compute the average number of attempts you need to know 
the total number of executed transactions. Secondly, I think that the 
stddev of the number of attempts can be quite interesting, and often it is 
not close to zero.


I would prefer to have a real motivation to add this complexity in the
report and in the code. Without that, a simple int seems better for
now. It can be improved later if the need really arises.


Ok!


Some variables, such as "int attempt_number", should be in the client
structure, not in the client? Generally, try to use block variables if
possible to keep the states clearly disjoint. If there could be NO new
variable at the doCustom level that would be great, because that would
ensure that there is no machine state mixup hidden in these variables.


Do you mean the code cleanup for doCustom function? Because if I do so 
there will be two code styles for state blocks and their variables in 
this function..


I think that any variable shared between states is a recipe for bugs
if it is not reset properly, so they should be avoided. Maybe there
are already too many of them, then too bad, not a reason to add more.
The status before the automaton was a nightmare.


Ok!


I would suggest a more compact one-line report about failures:

  "number of failures: 12 (0.001%, deadlock: 7, serialization: 5)"


I think there may be a misunderstanding, because a script can contain 
several transactions and get both failures.


I do not understand. Both failure numbers are on the compact line I 
suggested.


I mean that the sum of transactions with serialization failure and 
transactions with deadlock failure can be greater than the total number 
of transactions with failures. But if you think it's OK, I'll change it 
and write the appropriate note in the documentation.


--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

2017-07-13 Thread Fabien COELHO



I was not convinced by the overall memory management around variables
to begin with, and it is even less so with their new copy management.
Maybe having a clean "Variables" data structure could help improve the
situation.


Ok!


Note that there is something for psql (src/bin/psql/variable.c) which may 
or may not be shared. It should be checked before recoding eventually the 
same thing.


--
Fabien.




Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

2017-07-13 Thread Fabien COELHO


Hello,

[...] I didn't make rollbacks to savepoints after the failure because 
they cannot help for serialization failures at all: after a rollback to a 
savepoint, a new attempt will always be unsuccessful.


Not necessarily? It depends on where the locks triggering the issue are 
set, if they are all set after the savepoint it could work on a second 
attempt.


"SimpleStats attempts": I disagree with using this floating poiunt 
oriented structures to count integers. I would suggest "int64 tries" 
instead, which should be enough for the purpose.


I'm not sure that it is enough. Firstly, there may be several transactions in 
a script, so to compute the average number of attempts you need to know the 
total number of executed transactions. Secondly, I think that the stddev of 
the number of attempts can be quite interesting, and often it is not close to zero.


I would prefer to have a real motivation to add this complexity in the 
report and in the code. Without that, a simple int seems better for now. 
It can be improved later if the need really arises.



Some variables, such as "int attempt_number", should be in the client
structure, not in the client? Generally, try to use block variables if
possible to keep the states clearly disjoint. If there could be NO new
variable at the doCustom level that would be great, because that would
ensure that there is no machine state mixup hidden in these variables.


Do you mean the code cleanup for doCustom function? Because if I do so there 
will be two code styles for state blocks and their variables in this 
function..


I think that any variable shared between states is a recipe for bugs if it 
is not reset properly, so they should be avoided. Maybe there are already 
too many of them, then too bad, not a reason to add more. The status 
before the automaton was a nightmare.



I wondering whether the RETRY & FAILURE states could/should be merged:


I divided these states because if there's a failed transaction block you 
should end it before retrying.


Hmmm. Maybe I'm wrong. I'll think about it.


I would suggest a more compact one-line report about failures:

  "number of failures: 12 (0.001%, deadlock: 7, serialization: 5)"


I think there may be a misunderstanding, because a script can contain several 
transactions and get both failures.


I do not understand. Both failure numbers are on the compact line I 
suggested.


--
Fabien.




Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

2017-07-13 Thread Marina Polyakova

Another detail I forgot about this point: there may be a memory leak
on variables copies, ISTM that the "variables" array is never freed.

I was not convinced by the overall memory management around variables
to begin with, and it is even less so with their new copy management.
Maybe having a clean "Variables" data structure could help improve the
situation.


Ok!

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

2017-07-13 Thread Marina Polyakova

Here are a round of comments on the current version of the patch:


Thank you very much again!


There is a latent issue about what is a transaction. For pgbench a
transaction is a full script execution.
For postgresql, it is a statement or a BEGIN/END block, several of
which may appear in a script. From a retry
perspective, you may retry from a SAVEPOINT within a BEGIN/END
block... I'm not sure how to make general sense
of all this, so this is just a comment without attached action for now.


Yes it is. That's why I wrote several notes about it in documentation 
where there may be a misunderstanding:


+Transactions with serialization or deadlock failures (or with both
+of them if used script contains several transactions; see
++endterm="transactions-and-scripts-title"> for more information) are
+marked separately and their time is not reported as for skipped
+transactions.

+ 
+  What is the Transaction Actually Performed in pgbench?

+If a transaction has serialization and/or deadlock failures, its
+   time will be reported as serialization failure,
+   deadlock failure, or
+   serialization and deadlock failures, respectively.
   
+  
+   
+ Transactions can have both serialization and deadlock failures if the
+ used script contained several transactions.  See
+  for more information.
+
+  

+  
+   
+The number of transactions attempts within the interval can be greater than
+the number of transactions within this interval multiplied by the maximum
+attempts number.  See  for more information.
+   
+  

+   
+ The total sum of per-command failures of each type can be greater
+ than the number of transactions with reported failures.
+ See + endterm="transactions-and-scripts-title"> for more information.
+ 
+   

And I didn't make rollbacks to savepoints after the failure because they 
cannot help for serialization failures at all: after a rollback to a 
savepoint, a new attempt will always be unsuccessful.



I would consider using "try/tries" instead of "attempt/attempts" as it
is shorter. An English native speaker
opinion would be welcome on that point.


Thank you, I'll change it.


I'm fine with renaming "is_latencies" to "report_per_command", which
is more logical & generic.


Glad to hear it!


"max_attempt_number": I'm against typing fields again in their name,
aka "hungarian naming". I'd suggest
"max_tries" or "max_attempts".


Ok!


"SimpleStats attempts": I disagree with using this floating poiunt
oriented structures to count integers.
I would suggest "int64 tries" instead, which should be enough for the 
purpose.


I'm not sure that it is enough. Firstly, there may be several transactions 
in a script, so to compute the average number of attempts you need to know 
the total number of executed transactions. Secondly, I think that the 
stddev of the number of attempts can be quite interesting, and often it is 
not close to zero.



LastBeginState -> RetryState? I'm not sure why this state is a pointer
in CState. Putting the struct would avoid malloc/free cycles. Index
"-1" may be used to tell it is not set if necessary.


Thanks, I agree that it's better to do in this way.

"CSTATE_RETRY_FAILED_TRANSACTION" -> "CSTATE_RETRY" is simpler and 
clear enough.


Ok!


In CState and some code, a failure is a failure, maybe one boolean
would be enough. It need only be differentiated when counting, and you
have (deadlock_failure || serialization_failure) everywhere.


I agree with you. I'll change it.


Some variables, such as "int attempt_number", should be in the client
structure, not in the client? Generally, try to use block variables if
possible to keep the states clearly disjoint. If there could be NO new
variable at the doCustom level that would be great, because that would
ensure that there is no machine state mixup hidden in these variables.


Do you mean the code cleanup for doCustom function? Because if I do so 
there will be two code styles for state blocks and their variables in 
this function..



I wondering whether the RETRY & FAILURE states could/should be merged:

  on RETRY:
-> count retry
-> actually retry if < max_tries (reset client state, jump to command)
-> else count failure and skip to end of script

The start and end of transaction detection seem expensive (malloc, ...)
and assume one statement per command (what about "BEGIN \; ...
\; COMMIT;"?), which is not necessarily the case; this limitation should
be documented. ISTM that the space normalization should be avoided,
and something simpler/lighter should be devised? Possibly it should
consider handling SAVEPOINT.


I divided these states because if there's a failed transaction block you 
should end it before retrying. It means to go to states 
CSTATE_START_COMMAND -> CSTATE_WAIT_RESULT -> CSTATE_END_COMMAND with 
the appropriate command. How do you propose not to go to these states?


About 

Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

2017-07-12 Thread Fabien COELHO


LastBeginState -> RetryState? I'm not sure why this state is a pointer in 
CState. Putting the struct would avoid malloc/free cycles. Index "-1" may be 
used to tell it is not set if necessary.


Another detail I forgot about this point: there may be a memory leak on 
variables copies, ISTM that the "variables" array is never freed.


I was not convinced by the overall memory management around variables to 
begin with, and it is even less so with their new copy management. Maybe 
having a clean "Variables" data structure could help improve the 
situation.


--
Fabien.




Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

2017-07-12 Thread Fabien COELHO


Hello Marina,

There's the second version of my patch for pgbench. Now transactions 
with serialization and deadlock failures are rolled back and retried 
until they end successfully or their number of attempts reaches maximum.



In detail:
 - You can set the maximum number of attempts with the appropriate 
benchmarking option (--max-attempts-number). Its default value is 1, 
partly because retrying shell commands can produce new errors.


 - Statistics of attempts and failures are printed in the progress report, 
in the transaction / aggregation logs and at the end with the other results 
(overall and for each script). A transaction failure is reported here only 
if the last retry of this transaction fails.


- Also, failures and the average number of transaction attempts are printed 
per command with average latencies if you use the appropriate 
benchmarking option (--report-per-command, -r) (it replaces the option 
--report-latencies, as I was advised here [1]). Average numbers of 
transaction attempts are printed only for commands which start 
transactions.


As usual: TAP tests for new functionality and changed documentation with 
new examples.


Here are a round of comments on the current version of the patch:

* About the feature

There is a latent issue about what is a transaction. For pgbench a transaction 
is a full script execution.
For postgresql, it is a statement or a BEGIN/END block, several of which may 
appear in a script. From a retry
perspective, you may retry from a SAVEPOINT within a BEGIN/END block... I'm not 
sure how to make general sense
of all this, so this is just a comment without attached action for now.

As the default is not to retry, which is the upward-compatible behavior, I 
think that the changes should not change the current output much, apart from 
counting the number of failures.

I would consider using "try/tries" instead of "attempt/attempts" as it is 
shorter. An English native speaker
opinion would be welcome on that point.

* About the code

ISTM that the code interacts significantly with various patches under review or 
ready for committers.
Not sure how to deal with that, there will be some rebasing work...

I'm fine with renaming "is_latencies" to "report_per_command", which is more 
logical & generic.

"max_attempt_number": I'm against typing fields again in their name, aka "hungarian 
naming". I'd suggest
"max_tries" or "max_attempts".

"SimpleStats attempts": I disagree with using this floating poiunt oriented 
structures to count integers.
I would suggest "int64 tries" instead, which should be enough for the 
purpose.


LastBeginState -> RetryState? I'm not sure why this state is a pointer in 
CState. Putting the struct would avoid malloc/free cycles. Index "-1" may 
be used to tell it is not set if necessary.


"CSTATE_RETRY_FAILED_TRANSACTION" -> "CSTATE_RETRY" is simpler and clear enough.

In CState and some code, a failure is a failure, maybe one boolean would 
be enough. It need only be differentiated when counting, and you have 
(deadlock_failure || serialization_failure) everywhere.


Some variables, such as "int attempt_number", should be in the client 
structure, not in the client? Generally, try to use block variables if 
possible to keep the states clearly disjoint. If there could be NO new 
variable at the doCustom level that would be great, because that would 
ensure that there is no machine state mixup hidden in these variables.


I wondering whether the RETRY & FAILURE states could/should be merged:

  on RETRY:
-> count retry
-> actually retry if < max_tries (reset client state, jump to command)
-> else count failure and skip to end of script
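
A rough sketch of what that merged handling could look like inside the
doCustom switch (CSTATE_START_COMMAND and CSTATE_END_TX are existing pgbench
states; CSTATE_RETRY, max_tries and the various st-> fields are illustrative
names, not necessarily the patch's):

  case CSTATE_RETRY:
      if (st->tries < max_tries)
      {
          /* retry: count it, restore the saved state, replay the transaction */
          st->retries++;
          st->tries++;
          st->random_state = st->retry_random_state;
          st->command = st->retry_command;
          st->state = CSTATE_START_COMMAND;
      }
      else
      {
          /* out of tries: count a failure and skip to the end of the script */
          st->failures++;
          st->state = CSTATE_END_TX;
      }
      break;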

The start and end of transaction detection seem expensive (malloc, ...) 
and assume one statement per command (what about "BEGIN \; ... \; 
COMMIT;"?), which is not necessarily the case; this limitation should be 
documented. ISTM that the space normalization should be avoided, and 
something simpler/lighter should be devised? Possibly it should consider 
handling SAVEPOINT.


I disagree about exiting in ParseScript if the transaction block is not 
completed, especially as it misses out on combined statements/queries 
("BEGIN \; stuff... \; COMMIT") and would break an existing feature.


There are strange character issues in comments, e.g. "??ontinuous".

Option "max-attempt-number" -> "max-tries"

I would put the client random state initialization with the state 
initialization, not with the connection.


* About tracing

Progress is expected to be short, not detailed. Only add the number of 
failures and retries if max retry is not 1.


* About reporting

I think that too much is reported. I advised to do that, but nevertheless 
it is a little bit steep.


At least, it should not report the number of tries/attempts when the max 
number is one. Simple counting should be reported for failures, not 
floats...


I would suggest a more compact one-line report about failures:

  "number of failures: 12 (0.001%, deadlock: 7, 

Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

2017-07-10 Thread Marina Polyakova

Hello everyone!

There's the second version of my patch for pgbench. Now transactions 
with serialization and deadlock failures are rolled back and retried 
until they end successfully or their number of attempts reaches the maximum.


In detail:
- You can set the maximum number of attempts with the appropriate 
benchmarking option (--max-attempts-number). Its default value is 1, 
partly because retrying shell commands can produce new errors.
- Statistics of attempts and failures are printed in the progress report, 
in the transaction / aggregation logs and at the end with the other results 
(overall and for each script). A transaction failure is reported here only 
if the last retry of this transaction fails.
- Also, failures and the average number of transaction attempts are printed 
per command with average latencies if you use the appropriate 
benchmarking option (--report-per-command, -r) (it replaces the option 
--report-latencies, as I was advised here [1]). Average numbers of 
transaction attempts are printed only for commands which start 
transactions.


As usual: TAP tests for new functionality and changed documentation with 
new examples.


Patch is attached. Any suggestions are welcome!

[1] 
https://www.postgresql.org/message-id/alpine.DEB.2.20.1707031321370.3419%40lancre


--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

From 58f51cdc896af801bcd35e495406655ca03aa6ce Mon Sep 17 00:00:00 2001
From: Marina Polyakova 
Date: Mon, 10 Jul 2017 13:33:41 +0300
Subject: [PATCH v2] Pgbench Retry transactions with serialization or deadlock
 errors

Now transactions with serialization or deadlock failures can be rolled back and
retried again and again until they end successfully or their number of attempts
reaches the maximum. You can set the maximum number of attempts by the
appropriate benchmarking option (--max-attempts-number). Its default value is 1.
Statistics of attempts and failures are printed in the progress report, in the
transaction / aggregation logs and at the end with the other results (overall
and for each script). A transaction failure is reported here only if the last
retry of this transaction fails. Also, failures and the average number of
transaction attempts are printed per command with average latencies if you use
the appropriate benchmarking option (--report-per-command, -r). Average numbers
of transaction attempts are printed only for commands which start transactions.
---
 doc/src/sgml/ref/pgbench.sgml  | 277 ++--
 src/bin/pgbench/pgbench.c  | 751 ++---
 src/bin/pgbench/t/002_serialization_errors.pl  | 121 
 src/bin/pgbench/t/003_deadlock_errors.pl   | 130 
 src/bin/pgbench/t/004_retry_failed_transactions.pl | 280 
 5 files changed, 1421 insertions(+), 138 deletions(-)
 create mode 100644 src/bin/pgbench/t/002_serialization_errors.pl
 create mode 100644 src/bin/pgbench/t/003_deadlock_errors.pl
 create mode 100644 src/bin/pgbench/t/004_retry_failed_transactions.pl

diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 64b043b..dc1daa9 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -49,22 +49,34 @@
 
 
 transaction type: builtin: TPC-B (sort of)
+transaction maximum attempts number: 1
 scaling factor: 10
 query mode: simple
 number of clients: 10
 number of threads: 1
 number of transactions per client: 1000
 number of transactions actually processed: 1/1
+number of transactions with serialization failures: 0 (0.000 %)
+number of transactions with deadlock failures: 0 (0.000 %)
+attempts number average = 1.00
+attempts number stddev = 0.00
 tps = 85.184871 (including connections establishing)
 tps = 85.296346 (excluding connections establishing)
 
 
-  The first six lines report some of the most important parameter
+  The first seven lines report some of the most important parameter
   settings.  The next line reports the number of transactions completed
   and intended (the latter being just the product of number of clients
   and number of transactions per client); these will be equal unless the run
   failed before completion.  (In -T mode, only the actual
   number of transactions is printed.)
+  The next four lines report the number of transactions with serialization and
+  deadlock failures, and also the statistics of transactions attempts.  With
+  such errors, transactions are rolled back and are repeated again and again
+  until they end sucessufully or their number of attempts reaches maximum (to
+  change this maximum see the appropriate benchmarking option
+  --max-attempts-number).  The transaction failure is reported here
+  only if the last retry of this transaction fails.
   The last two lines report the number of transactions per second,
   figured with and without counting the time to start database sessions.
  
@@ -434,24 +446,28 @@ pgbench  options  dbname
   

 

Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

2017-07-07 Thread Alexander Korotkov
On Thu, Jun 15, 2017 at 10:16 PM, Andres Freund  wrote:

> On 2017-06-14 11:48:25 +0300, Marina Polyakova wrote:
> > Advanced options:
> > - mostly for testing built-in scripts: you can set the default
> transaction
> > isolation level by the appropriate benchmarking option (-I);
>
> I'm less convinced of the need of htat, you can already set arbitrary
> connection options with
> PGOPTIONS='-c default_transaction_isolation=serializable' pgbench
>

Right, there is already a way to specify the default isolation level using
environment variables.
However, once we make pgbench work with various isolation levels, users may
want to run pgbench multiple times in a row with different isolation
levels.  A command line option would be very convenient in this case.
In addition, the isolation level is a vital parameter for interpreting benchmark
results correctly.  Often, graphs with pgbench results are titled with the
pgbench command line.  Having the isolation level specified in the command line
would naturally fit into this titling scheme.
Of course, this is solely a usability question and it's fair enough to live
without such a command line option.  But I'm +1 for adding this option.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

2017-07-03 Thread Fabien COELHO



The number of retries and maybe failures should be counted, maybe with
some adjustable maximum, as suggested.


If we fix the maximum number of attempts, the maximum number of failures 
for one script execution is bounded above by 
(number_of_transactions_in_script * maximum_number_of_attempts). Do you 
think we should add an option to the program to limit this number further?


Probably not. I think that there should be a configurable maximum of
retries on a transaction, which may be 0 by default if we want to be
upward compatible with the current behavior, or maybe something else.


I propose the option --max-attempts-number=NUM, where NUM cannot be less than 
1. I propose it because I think that, for example, --max-attempts-number=100 
is better than --max-retries-number=99. And maybe it's better to set its 
default value to 1 too, because retrying shell commands can produce new 
errors...


Personally, I like counting retries because it also counts the number of 
times the transaction actually failed for some reason. But this is a 
marginal preference, and one can be switched to the other easily.


--
Fabien.




Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

2017-07-03 Thread Marina Polyakova

The current error handling is either "close connection" or maybe in
some cases even "exit". If this is changed, then the client may
continue execution in some unforeseen state and behave unexpectedly.
We'll see.


Thanks, now I understand this.


ISTM that the retry implementation should be implemented somehow in
the automaton, restarting the same script from the beginning.


If there are several transactions in this script - don't you think 
that we should restart only the failed transaction?..


On some transaction failures based on their status. My point is that
the retry process must be implemented clearly with a new state in the
client automaton. Exactly when the transition to this new state must
be taken is another issue.


About it, I agree with you that it should be done in this way.

The number of retries and maybe failures should be counted, maybe 
with

some adjustable maximum, as suggested.


If we fix the maximum number of attempts, the maximum number of 
failures for one script execution is bounded above by 
(number_of_transactions_in_script * maximum_number_of_attempts). Do 
you think we should add an option to the program to limit this number 
further?


Probably not. I think that there should be a configurable maximum of
retries on a transaction, which may be 0 by default if we want to be
upward compatible with the current behavior, or maybe something else.


I propose the option --max-attempts-number=NUM, where NUM cannot be less 
than 1. I propose it because I think that, for example, 
--max-attempts-number=100 is better than --max-retries-number=99. And 
maybe it's better to set its default value to 1 too, because retrying 
shell commands can produce new errors...



In doLog, added columns should be at the end of the format.


I have inserted it earlier because these columns are not optional. Do 
you think they should be optional?


I think that new non-optional columns should be at the end of the
existing non-optional columns, so that existing scripts which
process the output do not need to be updated.


Thanks, I agree with you :)


I'm not sure that there should be an new option to report failures,
the information when relevant should be integrated in a clean format
into the existing reports... Maybe the "per command latency"
report/option should be renamed if it becomes more general.


I have tried not to change other parts of the program as much as possible. 
But if you think that it will be more useful to change the option, I'll 
do it.


I think that the option should change if its naming becomes less
relevant, which is to be determined. AFAICS, ISTM that new measures
should be added to the various existing reports unconditionally (i.e.
without a new option), so maybe no new option would be needed.


Thanks! I didn't think about it in this way..

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

2017-07-03 Thread Fabien COELHO


Hello Marina,


I agree that improving the error handling ability of pgbench is a good
thing, although I'm not sure about the implications...


Could you tell a little bit more exactly.. What implications are you worried 
about?


The current error handling is either "close connection" or maybe in some 
cases even "exit". If this is changed, then the client may continue 
execution in some unforeseen state and behave unexpectedly. We'll see.



ISTM that the retry implementation should be implemented somehow in
the automaton, restarting the same script from the beginning.


If there are several transactions in this script - don't you think that we 
should restart only the failed transaction?..


On some transaction failures based on their status. My point is that the 
retry process must be implemented clearly with a new state in the client 
automaton. Exactly when the transition to this new state must be taken is 
another issue.



The number of retries and maybe failures should be counted, maybe with
some adjustable maximum, as suggested.


If we fix the maximum number of attempts, the maximum number of failures for 
one script execution is bounded above by (number_of_transactions_in_script 
* maximum_number_of_attempts). Do you think we should add an option to the 
program to limit this number further?


Probably not. I think that there should be a configurable maximum of 
retries on a transaction, which may be 0 by default if we want to be 
upward compatible with the current behavior, or maybe something else.



In doLog, added columns should be at the end of the format.


I have inserted it earlier because these columns are not optional. Do you 
think they should be optional?


I think that new non-optional columns should be at the end of the 
existing non-optional columns, so that existing scripts which process 
the output do not need to be updated.
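
Concretely, in doLog that would mean appending the new counters after the
existing columns, along the lines of this sketch (the existing column list
is abbreviated and illustrative, and the retry/failure fields are
hypothetical):

  /* existing non-optional columns first ... */
  fprintf(logfile, "%d " INT64_FORMAT " %.0f %d %ld %ld",
          st->id, st->cnt, latency, st->use_file,
          (long) tv.tv_sec, (long) tv.tv_usec);

  /* ... then the new counters, so existing cut/awk field positions keep working */
  fprintf(logfile, " " INT64_FORMAT " " INT64_FORMAT,
          st->retries, st->failures);
  fputc('\n', logfile);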



I'm not sure that there should be an new option to report failures,
the information when relevant should be integrated in a clean format
into the existing reports... Maybe the "per command latency"
report/option should be renamed if it becomes more general.


I have tried not to change other parts of the program as much as possible. But if 
you think that it will be more useful to change the option, I'll do it.


I think that the option should change if its naming becomes less relevant, 
which is to be determined. AFAICS, ISTM that new measures should be added 
to the various existing reports unconditionally (i.e. without a new 
option), so maybe no new option would be needed.


--
Fabien.




Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

2017-07-03 Thread Marina Polyakova

Hello Marina,


Hello, Fabien!


A few comments about the submitted patches.


Thank you very much for them!


I agree that improving the error handling ability of pgbench is a good
thing, although I'm not sure about the implications...


Could you tell a little bit more exactly.. What implications are you 
worried about?



About the "retry" discussion: I agree that retry is the relevant
option from an application point of view.


I'm glad to hear it!


ISTM that the retry implementation should be implemented somehow in
the automaton, restarting the same script from the beginning.


If there are several transactions in this script - don't you think that 
we should restart only the failed transaction?..



As pointed out in the discussion, the same values/commands should be
executed, which suggests that randomly generated values should be the
same on the retry runs, so that for a simple script the same
operations are attempted. This means that the random generator state
must be kept & reinstated for a client on retries. Currently the
random state is in the thread, which is not convenient for this
purpose, so it should be moved into the client so that it can be saved
at transaction start and reinstated on retries.


I think about it in the same way =)


The number of retries and maybe failures should be counted, maybe with
some adjustable maximum, as suggested.


If we fix the maximum number of attempts, the maximum number of failures 
for one script execution is bounded above by 
(number_of_transactions_in_script * maximum_number_of_attempts). Do you 
think we should add a program option to limit this number even further?



About 0001:

In accumStats, just use one level of if; the two levels bring nothing.


Thanks, I agree =[


In doLog, added columns should be at the end of the format.


I inserted them earlier because these columns are not optional. Do 
you think they should be optional?



The number
of columns MUST NOT change when different issues arise, so that the output
works well with cut/... unix commands; inserting a sentence such as
"serialization and deadlock failures" is a bad idea.


Thanks, I agree again.


threadRun: the point of the progress format is to fit on one not too
wide line on a terminal and to allow some simple automatic processing.
Adding a verbose sentence in the middle of it is not the way to go.


I was thinking about it. Thanks, I'll try to make it shorter.


About tests: I do not understand why test 003 includes 2 transactions.
It would seem more logical to have two scripts.


Ok!


About 0003:

I'm not sure that there should be a new option to report failures;
the information, when relevant, should be integrated in a clean format
into the existing reports... Maybe the "per command latency"
report/option should be renamed if it becomes more general.


I have tried not to change other parts of the program as much as possible. 
But if you think that it would be more useful to change the option, I'll 
do it.



About 0004:

The documentation must not be in a separate patch, but in the same
patch as the corresponding code.


Ok!

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

2017-07-01 Thread Fabien COELHO


Hello Marina,

A few comments about the submitted patches.

I agree that improving the error handling ability of pgbench is a good 
thing, although I'm not sure about the implications...


About the "retry" discussion: I agree that retry is the relevant option 
from an application point of view.


ISTM that the retry logic should be implemented somehow in the 
automaton, restarting the same script from the beginning.


As pointed out in the discussion, the same values/commands should be 
executed, which suggests that randomly generated values should be the same 
on the retry runs, so that for a simple script the same operations are 
attempted. This means that the random generator state must be kept & 
reinstated for a client on retries. Currently the random state is in the 
thread, which is not convenient for this purpose, so it should be moved into 
the client so that it can be saved at transaction start and reinstated on 
retries.
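A minimal sketch of what that could look like, assuming the per-client 
random state is an erand48-style seed; all names here are invented for 
illustration and are not the patch's actual definitions:

    /* per-client random state, saved at transaction start and
     * restored before a retry so the same values are generated */
    typedef struct RandomState
    {
        unsigned short xseed[3];        /* erand48-style seed */
    } RandomState;

    typedef struct CState
    {
        RandomState random_state;       /* advanced by each random draw */
        RandomState retry_state;        /* snapshot taken at transaction start */
        /* ... other per-client fields ... */
    } CState;

    static void
    transactionStart(CState *st)
    {
        st->retry_state = st->random_state;     /* struct copy of the seed */
    }

    static void
    transactionRetry(CState *st)
    {
        st->random_state = st->retry_state;     /* replay the same random values */
    }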


The number of retries and maybe failures should be counted, maybe with 
some adjustable maximum, as suggested.


About 0001:

In accumStats, just use one level of if; the two levels bring nothing.
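A hedged before/after illustration (identifiers invented, not the patch's 
actual code) of what is meant:

    /* instead of nesting: */
    if (failed)
    {
        if (failure == SERIALIZATION_FAILURE)
            stats->serialization_failures++;
        else if (failure == DEADLOCK_FAILURE)
            stats->deadlock_failures++;
    }

    /* a single level is enough: */
    if (failure == SERIALIZATION_FAILURE)
        stats->serialization_failures++;
    else if (failure == DEADLOCK_FAILURE)
        stats->deadlock_failures++;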

In doLog, added columns should be at the end of the format. The number of 
columns MUST NOT change when different issues arise, so that the output works 
well with cut/... unix commands; inserting a sentence such as "serialization 
and deadlock failures" is a bad idea.


threadRun: the point of the progress format is to fit on one not too wide 
line on a terminal and to allow some simple automatic processing. Adding a 
verbose sentence in the middle of it is not the way to go.
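Something compact along these lines would fit (a hedged sketch only, not the 
patch's actual output): keep the usual progress line and append short 
counters at the end, e.g.

    progress: 10.0 s, 1052.3 tps, lat 9.481 ms stddev 3.214, failed 12, retried 34

rather than a full sentence in the middle of the line.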


About tests: I do not understand why test 003 includes 2 transactions. 
It would seem more logical to have two scripts.


About 0003:

I'm not sure that there should be a new option to report failures; the 
information, when relevant, should be integrated in a clean format into the 
existing reports... Maybe the "per command latency" report/option should 
be renamed if it becomes more general.


About 0004:

The documentation must not be in a separate patch, but in the same patch 
as the corresponding code.


--
Fabien.




Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

2017-06-17 Thread Marina Polyakova
> To be clear, part of "retrying from the beginning" means that if a
> result from one statement is used to determine the content (or
> whether to run) a subsequent statement, that first statement must be
> run in the new transaction and the results evaluated again to
> determine what to use for the later statement.  You can't simply
> replay the statements that were run during the first try.  For
> examples, to help get a feel of why that is, see:
>
> https://wiki.postgresql.org/wiki/SSI

Thank you again! :))

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

2017-06-16 Thread Kevin Grittner
On Fri, Jun 16, 2017 at 5:31 AM, Marina Polyakova
 wrote:

> And thank you very much for your explanation how and why transactions with
> failures should be retried! I'll try to implement all of it.

To be clear, part of "retrying from the beginning" means that if a
result from one statement is used to determine the content (or
whether to run) a subsequent statement, that first statement must be
run in the new transaction and the results evaluated again to
determine what to use for the later statement.  You can't simply
replay the statements that were run during the first try.  For
examples, to help get a feel of why that is, see:

https://wiki.postgresql.org/wiki/SSI

--
Kevin Grittner
VMware vCenter Server
https://www.vmware.com/




Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

2017-06-16 Thread Marina Polyakova

>> P.S. Does this use case (do not retry transaction with serialization or
>> deadlock failure) is most interesting or failed transactions should be
>> retried (and how much times if there seems to be no hope of success...)?
>
> I can't quite parse that sentence, could you restate?

The way I read it was that the most interesting solution would retry
a transaction from the beginning on a serialization failure or
deadlock failure.


As far as I understand her proposal, it is exactly the opposite -- if a
transaction fails, it is discarded.  And this P.S. note is asking
whether this is a good idea, or would we prefer that failing
transactions are retried.


Yes, I meant this, thank you!


I think it's pretty obvious that transactions that failed with
some serializability problem should be retried.


Thank you for your vote :)

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

2017-06-16 Thread Marina Polyakova
P.S. Does this use case (do not retry transaction with serialization or
deadlock failure) is most interesting or failed transactions should be
retried (and how much times if there seems to be no hope of success...)?


I can't quite parse that sentence, could you restate?


The way I read it was that the most interesting solution would retry
a transaction from the beginning on a serialization failure or
deadlock failure.  Most people who use serializable transactions (at
least in my experience) run through a framework that does that
automatically, regardless of what client code initiated the
transaction.  These retries are generally hidden from the client
code -- it just looks like the transaction took a bit longer.
Sometimes people will have a limit on the number of retries.  I
never used such a limit and never had a problem, because our
implementation of serializable transactions will not throw a
serialization failure error until one of the transactions involved
in causing it has successfully committed -- meaning that the retry
can only hit this again on a *new* set of transactions.

Essentially, the transaction should only count toward the TPS rate
when it eventually completes without a serialization failure.

Marina, did I understand you correctly?


Álvaro Herrera, in the next message of this thread, understood my text 
correctly:



As far as I understand her proposal, it is exactly the opposite -- if a
transaction fails, it is discarded.  And this P.S. note is asking
whether this is a good idea, or would we prefer that failing
transactions are retried.


And thank you very much for your explanation how and why transactions 
with failures should be retried! I'll try to implement all of it.


--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

2017-06-16 Thread Marina Polyakova

Hi,


Hello!


I think that's a good idea and sorely needed.


Thanks, I'm very glad to hear it!

- if there were these failures during script execution this "transaction" is
marked appropriately in logs;
- numbers of "transactions" with these failures are printed in progress, in
aggregation logs and in the end with other results (all and for each
script);


I guess that'll include a "rolled-back %" or "retried %" somewhere?


Not exactly, see documentation:

+   If transaction has serialization / deadlock failure or them both (last thing
+   is possible if used script contains several transactions; see
+    for more information), its
+   time will be reported as serialization failure /
+   deadlock failure /
+   serialization and deadlock failures appropriately.

+   Example with serialization, deadlock and both these failures:
+
+1 128 24968 0 1496759158 426984
+0 129 serialization failure 0 1496759158 427023
+3 129 serialization failure 0 1496759158 432662
+2 128 serialization failure 0 1496759158 432765
+0 130 deadlock failure 0 1496759159 460070
+1 129 serialization failure 0 1496759160 485188
+2 129 serialization and deadlock failures 0 1496759160 485339
+4 130 serialization failure 0 1496759160 485465
+

From the proposals in later messages of this thread I have understood that 
the most interesting case is to retry the failed transaction. Do you think 
it's better to write, for example, "rolled-back after % retries 
(serialization failure)" or "time (retried % times, serialization and 
deadlock failures)"?



Advanced options:
- mostly for testing built-in scripts: you can set the default transaction
isolation level by the appropriate benchmarking option (-I);


I'm less convinced of the need of that; you can already set arbitrary
connection options with
PGOPTIONS='-c default_transaction_isolation=serializable' pgbench


Oh, thanks, I forgot about it =[

P.S. Does this use case (do not retry transaction with serialization or
deadlock failure) is most interesting or failed transactions should be
retried (and how much times if there seems to be no hope of success...)?



I can't quite parse that sentence, could you restate?


Álvaro Herrera later in this thread understood my text correctly:


As far as I understand her proposal, it is exactly the opposite -- if a
transaction fails, it is discarded.  And this P.S. note is asking
whether this is a good idea, or would we prefer that failing
transactions are retried.


With his explanation, has my text become clearer?

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

2017-06-15 Thread Kevin Grittner
On Thu, Jun 15, 2017 at 4:18 PM, Alvaro Herrera
 wrote:
> Kevin Grittner wrote:

> As far as I understand her proposal, it is exactly the opposite -- if a
> transaction fails, it is discarded.  And this P.S. note is asking
> whether this is a good idea, or would we prefer that failing
> transactions are retried.
>
> I think it's pretty obvious that transactions that failed with
> some serializability problem should be retried.

Agreed all around.

-- 
Kevin Grittner
VMware vCenter Server
https://www.vmware.com/




Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

2017-06-15 Thread Thomas Munro
On Fri, Jun 16, 2017 at 9:18 AM, Alvaro Herrera
 wrote:
> Kevin Grittner wrote:
>> On Thu, Jun 15, 2017 at 2:16 PM, Andres Freund  wrote:
>> > On 2017-06-14 11:48:25 +0300, Marina Polyakova wrote:
>
>> >> P.S. Does this use case (do not retry transaction with serialization or
>> >> deadlock failure) is most interesting or failed transactions should be
>> >> retried (and how much times if there seems to be no hope of success...)?
>> >
>> > I can't quite parse that sentence, could you restate?
>>
>> The way I read it was that the most interesting solution would retry
>> a transaction from the beginning on a serialization failure or
>> deadlock failure.
>
> As far as I understand her proposal, it is exactly the opposite -- if a
> transaction fails, it is discarded.  And this P.S. note is asking
> whether this is a good idea, or would we prefer that failing
> transactions are retried.
>
> I think it's pretty obvious that transactions that failed with
> some serializability problem should be retried.

+1 for retry with reporting of retry rates

-- 
Thomas Munro
http://www.enterprisedb.com




Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

2017-06-15 Thread Alvaro Herrera
Kevin Grittner wrote:
> On Thu, Jun 15, 2017 at 2:16 PM, Andres Freund  wrote:
> > On 2017-06-14 11:48:25 +0300, Marina Polyakova wrote:

> >> P.S. Does this use case (do not retry transaction with serialization or
> >> deadlock failure) is most interesting or failed transactions should be
> >> retried (and how much times if there seems to be no hope of success...)?
> >
> > I can't quite parse that sentence, could you restate?
> 
> The way I read it was that the most interesting solution would retry
> a transaction from the beginning on a serialization failure or
> deadlock failure.

As far as I understand her proposal, it is exactly the opposite -- if a
transaction fails, it is discarded.  And this P.S. note is asking
whether this is a good idea, or would we prefer that failing
transactions are retried.

I think it's pretty obvious that transactions that failed with
some serializability problem should be retried.

-- 
Álvaro Herrera    https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

2017-06-15 Thread Kevin Grittner
On Thu, Jun 15, 2017 at 2:16 PM, Andres Freund  wrote:
> On 2017-06-14 11:48:25 +0300, Marina Polyakova wrote:

>> I suggest a patch where pgbench client sessions are not disconnected because
>> of serialization or deadlock failures and these failures are mentioned in
>> reports.
>
> I think that's a good idea and sorely needed.

+1

>> P.S. Does this use case (do not retry transaction with serialization or
>> deadlock failure) is most interesting or failed transactions should be
>> retried (and how much times if there seems to be no hope of success...)?
>
> I can't quite parse that sentence, could you restate?

The way I read it was that the most interesting solution would retry
a transaction from the beginning on a serialization failure or
deadlock failure.  Most people who use serializable transactions (at
least in my experience) run through a framework that does that
automatically, regardless of what client code initiated the
transaction.  These retries are generally hidden from the client
code -- it just looks like the transaction took a bit longer.
Sometimes people will have a limit on the number of retries.  I
never used such a limit and never had a problem, because our
implementation of serializable transactions will not throw a
serialization failure error until one of the transactions involved
in causing it has successfully committed -- meaning that the retry
can only hit this again on a *new* set of transactions.

Essentially, the transaction should only count toward the TPS rate
when it eventually completes without a serialization failure.
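For illustration only, a rough libpq-style sketch of such a retry loop (an
application-side example, not pgbench code; the table and values are just
the usual pgbench_accounts example): the whole transaction is re-run from
the beginning, the first statement's result is re-evaluated on each
attempt, and only a successful commit is counted.

    #include <stdbool.h>
    #include <string.h>
    #include <libpq-fe.h>

    /* serialization_failure (40001) and deadlock_detected (40P01) are retryable */
    static bool
    is_retryable(const PGresult *res)
    {
        const char *sqlstate = PQresultErrorField(res, PG_DIAG_SQLSTATE);

        return sqlstate != NULL &&
            (strcmp(sqlstate, "40001") == 0 || strcmp(sqlstate, "40P01") == 0);
    }

    static void
    run_with_retries(PGconn *conn, int max_tries)
    {
        for (int attempt = 0; attempt < max_tries; attempt++)
        {
            PGresult *res;
            bool      retry;

            PQclear(PQexec(conn, "BEGIN ISOLATION LEVEL SERIALIZABLE"));

            /* first statement: its result drives what the next statement does,
             * so it must be executed and inspected again on every attempt */
            res = PQexec(conn, "SELECT abalance FROM pgbench_accounts WHERE aid = 1");
            if (PQresultStatus(res) != PGRES_TUPLES_OK)
            {
                retry = is_retryable(res);
                PQclear(res);
                PQclear(PQexec(conn, "ROLLBACK"));
                if (retry)
                    continue;           /* retry the whole transaction */
                break;                  /* give up on non-retryable errors */
            }
            /* else: build and run the follow-up statement from PQgetvalue(res, 0, 0) */
            PQclear(res);

            res = PQexec(conn, "COMMIT");
            if (PQresultStatus(res) == PGRES_COMMAND_OK)
            {
                PQclear(res);
                return;                 /* only now does it count toward TPS */
            }
            retry = is_retryable(res);
            PQclear(res);
            PQclear(PQexec(conn, "ROLLBACK"));  /* harmless if already ended */
            if (!retry)
                break;
        }
    }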

Marina, did I understand you correctly?

-- 
Kevin Grittner
VMware vCenter Server
https://www.vmware.com/




Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

2017-06-15 Thread Andres Freund
Hi,

On 2017-06-14 11:48:25 +0300, Marina Polyakova wrote:
> Now in pgbench we can test only transactions with Read Committed isolation
> level because client sessions are disconnected forever on serialization
> failures. There were some proposals and discussions about it (see message
> here [1] and thread here [2]).

> I suggest a patch where pgbench client sessions are not disconnected because
> of serialization or deadlock failures and these failures are mentioned in
> reports.

I think that's a good idea and sorely needed.


In details:


> - if there were these failures during script execution this "transaction" is
> marked
> appropriately in logs;
> - numbers of "transactions" with these failures are printed in progress, in
> aggregation logs and in the end with other results (all and for each
> script);

I guess that'll include a "rolled-back %" or "retried %" somewhere?


> Advanced options:
> - mostly for testing built-in scripts: you can set the default transaction
> isolation level by the appropriate benchmarking option (-I);

I'm less convinced of the need of that; you can already set arbitrary
connection options with
PGOPTIONS='-c default_transaction_isolation=serializable' pgbench


> P.S. Does this use case (do not retry transaction with serialization or
> deadlock failure) is most interesting or failed transactions should be
> retried (and how much times if there seems to be no hope of success...)?

I can't quite parse that sentence, could you restate?

- Andres




Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

2017-06-15 Thread Marina Polyakova

Sounds like a good idea.


Thank you!


Please add to the next CommitFest


Done: https://commitfest.postgresql.org/14/1170/


and review
somebody else's patch in exchange for having your own patch reviewed.


Of course, I remember about it.

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors

2017-06-15 Thread Robert Haas
On Wed, Jun 14, 2017 at 4:48 AM, Marina Polyakova
 wrote:
> Now in pgbench we can test only transactions with Read Committed isolation
> level because client sessions are disconnected forever on serialization
> failures. There were some proposals and discussions about it (see message
> here [1] and thread here [2]).
>
> I suggest a patch where pgbench client sessions are not disconnected because
> of serialization or deadlock failures and these failures are mentioned in
> reports. In details:
> - a transaction with one of these failures continues to run normally, but its
> result is rolled back;
> - if there were these failures during script execution this "transaction" is
> marked
> appropriately in logs;
> - numbers of "transactions" with these failures are printed in progress, in
> aggregation logs and in the end with other results (all and for each
> script);
>
> Advanced options:
> - mostly for testing built-in scripts: you can set the default transaction
> isolation level by the appropriate benchmarking option (-I);
> - for more detailed reports: to know per-statement serialization and
> deadlock failures you can use the appropriate benchmarking option
> (--report-failures).
>
> Also: TAP tests for new functionality and changed documentation with new
> examples.
>
> Patches are attached. Any suggestions are welcome!

Sounds like a good idea.  Please add to the next CommitFest and review
somebody else's patch in exchange for having your own patch reviewed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

