Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
> Hello, Here is the third version of the patch for pgbench, thanks to Fabien Coelho's comments. As in the previous one, transactions with serialization and deadlock failures are rolled back and retried until they end successfully or their number of tries reaches the maximum.

Here is some partial review.

Patch applies cleanly. It compiles with warnings, please fix them:

  pgbench.c:2624:28: warning: ‘failure_status’ may be used uninitialized in this function
  pgbench.c:2697:34: warning: ‘command’ may be used uninitialized in this function

I do not think that the error handling feature needs preeminence in the final report, compared to scale, number of clients and so on. The number of tries should be put further on.

I would spell "number of tries" instead of "tries number", which seems to suggest that each try is attributed a number. "sql" -> "SQL".

For the per-statement latency final report, I do not think it is worth distinguishing the kind of retry at this level, because ISTM that serialization & deadlocks are unlikely to appear simultaneously. I would just report total failures and total tries on this report. We only have 2 errors now, but if more are added I'm pretty sure that we would not want to have more columns... Moreover the 25-character alignment is ugly; better use a much smaller alignment.

I'm okay with having details shown in the "log to file" group report.

The documentation does not seem consistent. It discusses "the very last fields" and seems to suggest that there are two, but the example trace below just adds one field. If you want a paragraph you should add one explicitly; skipping a line does not work (around "All values are computed for ...").

I do not understand the second note of the --max-tries documentation. It seems to suggest that some scripts may not end their own transaction... which should be an error in my opinion? Some explanations would be welcome.

I'm not sure that "Retries" deserves a type of its own for two counters. The "retries" in RetriesState may be redundant with these. The failures are counted with simple counters while retries have a type; this is not consistent. I suggest to just use simple counters everywhere.

I'm ok with having the detail report tell about failures & retries only when some occurred.

typo: sucessufully -> successfully

If a native English speaker could provide an opinion on that, and more generally review the whole documentation, it would be great.

I think that the rand functions should really take a random_state pointer argument, not a Thread or Client.

I'm at odds that FailureStatus does not have a clean NO_FAILURE state, and that it is merged with misc failures.

I'm not sure that initRetries, mergeRetries, getAllRetries really deserve a function. I do not think that there should be two accum functions. Just extend the existing one; adding zero to zero is not a problem.

I guess that in the end pgbench & psql variables will have to be merged if the pgbench expression engine is to be used by psql as well, but this is not linked to this patch.

The TAP tests seem over-complicated and heavy, with two pgbench runs in parallel... I'm not sure we really want all that complexity for this somehow small feature. Moreover pgbench can run several scripts; I'm not sure why two pgbench instances would need to be invoked. Could something much simpler and lighter be proposed instead to test the feature?

The added code does not conform to Pg C style. For instance, the brace of an if should be aligned with the if. Please conform to the project style.

The is_transaction_block_end seems simplistic. ISTM that it would not work with compound commands. It should be clearly documented somewhere.

Also find attached two scripts I used for some testing:

  psql < dl_init.sql
  pgbench -f dl_trans.sql -c 8 -T 10 -P 1

-- Fabien.
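For reference, the PostgreSQL C style Fabien refers to puts the function return type on its own line and opening braces on their own line, aligned with the controlling keyword. A minimal illustration (the function and its names are invented for the example, not from the patch):

```c
#include <stdbool.h>

/*
 * Hypothetical helper illustrating PostgreSQL brace style: the return
 * type goes on its own line, and braces are aligned with the "if".
 */
static int
clamp_tries(int tries, int max_tries)
{
	if (tries > max_tries)
	{
		tries = max_tries;
	}
	return tries;
}
```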
dl_init.sql (Description: application/sql)
dl_trans.sql (Description: application/sql)

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
> Hi,

Hello!

> Just had a need for this feature, and took this to a short test drive. So some comments:
> - it'd be useful to display a retry percentage of all transactions, similar to what's displayed for failed transactions.
> - it'd be useful to also conveniently display the number of retried transactions, rather than the total number of retries.

Ok!

> - it appears that we now unconditionally do not disregard a connection after a serialization / deadlock failure. Good. But that's useful far beyond just deadlocks / serialization errors, and should probably be exposed.

I agree that it will be useful. But how do you propose to print the results if there are many types of errors? I'm afraid that the progress report can be very long, although it is expected to be rather short [1]. The per-statement report can also be very long...

> Nice feature!

Thanks, and thank you for your comments :)

[1] https://www.postgresql.org/message-id/alpine.DEB.2.20.1707121142300.12795%40lancre

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
On Fri, Aug 11, 2017 at 10:50 PM, Andres Freund wrote:
> On 2017-07-21 19:32:02 +0300, Marina Polyakova wrote:
> > Here is the third version of the patch for pgbench thanks to Fabien Coelho
> > comments. As in the previous one, transactions with serialization and
> > deadlock failures are rolled back and retried until they end successfully or
> > their number of tries reaches maximum.
>
> Just had a need for this feature, and took this to a short test
> drive. So some comments:
> - it'd be useful to display a retry percentage of all transactions,
>   similar to what's displayed for failed transactions.
> - it appears that we now unconditionally do not disregard a connection
>   after a serialization / deadlock failure. Good. But that's useful far
>   beyond just deadlocks / serialization errors, and should probably be
>   exposed.

Yes, it would be nice not to disregard a connection after other errors too. However, I'm not sure if we should retry the *same* transaction on errors beyond deadlocks / serialization errors. For example, in case of a division by zero or unique violation error it would be more natural to give up on the current transaction and continue with the next one.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
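The distinction Alexander draws maps directly onto the SQLSTATE the server returns: serialization failures are class-40 code 40001 and deadlocks are 40P01, while errors such as unique violation (23505) or division by zero (22012) would simply fail the transaction. A rough sketch of such a check (the function name is mine, not from the patch):

```c
#include <string.h>
#include <stdbool.h>

/*
 * Sketch: decide whether an error is worth retrying the same transaction
 * for.  SQLSTATE 40001 is serialization_failure and 40P01 is
 * deadlock_detected; for anything else (e.g. 23505 unique_violation,
 * 22012 division_by_zero) we give up on the current transaction.
 */
static bool
error_is_retryable(const char *sqlstate)
{
	if (sqlstate == NULL)
		return false;
	return strcmp(sqlstate, "40001") == 0 ||	/* serialization failure */
		   strcmp(sqlstate, "40P01") == 0;		/* deadlock detected */
}
```

In a real client the SQLSTATE would come from `PQresultErrorField(res, PG_DIAG_SQLSTATE)`.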
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
Hi,

On 2017-07-21 19:32:02 +0300, Marina Polyakova wrote:
> Here is the third version of the patch for pgbench thanks to Fabien Coelho
> comments. As in the previous one, transactions with serialization and
> deadlock failures are rolled back and retried until they end successfully or
> their number of tries reaches maximum.

Just had a need for this feature, and took this to a short test drive. So some comments:

- it'd be useful to display a retry percentage of all transactions, similar to what's displayed for failed transactions.
- it appears that we now unconditionally do not disregard a connection after a serialization / deadlock failure. Good. But that's useful far beyond just deadlocks / serialization errors, and should probably be exposed.
- it'd be useful to also conveniently display the number of retried transactions, rather than the total number of retries.

Nice feature!

- Andres
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
Hello again!

Here is the third version of the patch for pgbench, thanks to Fabien Coelho's comments. As in the previous one, transactions with serialization and deadlock failures are rolled back and retried until they end successfully or their number of tries reaches the maximum.

Differences from the previous version:

* Some code cleanup :) In particular, the Variables structure for managing client variables, and only one new TAP tests file (as they were recommended here [1] and here [2]).
* There's no error if the last transaction in the script is not completed. But the transactions started in the previous scripts and/or not ending in the current script are not rolled back and retried after a failure. Such a script try is reported as failed because it contains a failure that was not rolled back and retried.
* Usually the retries and/or failures are printed if they are not equal to zero. In transaction/aggregation logs the failures are always printed, and the retries are printed if max_tries is greater than 1. This is done for the general format of the log during the execution of the program.

Patch is attached. Any suggestions are welcome!

[1] https://www.postgresql.org/message-id/alpine.DEB.2.20.1707121338090.12795%40lancre
[2] https://www.postgresql.org/message-id/alpine.DEB.2.20.1707121142300.12795%40lancre

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

From 0ee372e93b8d7017563d2f2f55357c39c08a Mon Sep 17 00:00:00 2001
From: Marina Polyakova
Date: Fri, 21 Jul 2017 17:57:58 +0300
Subject: [PATCH v3] Pgbench Retry transactions with serialization or deadlock errors

Now transactions with serialization or deadlock failures can be rolled back and retried again and again until they end successfully or their number of tries reaches the maximum. You can set the maximum number of tries by using the appropriate benchmarking option (--max-tries). The default value is 1.
If there are retries and/or failures, their statistics are printed in the progress report, in the transaction/aggregation logs, and at the end with the other results (all and for each script). A transaction failure is reported here only if the last try of this transaction fails. Also retries and/or failures are printed per-command with average latencies if you use the appropriate benchmarking option (--report-per-command, -r) and the total number of retries and/or failures is not zero.

Note that the transactions started in the previous scripts and/or not ending in the current script are not rolled back and retried after a failure. Such a script try is reported as failed because it contains a failure that was not rolled back and retried.
---
 doc/src/sgml/ref/pgbench.sgml                      | 240 +-
 src/bin/pgbench/pgbench.c                          | 872 +
 .../t/002_serialization_and_deadlock_failures.pl   | 459 +++
 3 files changed, 1412 insertions(+), 159 deletions(-)
 create mode 100644 src/bin/pgbench/t/002_serialization_and_deadlock_failures.pl

diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 64b043b..3bbeec5 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -49,6 +49,7 @@
 transaction type: builtin: TPC-B (sort of)
+transaction maximum tries number: 1
 scaling factor: 10
 query mode: simple
 number of clients: 10
@@ -59,7 +60,7 @@
 tps = 85.184871 (including connections establishing)
 tps = 85.296346 (excluding connections establishing)
-   The first six lines report some of the most important parameter
+   The first seven lines report some of the most important parameter
    settings. The next line reports the number of transactions completed
    and intended (the latter being just the product of number of clients and
    number of transactions per client); these will be equal unless the run
@@ -436,22 +437,33 @@ pgbench options dbname
    Show progress report every sec seconds.
    The report includes the time since the beginning of the run, the tps
    since the last report, and the transaction latency average and standard
-   deviation since the last report. Under throttling (-R),
-   the latency is computed with respect to the transaction scheduled
-   start time, not the actual transaction beginning time, thus it also
-   includes the average schedule lag time.
+   deviation since the last report. If since the last report there are
+   transactions that ended with serialization/deadlock failures they are
+   also reported here as failed (see
+    for more information). Under
+   throttling (-R), the latency is computed with respect to the
+   transaction scheduled start time, not the actual transaction beginning
+   time, thus it also includes the average schedule lag time. If since the
+   last report there're transactions that have been rolled back and retried
+
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
> Well, the short version may be to only do a full transaction retry and to document that for now savepoints are not handled, and to let that for future work if need arises.

I agree with you.

> For progress the output must be short and readable, and probably we do not care about whether retries came from this or that, so I would let that out. For log and aggregated log possibly that would make more sense, but it must stay easy to parse.

Ok!

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
> Ok, fine. My point was just to check before proceeding.

And I'm very grateful for that :)

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
> And I'm not sure that we should do all the stuff for savepoint rollbacks because:
> - as I see it now it only makes sense for the deadlock failures;
> - if there's a failure, what savepoint should we rollback to and start the execution again?

ISTM that this is the point of having a savepoint in the first place: the ability to restart the transaction at that point if something failed?

> Maybe go to the last one; if it is not successful go to the previous one, etc. Retrying the entire transaction may take less time...

Well, I do not know that. My 0.02 € is that if there was a savepoint then this is naturally the restarting point of a transaction which has some recoverable error.

Well, the short version may be to only do a full transaction retry and to document that for now savepoints are not handled, and to leave that for future work if the need arises.

>> Maybe something like:
>> ...
>> number of failures: 12 (0.004%)
>> number of retries: 64 (deadlocks: 29, serialization: 35)
>
> Ok! How do you like the idea to use the same format (the total number of transactions with failures and the number of retries for each failure type) in other places (log, aggregation log, progress) if the values are not "default" (= no failures and no retries)?

For progress the output must be short and readable, and probably we do not care about whether retries came from this or that, so I would leave that out. For log and aggregated log possibly that would make more sense, but it must stay easy to parse.

-- Fabien.
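Fabien's proposed summary lines could be produced from plain counters along these lines (a sketch; the function and parameter names are illustrative, not from the patch):

```c
#include <stdio.h>
#include <string.h>
#include <stdint.h>

/*
 * Sketch: format the failure/retry summary proposed in this subthread
 * from simple counters.  Writes the two report lines into buf.
 */
static char *
format_retry_report(char *buf, size_t buflen,
					int64_t failures, int64_t total_xacts,
					int64_t deadlock_retries, int64_t serialization_retries)
{
	snprintf(buf, buflen,
			 "number of failures: %lld (%.3f%%)\n"
			 "number of retries: %lld (deadlocks: %lld, serialization: %lld)",
			 (long long) failures,
			 total_xacts > 0 ? 100.0 * failures / total_xacts : 0.0,
			 (long long) (deadlock_retries + serialization_retries),
			 (long long) deadlock_retries,
			 (long long) serialization_retries);
	return buf;
}
```

With 12 failures out of 300000 transactions and 29 + 35 retries this reproduces the exact lines quoted above.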
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
>> If the answer is no, then implement something in pgbench directly.
>
> The structure of variables is different, the container structure of the variables is different, so I think that the answer is no.

Ok, fine. My point was just to check before proceeding.

-- Fabien.
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
> Given that the number of variables of a pgbench script is expected to be pretty small, I'm not sure that the sorting stuff is worth the effort.

I think it is a good insurance if there are many variables...

> My suggestion is really to look at both implementations and to answer the question "should pgbench share its variable implementation with psql?". If the answer is yes, then the relevant part of the implementation should be moved to fe_utils, and that's it. If the answer is no, then implement something in pgbench directly.

The structure of variables is different, the container structure of the variables is different, so I think that the answer is no.

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
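For context, the sorted-array approach discussed here keeps variables ordered by name so a lookup can use binary search, whereas psql's variable.c walks a simple list. A stripped-down sketch of the sorted-array side (all types and names here are invented for illustration, not taken from either program):

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical variable entry, sketching the pgbench-style container. */
typedef struct
{
	const char *name;
	const char *value;
} Variable;

static int
compare_variable_names(const void *a, const void *b)
{
	return strcmp(((const Variable *) a)->name,
				  ((const Variable *) b)->name);
}

/* Look up a variable by name; assumes vars is already sorted by name. */
static const char *
lookup_variable(const Variable *vars, size_t nvars, const char *name)
{
	Variable	key = {name, NULL};
	Variable   *found;

	found = bsearch(&key, vars, nvars, sizeof(Variable),
					compare_variable_names);
	return found ? found->value : NULL;
}
```

Keeping a "sorted" flag, as the patch does, lets insertions defer the qsort until the next lookup.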
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
>>> Not necessarily? It depends on where the locks triggering the issue are set; if they are all set after the savepoint it could work on a second attempt.
>>
>> Don't you mean the deadlock failures, where a rollback to savepoint can really help?
>
> Yes, I mean deadlock failures can rollback to a savepoint and work on a second attempt.

>> And could you, please, give an example where a rollback to savepoint can help to end its subtransaction successfully after a serialization failure?
>
> I do not know whether this is possible with serialization failures. It might be if the stuff before and after the savepoint are somehow unrelated...

If you mean, for example, the updates of different tables, a rollback to savepoint doesn't help.

And I'm not sure that we should do all the stuff for savepoint rollbacks because:
- as I see it now it only makes sense for the deadlock failures;
- if there's a failure, what savepoint should we rollback to and start the execution again? Maybe go to the last one; if it is not successful go to the previous one, etc. Retrying the entire transaction may take less time...

>> [...] I mean that the sum of transactions with serialization failure and transactions with deadlock failure can be greater than the total sum of transactions with failures.
>
> Hmmm. Ok. A "failure" is a transaction (in the sense of pgbench) that could not make it to the end, even after retries. If there is a rollback and then a retry which works, it is not a failure. Now deadlock or serialization errors, which trigger retries, are worth counting as well, although they are not "failures". So my format proposal was over optimistic, and the number of deadlocks and serializations should better be on a retry count line. Maybe something like:
>
> ...
> number of failures: 12 (0.004%)
> number of retries: 64 (deadlocks: 29, serialization: 35)

Ok! How do you like the idea to use the same format (the total number of transactions with failures and the number of retries for each failure type) in other places (log, aggregation log, progress) if the values are not "default" (= no failures and no retries)?

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
>> Note that there is something for psql (src/bin/psql/variable.c) which may or may not be shared. It should be checked before recoding eventually the same thing.
>
> Thank you very much for pointing this file! As I checked, this is another structure: here there's a simple list, while in pgbench we should know if the list is sorted and the number of elements in the list. How do you think, is it a good idea to name a variables structure in pgbench in the same way (VariableSpace), or should it be different so as not to be confused (Variables, for example)?

Given that the number of variables of a pgbench script is expected to be pretty small, I'm not sure that the sorting stuff is worth the effort.

My suggestion is really to look at both implementations and to answer the question "should pgbench share its variable implementation with psql?". If the answer is yes, then the relevant part of the implementation should be moved to fe_utils, and that's it. If the answer is no, then implement something in pgbench directly.

-- Fabien.
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
Hello Marina,

>> Not necessarily? It depends on where the locks triggering the issue are set; if they are all set after the savepoint it could work on a second attempt.
>
> Don't you mean the deadlock failures, where a rollback to savepoint can really help?

Yes, I mean deadlock failures can rollback to a savepoint and work on a second attempt.

> And could you, please, give an example where a rollback to savepoint can help to end its subtransaction successfully after a serialization failure?

I do not know whether this is possible with serialization failures. It might be if the stuff before and after the savepoint are somehow unrelated...

> [...] I mean that the sum of transactions with serialization failure and transactions with deadlock failure can be greater than the total sum of transactions with failures.

Hmmm. Ok. A "failure" is a transaction (in the sense of pgbench) that could not make it to the end, even after retries. If there is a rollback and then a retry which works, it is not a failure. Now deadlock or serialization errors, which trigger retries, are worth counting as well, although they are not "failures". So my format proposal was over optimistic, and the number of deadlocks and serializations should better be on a retry count line. Maybe something like:

  ...
  number of failures: 12 (0.004%)
  number of retries: 64 (deadlocks: 29, serialization: 35)

-- Fabien.
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
> I was not convinced by the overall memory management around variables to begin with, and it is even less so with their new copy management. Maybe having a clean "Variables" data structure could help improve the situation.
>
> Note that there is something for psql (src/bin/psql/variable.c) which may or may not be shared. It should be checked before recoding eventually the same thing.

Thank you very much for pointing this file! As I checked, this is another structure: here there's a simple list, while in pgbench we should know if the list is sorted and the number of elements in the list. How do you think, is it a good idea to name a variables structure in pgbench in the same way (VariableSpace), or should it be different so as not to be confused (Variables, for example)?

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
On 13-07-2017 19:32, Fabien COELHO wrote:
> Hello,

Hi!

>> [...] I didn't make rollbacks to savepoints after the failure because they cannot help for serialization failures at all: after rollback to savepoint a new attempt will be always unsuccessful.
>
> Not necessarily? It depends on where the locks triggering the issue are set; if they are all set after the savepoint it could work on a second attempt.

Don't you mean the deadlock failures, where a rollback to savepoint can really help? And could you, please, give an example where a rollback to savepoint can help to end its subtransaction successfully after a serialization failure?

>>> "SimpleStats attempts": I disagree with using this floating point oriented structure to count integers. I would suggest "int64 tries" instead, which should be enough for the purpose.
>>
>> I'm not sure that it is enough. Firstly there may be several transactions in a script, so to count the average number of attempts you should know the total number of run transactions. Secondly I think that the stddev of the number of attempts can be quite interesting, and often it is not close to zero.
>
> I would prefer to have a real motivation to add this complexity in the report and in the code. Without that, a simple int seems better for now. It can be improved later if the need really arises.

Ok!

>>> Some variables, such as "int attempt_number", should be in the client structure, not in the client? Generally, try to use block variables if possible to keep the states clearly disjoint. If there could be NO new variable at the doCustom level that would be great, because that would ensure that there is no machine state mixup hidden in these variables.
>>
>> Do you mean the code cleanup for the doCustom function? Because if I do so there will be two code styles for state blocks and their variables in this function...
>
> I think that any variable shared between states is a recipe for bugs if it is not reset properly, so they should be avoided. Maybe there are already too many of them; then too bad, but that is not a reason to add more. The status before the automaton was a nightmare.

Ok!

>>> I would suggest a more compact one-line report about failures: "number of failures: 12 (0.001%, deadlock: 7, serialization: 5)"
>>
>> I think there may be a misunderstanding, because a script can contain several transactions and get both failures.
>
> I do not understand. Both failure numbers are on the compact line I suggested.

I mean that the sum of transactions with serialization failure and transactions with deadlock failure can be greater than the total sum of transactions with failures. But if you think it's ok I'll change it and write the appropriate note in the documentation.

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
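The "simple int" counting discussed here — plain integer counters instead of a SimpleStats, with a single accumulator — could look like the following sketch. The struct and field names are illustrative, not taken from the patch; the point is that one accumulate function covers all runs, since adding zero to zero when a run had no failures is harmless:

```c
#include <stdint.h>

/*
 * Sketch of the plain-counter approach: one struct of int64 counters
 * per client/thread, merged with a single accumulator.  Names are
 * hypothetical, not from the patch.
 */
typedef struct
{
	int64_t cnt;					/* transactions executed */
	int64_t serialization_failures;
	int64_t deadlock_failures;
	int64_t retries;
} TxnStats;

static void
accumStats(TxnStats *total, const TxnStats *part)
{
	total->cnt += part->cnt;
	total->serialization_failures += part->serialization_failures;
	total->deadlock_failures += part->deadlock_failures;
	total->retries += part->retries;
}
```

Average tries per transaction then falls out as `(cnt + retries) / cnt` at report time, without storing per-transaction samples.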
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
>> I was not convinced by the overall memory management around variables to begin with, and it is even less so with their new copy management. Maybe having a clean "Variables" data structure could help improve the situation.
>
> Ok!

Note that there is something for psql (src/bin/psql/variable.c) which may or may not be shared. It should be checked before recoding eventually the same thing.

-- Fabien.
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
Hello,

> [...] I didn't make rollbacks to savepoints after the failure because they cannot help for serialization failures at all: after rollback to savepoint a new attempt will be always unsuccessful.

Not necessarily? It depends on where the locks triggering the issue are set; if they are all set after the savepoint it could work on a second attempt.

>> "SimpleStats attempts": I disagree with using this floating point oriented structure to count integers. I would suggest "int64 tries" instead, which should be enough for the purpose.
>
> I'm not sure that it is enough. Firstly there may be several transactions in a script, so to count the average number of attempts you should know the total number of run transactions. Secondly I think that the stddev of the number of attempts can be quite interesting, and often it is not close to zero.

I would prefer to have a real motivation to add this complexity in the report and in the code. Without that, a simple int seems better for now. It can be improved later if the need really arises.

>> Some variables, such as "int attempt_number", should be in the client structure, not in the client? Generally, try to use block variables if possible to keep the states clearly disjoint. If there could be NO new variable at the doCustom level that would be great, because that would ensure that there is no machine state mixup hidden in these variables.
>
> Do you mean the code cleanup for the doCustom function? Because if I do so there will be two code styles for state blocks and their variables in this function...

I think that any variable shared between states is a recipe for bugs if it is not reset properly, so they should be avoided. Maybe there are already too many of them; then too bad, but that is not a reason to add more. The status before the automaton was a nightmare.

>> I'm wondering whether the RETRY & FAILURE states could/should be merged:
>
> I divided these states because if there's a failed transaction block you should end it before retrying.

Hmmm. Maybe I'm wrong. I'll think about it.

>> I would suggest a more compact one-line report about failures: "number of failures: 12 (0.001%, deadlock: 7, serialization: 5)"
>
> I think there may be a misunderstanding, because a script can contain several transactions and get both failures.

I do not understand. Both failure numbers are on the compact line I suggested.

-- Fabien.
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
> Another detail I forgot about this point: there may be a memory leak on variables copies, ISTM that the "variables" array is never freed.
>
> I was not convinced by the overall memory management around variables to begin with, and it is even less so with their new copy management. Maybe having a clean "Variables" data structure could help improve the situation.

Ok!

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
Here are a round of comments on the current version of the patch: Thank you very much again! There is a latent issue about what is a transaction. For pgbench a transaction is a full script execution. For postgresql, it is a statement or a BEGIN/END block, several of which may appear in a script. From a retry perspective, you may retry from a SAVEPOINT within a BEGIN/END block... I'm not sure how to make general sense of all this, so this is just a comment without attached action for now. Yes it is. That's why I wrote several notes about it in documentation where there may be a misunderstanding: +Transactions with serialization or deadlock failures (or with both +of them if used script contains several transactions; see ++endterm="transactions-and-scripts-title"> for more information) are +marked separately and their time is not reported as for skipped +transactions. + + What is the Transaction Actually Performed in pgbench? +If a transaction has serialization and/or deadlock failures, its + time will be reported as serialization failure, + deadlock failure, or + serialization and deadlock failures, respectively. + + + Transactions can have both serialization and deadlock failures if the + used script contained several transactions. See + for more information. + + + + +The number of transactions attempts within the interval can be greater than +the number of transactions within this interval multiplied by the maximum +attempts number. See for more information. + + + + The total sum of per-command failures of each type can be greater + than the number of transactions with reported failures. + See + endterm="transactions-and-scripts-title"> for more information. + + And I didn't make rollbacks to savepoints after the failure because they cannot help for serialization failures at all: after rollback to savepoint a new attempt will be always unsuccessful. I would consider using "try/tries" instead of "attempt/attempts" as it is shorter. 
An English native speaker opinion would be welcome on that point. Thank you, I'll change it.

I'm fine with renaming "is_latencies" to "report_per_command", which is more logical & generic. Glad to hear it!

"max_attempt_number": I'm against typing fields again in their name, aka "hungarian naming". I'd suggest "max_tries" or "max_attempts". Ok!

"SimpleStats attempts": I disagree with using this floating point oriented structure to count integers. I would suggest "int64 tries" instead, which should be enough for the purpose. I'm not sure that it is enough. First, there may be several transactions in the script, so to compute the average number of attempts you should know the total number of transactions run. Second, I think that the stddev of the number of attempts can be quite interesting, and often it is not close to zero.

LastBeginState -> RetryState? I'm not sure why this state is a pointer in CState. Putting the struct would avoid malloc/free cycles. Index "-1" may be used to tell it is not set if necessary. Thanks, I agree that it's better to do it this way.

"CSTATE_RETRY_FAILED_TRANSACTION" -> "CSTATE_RETRY" is simpler and clear enough. Ok!

In CState and some code, a failure is a failure, maybe one boolean would be enough. It need only be differentiated when counting, and you have (deadlock_failure || serialization_failure) everywhere. I agree with you. I'll change it.

Some variables, such as "int attempt_number", should be in the client structure, not in the client? Generally, try to use block variables if possible to keep the states clearly disjoint. If there could be NO new variable at the doCustom level that would be great, because that would ensure that there is no machine state mixup hidden in these variables. Do you mean the code cleanup for the doCustom function? Because if I do so there will be two code styles for state blocks and their variables in this function..
I am wondering whether the RETRY & FAILURE states could/should be merged:

on RETRY:
 -> count retry
 -> actually retry if < max_tries (reset client state, jump to command)
 -> else count failure and skip to end of script

The start and end of transaction detection seem expensive (malloc, ...) and assume one statement per command, which is not necessarily the case (what about "BEGIN \; ... \; COMMIT;"?); this limitation should be documented. ISTM that the space normalization should be avoided, and something simpler/lighter should be devised? Possibly it should consider handling SAVEPOINT.

I divided these states because if there's a failed transaction block you should end it before retrying. It means going to the states CSTATE_START_COMMAND -> CSTATE_WAIT_RESULT -> CSTATE_END_COMMAND with the appropriate command. How do you propose not to go to these states? About
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
LastBeginState -> RetryState? I'm not sure why this state is a pointer in CState. Putting the struct would avoid malloc/free cycles. Index "-1" may be used to tell it is not set if necessary.

Another detail I forgot about this point: there may be a memory leak on variable copies, ISTM that the "variables" array is never freed. I was not convinced by the overall memory management around variables to begin with, and I am even less so with their new copy management. Maybe having a clean "Variables" data structure could help improve the situation.

-- Fabien.

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
Hello Marina,

There's the second version of my patch for pgbench. Now transactions with serialization and deadlock failures are rolled back and retried until they end successfully or their number of attempts reaches the maximum. In detail:
- You can set the maximum number of attempts with the appropriate benchmarking option (--max-attempts-number). Its default value is 1, partly because retrying shell commands can produce new errors.
- Statistics of attempts and failures are printed in progress reports, in transaction / aggregation logs, and at the end with the other results (overall and for each script). The transaction failure is reported here only if the last retry of this transaction fails.
- Also, failures and average numbers of transaction attempts are printed per command with average latencies if you use the appropriate benchmarking option (--report-per-command, -r) (it replaces the option --report-latencies as I was advised here [1]). Average numbers of transaction attempts are printed only for commands which start transactions.
As usual: TAP tests for the new functionality and changed documentation with new examples.

Here is a round of comments on the current version of the patch:

* About the feature

There is a latent issue about what a transaction is. For pgbench, a transaction is a full script execution. For PostgreSQL, it is a statement or a BEGIN/END block, several of which may appear in a script. From a retry perspective, you may retry from a SAVEPOINT within a BEGIN/END block... I'm not sure how to make general sense of all this, so this is just a comment without an attached action for now.

As the default is not to retry, which is the upward compatible behavior, I think that the changes should not change the current output much, bar counting the number of failures.

I would consider using "try/tries" instead of "attempt/attempts" as it is shorter. An English native speaker opinion would be welcome on that point.
* About the code

ISTM that the code interacts significantly with various patches under review or ready for committers. Not sure how to deal with that, there will be some rebasing work...

I'm fine with renaming "is_latencies" to "report_per_command", which is more logical & generic.

"max_attempt_number": I'm against typing fields again in their name, aka "hungarian naming". I'd suggest "max_tries" or "max_attempts".

"SimpleStats attempts": I disagree with using this floating point oriented structure to count integers. I would suggest "int64 tries" instead, which should be enough for the purpose.

LastBeginState -> RetryState? I'm not sure why this state is a pointer in CState. Putting the struct would avoid malloc/free cycles. Index "-1" may be used to tell it is not set if necessary.

"CSTATE_RETRY_FAILED_TRANSACTION" -> "CSTATE_RETRY" is simpler and clear enough.

In CState and some code, a failure is a failure, maybe one boolean would be enough. It need only be differentiated when counting, and you have (deadlock_failure || serialization_failure) everywhere.

Some variables, such as "int attempt_number", should be in the client structure, not in the client? Generally, try to use block variables if possible to keep the states clearly disjoint. If there could be NO new variable at the doCustom level that would be great, because that would ensure that there is no machine state mixup hidden in these variables.

I am wondering whether the RETRY & FAILURE states could/should be merged:

on RETRY:
 -> count retry
 -> actually retry if < max_tries (reset client state, jump to command)
 -> else count failure and skip to end of script

The start and end of transaction detection seem expensive (malloc, ...) and assume one statement per command, which is not necessarily the case (what about "BEGIN \; ... \; COMMIT;"?); this limitation should be documented. ISTM that the space normalization should be avoided, and something simpler/lighter should be devised?
Possibly it should consider handling SAVEPOINT.

I disagree about exiting in ParseScript if the transaction block is not completed, especially as it misses out on combined statements/queries ("BEGIN \; stuff... \; COMMIT") and would break an existing feature.

There are strange character things in comments, eg "??ontinuous".

Option "max-attempt-number" -> "max-tries".

I would put the client random state initialization with the state initialization, not with the connection.

* About tracing

Progress is expected to be short, not detailed. Only add the number of failures and retries if the maximum number of retries is not 1.

* About reporting

I think that too much is reported. I advised to do that, but nevertheless it is a little bit steep. At least, it should not report the number of tries/attempts when the maximum number is one. Simple counts should be reported for failures, not floats... I would suggest a more compact one-line report about failures: "number of failures: 12 (0.001%, deadlock: 7,
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
Hello everyone!

There's the second version of my patch for pgbench. Now transactions with serialization and deadlock failures are rolled back and retried until they end successfully or their number of attempts reaches the maximum. In detail:
- You can set the maximum number of attempts with the appropriate benchmarking option (--max-attempts-number). Its default value is 1, partly because retrying shell commands can produce new errors.
- Statistics of attempts and failures are printed in progress reports, in transaction / aggregation logs, and at the end with the other results (overall and for each script). The transaction failure is reported here only if the last retry of this transaction fails.
- Also, failures and average numbers of transaction attempts are printed per command with average latencies if you use the appropriate benchmarking option (--report-per-command, -r) (it replaces the option --report-latencies as I was advised here [1]). Average numbers of transaction attempts are printed only for commands which start transactions.
As usual: TAP tests for the new functionality and changed documentation with new examples. The patch is attached. Any suggestions are welcome!

[1] https://www.postgresql.org/message-id/alpine.DEB.2.20.1707031321370.3419%40lancre

-- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

From 58f51cdc896af801bcd35e495406655ca03aa6ce Mon Sep 17 00:00:00 2001
From: Marina Polyakova
Date: Mon, 10 Jul 2017 13:33:41 +0300
Subject: [PATCH v2] Pgbench Retry transactions with serialization or deadlock errors

Now transactions with serialization or deadlock failures can be rolled back and retried again and again until they end successfully or their number of attempts reaches the maximum. You can set the maximum number of attempts with the appropriate benchmarking option (--max-attempts-number). Its default value is 1.
Statistics of attempts and failures are printed in progress reports, in transaction / aggregation logs, and at the end with the other results (overall and for each script). The transaction failure is reported here only if the last retry of this transaction fails. Also, failures and average numbers of transaction attempts are printed per command with average latencies if you use the appropriate benchmarking option (--report-per-command, -r). Average numbers of transaction attempts are printed only for commands which start transactions.
---
 doc/src/sgml/ref/pgbench.sgml                      | 277 ++--
 src/bin/pgbench/pgbench.c                          | 751 ++---
 src/bin/pgbench/t/002_serialization_errors.pl      | 121
 src/bin/pgbench/t/003_deadlock_errors.pl           | 130
 src/bin/pgbench/t/004_retry_failed_transactions.pl | 280
 5 files changed, 1421 insertions(+), 138 deletions(-)
 create mode 100644 src/bin/pgbench/t/002_serialization_errors.pl
 create mode 100644 src/bin/pgbench/t/003_deadlock_errors.pl
 create mode 100644 src/bin/pgbench/t/004_retry_failed_transactions.pl

diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 64b043b..dc1daa9 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -49,22 +49,34 @@
 transaction type: builtin: TPC-B (sort of)
+transaction maximum attempts number: 1
 scaling factor: 10
 query mode: simple
 number of clients: 10
 number of threads: 1
 number of transactions per client: 1000
 number of transactions actually processed: 1/1
+number of transactions with serialization failures: 0 (0.000 %)
+number of transactions with deadlock failures: 0 (0.000 %)
+attempts number average = 1.00
+attempts number stddev = 0.00
 tps = 85.184871 (including connections establishing)
 tps = 85.296346 (excluding connections establishing)

-  The first six lines report some of the most important parameter
+  The first seven lines report some of the most important parameter
   settings.
   The next line reports the number of transactions completed and intended (the latter being just the product of the number of clients and the number of transactions per client); these will be equal unless the run failed before completion. (In -T mode, only the actual number of transactions is printed.)
+  The next four lines report the number of transactions with serialization and
+  deadlock failures, and also the statistics of transaction attempts. With
+  such errors, transactions are rolled back and are repeated again and again
+  until they end successfully or their number of attempts reaches the maximum
+  (to change this maximum see the appropriate benchmarking option
+  --max-attempts-number). The transaction failure is reported here
+  only if the last retry of this transaction fails.
   The last two lines report the number of transactions per second, figured with and without counting the time to start database sessions.
@@ -434,24 +446,28 @@ pgbench options dbname
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
On Thu, Jun 15, 2017 at 10:16 PM, Andres Freund wrote:
> On 2017-06-14 11:48:25 +0300, Marina Polyakova wrote:
> > Advanced options:
> > - mostly for testing built-in scripts: you can set the default transaction
> > isolation level by the appropriate benchmarking option (-I);
>
> I'm less convinced of the need of that, you can already set arbitrary
> connection options with
> PGOPTIONS='-c default_transaction_isolation=serializable' pgbench

Right, there is already a way to specify the default isolation level using environment variables. However, once we make pgbench work with various isolation levels, users may want to run pgbench multiple times in a row with different isolation levels. A command line option would be very convenient in this case. In addition, the isolation level is a vital parameter for interpreting benchmark results correctly. Often, graphs with pgbench results are entitled with the pgbench command line. Having the isolation level specified on the command line would naturally fit into this entitling scheme. Of course, this is solely a usability question and it's fair enough to live without such a command line option. But I'm +1 to add this option.

-- Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
The number of retries and maybe failures should be counted, maybe with some adjustable maximum, as suggested.

If we fix the maximum number of attempts, the maximum number of failures for one script execution will be bounded above (number_of_transactions_in_script * maximum_number_of_attempts). Do you think we should make an option in the program to limit this number much more?

Probably not. I think that there should be a configurable maximum of retries on a transaction, which may be 0 by default if we want to be upward compatible with the current behavior, or maybe something else.

I propose the option --max-attempts-number=NUM where NUM cannot be less than 1. I propose it because I think that, for example, --max-attempts-number=100 is better than --max-retries-number=99. And maybe it's better to set its default value to 1 too, because retrying shell commands can produce new errors..

Personally, I like counting retries because it also counts the number of times the transaction actually failed for some reason. But this is a marginal preference, and one can be switched to the other easily.

-- Fabien.
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
The current error handling is either "close connection" or maybe in some cases even "exit". If this is changed, then the client may continue execution in some unforeseen state and behave unexpectedly. We'll see. Thanks, now I understand this.

ISTM that the retry implementation should be implemented somehow in the automaton, restarting the same script from the beginning. If there are several transactions in this script - don't you think that we should restart only the failed transaction?.. On some transaction failures, based on their status.

My point is that the retry process must be implemented clearly with a new state in the client automaton. Exactly when the transition to this new state must be taken is another issue. About that, I agree with you that it should be done in this way.

The number of retries and maybe failures should be counted, maybe with some adjustable maximum, as suggested. If we fix the maximum number of attempts, the maximum number of failures for one script execution will be bounded above (number_of_transactions_in_script * maximum_number_of_attempts). Do you think we should make an option in the program to limit this number much more?

Probably not. I think that there should be a configurable maximum of retries on a transaction, which may be 0 by default if we want to be upward compatible with the current behavior, or maybe something else. I propose the option --max-attempts-number=NUM where NUM cannot be less than 1. I propose it because I think that, for example, --max-attempts-number=100 is better than --max-retries-number=99. And maybe it's better to set its default value to 1 too, because retrying shell commands can produce new errors..

In doLog, added columns should be at the end of the format. I have inserted them earlier because these columns are not optional. Do you think they should be optional?
I think that new non-optional columns should be at the end of the existing non-optional columns, so that existing scripts which may process the output need not be updated. Thanks, I agree with you :)

I'm not sure that there should be a new option to report failures, the information when relevant should be integrated in a clean format into the existing reports... Maybe the "per command latency" report/option should be renamed if it becomes more general. I have tried not to change other parts of the program as much as possible. But if you think that it will be more useful to change the option, I'll do it.

I think that the option should change if its naming becomes less relevant, which is to be determined. AFAICS, ISTM that new measures should be added to the various existing reports unconditionally (i.e. without a new option), so maybe no new option would be needed. Thanks! I didn't think about it in this way..

-- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
Hello Marina,

I agree that improving the error handling ability of pgbench is a good thing, although I'm not sure about the implications...

Could you tell a little bit more exactly.. What implications are you worried about?

The current error handling is either "close connection" or maybe in some cases even "exit". If this is changed, then the client may continue execution in some unforeseen state and behave unexpectedly. We'll see.

ISTM that the retry implementation should be implemented somehow in the automaton, restarting the same script from the beginning.

If there are several transactions in this script - don't you think that we should restart only the failed transaction?.. On some transaction failures, based on their status.

My point is that the retry process must be implemented clearly with a new state in the client automaton. Exactly when the transition to this new state must be taken is another issue.

The number of retries and maybe failures should be counted, maybe with some adjustable maximum, as suggested.

If we fix the maximum number of attempts, the maximum number of failures for one script execution will be bounded above (number_of_transactions_in_script * maximum_number_of_attempts). Do you think we should make an option in the program to limit this number much more?

Probably not. I think that there should be a configurable maximum of retries on a transaction, which may be 0 by default if we want to be upward compatible with the current behavior, or maybe something else.

In doLog, added columns should be at the end of the format. I have inserted them earlier because these columns are not optional. Do you think they should be optional?

I think that new non-optional columns should be at the end of the existing non-optional columns, so that existing scripts which may process the output need not be updated.
I'm not sure that there should be a new option to report failures, the information when relevant should be integrated in a clean format into the existing reports... Maybe the "per command latency" report/option should be renamed if it becomes more general.

I have tried not to change other parts of the program as much as possible. But if you think that it will be more useful to change the option, I'll do it.

I think that the option should change if its naming becomes less relevant, which is to be determined. AFAICS, ISTM that new measures should be added to the various existing reports unconditionally (i.e. without a new option), so maybe no new option would be needed.

-- Fabien.
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
Hello Marina,

Hello, Fabien!

A few comments about the submitted patches. Thank you very much for them!

I agree that improving the error handling ability of pgbench is a good thing, although I'm not sure about the implications... Could you tell a little bit more exactly.. What implications are you worried about?

About the "retry" discussion: I agree that retry is the relevant option from an application point of view. I'm glad to hear it!

ISTM that the retry implementation should be implemented somehow in the automaton, restarting the same script from the beginning. If there are several transactions in this script - don't you think that we should restart only the failed transaction?..

As pointed out in the discussion, the same values/commands should be executed, which suggests that randomly generated values should be the same on the retry runs, so that for a simple script the same operations are attempted. This means that the random generator state must be kept & reinstated for a client on retries. Currently the random state is in the thread, which is not convenient for this purpose, so it should be moved into the client so that it can be saved at transaction start and reinstated on retries. I think about it in the same way =)

The number of retries and maybe failures should be counted, maybe with some adjustable maximum, as suggested. If we fix the maximum number of attempts, the maximum number of failures for one script execution will be bounded above (number_of_transactions_in_script * maximum_number_of_attempts). Do you think we should make an option in the program to limit this number much more?

About 0001: In accumStats, just use one level of if, the two levels bring nothing. Thanks, I agree =[

In doLog, added columns should be at the end of the format. I have inserted them earlier because these columns are not optional. Do you think they should be optional? The number of columns MUST NOT change when different issues arise, so that it works well with cut/...
unix commands, so inserting a sentence such as "serialization and deadlock failures" is a bad idea. Thanks, I agree again.

threadRun: the point of the progress format is to fit on one not too wide line on a terminal and to allow some simple automatic processing. Adding a verbose sentence in the middle of it is not the way to go. I was thinking about it.. Thanks, I'll try to make it shorter.

About tests: I do not understand why test 003 includes 2 transactions. It would seem more logical to have two scripts. Ok!

About 0003: I'm not sure that there should be a new option to report failures, the information when relevant should be integrated in a clean format into the existing reports... Maybe the "per command latency" report/option should be renamed if it becomes more general. I have tried not to change other parts of the program as much as possible. But if you think that it will be more useful to change the option, I'll do it.

About 0004: The documentation must not be in a separate patch, but in the same patch as the corresponding code. Ok!

-- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
Hello Marina,

A few comments about the submitted patches.

I agree that improving the error handling ability of pgbench is a good thing, although I'm not sure about the implications...

About the "retry" discussion: I agree that retry is the relevant option from an application point of view.

ISTM that the retry implementation should be implemented somehow in the automaton, restarting the same script from the beginning. As pointed out in the discussion, the same values/commands should be executed, which suggests that randomly generated values should be the same on the retry runs, so that for a simple script the same operations are attempted. This means that the random generator state must be kept & reinstated for a client on retries. Currently the random state is in the thread, which is not convenient for this purpose, so it should be moved into the client so that it can be saved at transaction start and reinstated on retries.

The number of retries and maybe failures should be counted, maybe with some adjustable maximum, as suggested.

About 0001: In accumStats, just use one level of if, the two levels bring nothing. In doLog, added columns should be at the end of the format. The number of columns MUST NOT change when different issues arise, so that it works well with cut/... unix commands, so inserting a sentence such as "serialization and deadlock failures" is a bad idea. threadRun: the point of the progress format is to fit on one not too wide line on a terminal and to allow some simple automatic processing. Adding a verbose sentence in the middle of it is not the way to go.

About tests: I do not understand why test 003 includes 2 transactions. It would seem more logical to have two scripts.

About 0003: I'm not sure that there should be a new option to report failures, the information when relevant should be integrated in a clean format into the existing reports... Maybe the "per command latency" report/option should be renamed if it becomes more general.
About 0004: The documentation must not be in a separate patch, but in the same patch as the corresponding code.

-- Fabien.
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
> To be clear, part of "retrying from the beginning" means that if a
> result from one statement is used to determine the content (or
> whether to run) a subsequent statement, that first statement must be
> run in the new transaction and the results evaluated again to
> determine what to use for the later statement. You can't simply
> replay the statements that were run during the first try. For
> examples, to help get a feel of why that is, see:
>
> https://wiki.postgresql.org/wiki/SSI

Thank you again! :))

-- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
On Fri, Jun 16, 2017 at 5:31 AM, Marina Polyakova wrote:
> And thank you very much for your explanation how and why transactions with
> failures should be retried! I'll try to implement all of it.

To be clear, part of "retrying from the beginning" means that if a result from one statement is used to determine the content (or whether to run) a subsequent statement, that first statement must be run in the new transaction and the results evaluated again to determine what to use for the later statement. You can't simply replay the statements that were run during the first try. For examples, to help get a feel of why that is, see: https://wiki.postgresql.org/wiki/SSI

-- Kevin Grittner VMware vCenter Server https://www.vmware.com/
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
>> P.S. Is this use case (do not retry transactions with serialization or
>> deadlock failures) the most interesting one, or should failed transactions be
>> retried (and how many times if there seems to be no hope of success...)?
>
> I can't quite parse that sentence, could you restate?

The way I read it was that the most interesting solution would retry a transaction from the beginning on a serialization failure or deadlock failure. As far as I understand her proposal, it is exactly the opposite -- if a transaction fails, it is discarded. And this P.S. note is asking whether this is a good idea, or would we prefer that failing transactions are retried.

Yes, I meant this, thank you!

I think it's pretty obvious that transactions that failed with some serializability problem should be retried.

Thank you for the vote :)

-- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
P.S. Is this use case (do not retry transactions with serialization or deadlock failures) the most interesting one, or should failed transactions be retried (and how many times if there seems to be no hope of success...)?

I can't quite parse that sentence, could you restate?

The way I read it was that the most interesting solution would retry a transaction from the beginning on a serialization failure or deadlock failure. Most people who use serializable transactions (at least in my experience) run through a framework that does that automatically, regardless of what client code initiated the transaction. These retries are generally hidden from the client code -- it just looks like the transaction took a bit longer. Sometimes people will have a limit on the number of retries. I never used such a limit and never had a problem, because our implementation of serializable transactions will not throw a serialization failure error until one of the transactions involved in causing it has successfully committed -- meaning that the retry can only hit this again on a *new* set of transactions. Essentially, the transaction should only count toward the TPS rate when it eventually completes without a serialization failure. Marina, did I understand you correctly?

Álvaro Herrera, in the next message of this thread, understood my text correctly: As far as I understand her proposal, it is exactly the opposite -- if a transaction fails, it is discarded. And this P.S. note is asking whether this is a good idea, or would we prefer that failing transactions are retried.

And thank you very much for your explanation of how and why transactions with failures should be retried! I'll try to implement all of it.

-- Marina Polyakova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
> Hi,

Hello!

> I think that's a good idea and sorely needed.

Thanks, I'm very glad to hear it!

>> - if there were these failures during script execution this
>> "transaction" is marked appropriately in logs;
>> - numbers of "transactions" with these failures are printed in
>> progress, in aggregation logs and in the end with other results (all
>> and for each script);
>
> I guess that'll include a "rolled-back %' or 'retried %' somewhere?

Not exactly; see the documentation:

+    If transaction has serialization / deadlock failure or them both (last thing
+    is possible if used script contains several transactions; see
+    for more information), its
+    time will be reported as serialization failure /
+    deadlock failure /
+    serialization and deadlock failures appropriately.
+    Example with serialization, deadlock and both these failures:
+
+1 128 24968 0 1496759158 426984
+0 129 serialization failure 0 1496759158 427023
+3 129 serialization failure 0 1496759158 432662
+2 128 serialization failure 0 1496759158 432765
+0 130 deadlock failure 0 1496759159 460070
+1 129 serialization failure 0 1496759160 485188
+2 129 serialization and deadlock failures 0 1496759160 485339
+4 130 serialization failure 0 1496759160 485465
+

From the proposals in the next messages of this thread I understand that
the most interesting case is to retry the failed transaction. Do you
think it's better to write, for example, "rolled-back after % retries
(serialization failure)" or "time (retried % times, serialization and
deadlock failures)"?

>> Advanced options:
>> - mostly for testing built-in scripts: you can set the default
>> transaction isolation level by the appropriate benchmarking option (-I);
>
> I'm less convinced of the need of that, you can already set arbitrary
> connection options with
> PGOPTIONS='-c default_transaction_isolation=serializable' pgbench

Oh, thanks, I forgot about it =[

>> P.S. Does this use case (do not retry transaction with serialization or
>> deadlock failure) is most interesting or failed transactions should be
>> retried (and how much times if there seems to be no hope of success...)?
>
> I can't quite parse that sentence, could you restate?

Álvaro Herrera later in this thread has understood my text correctly:

> As far as I understand her proposal, it is exactly the opposite -- if a
> transaction fails, it is discarded.  And this P.S. note is asking
> whether this is a good idea, or would we prefer that failing
> transactions are retried.

Has my text become clearer with his explanation?

-- 
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
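The per-transaction log lines in the documentation excerpt above can be
parsed mechanically; here is a minimal sketch (field names are my own --
the patch itself does not define a parser, and the combined label must be
matched before the single-failure labels, since "deadlock failure" is a
substring of it):

```python
# Labels from the quoted documentation excerpt; the combined label is
# checked first because "deadlock failure" is a substring of it.
FAILURE_LABELS = (
    "serialization and deadlock failures",
    "serialization failure",
    "deadlock failure",
)

def parse_log_line(line):
    """Split one pgbench per-transaction log line into fields.

    For a failed transaction, the latency field is replaced by one of
    the failure labels, as in the example lines quoted above.
    """
    for label in FAILURE_LABELS:
        if label in line:
            head, tail = line.split(label)
            client_id, txn_no = head.split()
            script_no, epoch, usec = tail.split()
            return {"client": int(client_id), "txn": int(txn_no),
                    "failure": label, "latency_us": None,
                    "script": int(script_no),
                    "epoch": int(epoch), "usec": int(usec)}
    client_id, txn_no, latency, script_no, epoch, usec = line.split()
    return {"client": int(client_id), "txn": int(txn_no),
            "failure": None, "latency_us": int(latency),
            "script": int(script_no),
            "epoch": int(epoch), "usec": int(usec)}

# Sample lines taken from the documentation excerpt above:
ok = parse_log_line("1 128 24968 0 1496759158 426984")
bad = parse_log_line("0 129 serialization failure 0 1496759158 427023")
print(ok["latency_us"], bad["failure"])
```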
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
On Thu, Jun 15, 2017 at 4:18 PM, Alvaro Herrera wrote:
> Kevin Grittner wrote:
>
> As far as I understand her proposal, it is exactly the opposite -- if a
> transaction fails, it is discarded.  And this P.S. note is asking
> whether this is a good idea, or would we prefer that failing
> transactions are retried.
>
> I think it's pretty obvious that transactions that failed with
> some serializability problem should be retried.

Agreed all around.

-- 
Kevin Grittner
VMware vCenter Server
https://www.vmware.com/
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
On Fri, Jun 16, 2017 at 9:18 AM, Alvaro Herrera wrote:
> Kevin Grittner wrote:
>> On Thu, Jun 15, 2017 at 2:16 PM, Andres Freund wrote:
>>> On 2017-06-14 11:48:25 +0300, Marina Polyakova wrote:
>>>> P.S. Does this use case (do not retry transaction with serialization or
>>>> deadlock failure) is most interesting or failed transactions should be
>>>> retried (and how much times if there seems to be no hope of success...)?
>>>
>>> I can't quite parse that sentence, could you restate?
>>
>> The way I read it was that the most interesting solution would retry
>> a transaction from the beginning on a serialization failure or
>> deadlock failure.
>
> As far as I understand her proposal, it is exactly the opposite -- if a
> transaction fails, it is discarded.  And this P.S. note is asking
> whether this is a good idea, or would we prefer that failing
> transactions are retried.
>
> I think it's pretty obvious that transactions that failed with
> some serializability problem should be retried.

+1 for retry with reporting of retry rates

-- 
Thomas Munro
http://www.enterprisedb.com
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
Kevin Grittner wrote:
> On Thu, Jun 15, 2017 at 2:16 PM, Andres Freund wrote:
>> On 2017-06-14 11:48:25 +0300, Marina Polyakova wrote:
>>> P.S. Does this use case (do not retry transaction with serialization or
>>> deadlock failure) is most interesting or failed transactions should be
>>> retried (and how much times if there seems to be no hope of success...)?
>>
>> I can't quite parse that sentence, could you restate?
>
> The way I read it was that the most interesting solution would retry
> a transaction from the beginning on a serialization failure or
> deadlock failure.

As far as I understand her proposal, it is exactly the opposite -- if a
transaction fails, it is discarded.  And this P.S. note is asking
whether this is a good idea, or would we prefer that failing
transactions are retried.

I think it's pretty obvious that transactions that failed with some
serializability problem should be retried.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
On Thu, Jun 15, 2017 at 2:16 PM, Andres Freund wrote:
> On 2017-06-14 11:48:25 +0300, Marina Polyakova wrote:
>> I suggest a patch where pgbench client sessions are not disconnected
>> because of serialization or deadlock failures and these failures are
>> mentioned in reports.
>
> I think that's a good idea and sorely needed.

+1

>> P.S. Does this use case (do not retry transaction with serialization or
>> deadlock failure) is most interesting or failed transactions should be
>> retried (and how much times if there seems to be no hope of success...)?
>
> I can't quite parse that sentence, could you restate?

The way I read it was that the most interesting solution would retry a
transaction from the beginning on a serialization failure or deadlock
failure.  Most people who use serializable transactions (at least in my
experience) run through a framework that does that automatically,
regardless of what client code initiated the transaction.  These retries
are generally hidden from the client code -- it just looks like the
transaction took a bit longer.

Sometimes people will have a limit on the number of retries.  I never
used such a limit and never had a problem, because our implementation of
serializable transactions will not throw a serialization failure error
until one of the transactions involved in causing it has successfully
committed -- meaning that the retry can only hit this again on a *new*
set of transactions.  Essentially, the transaction should only count
toward the TPS rate when it eventually completes without a serialization
failure.

Marina, did I understand you correctly?

-- 
Kevin Grittner
VMware vCenter Server
https://www.vmware.com/
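The retry framework Kevin describes could be sketched as below. This is
a hypothetical illustration, not pgbench code: the exception class and
function names are mine, and a real client would catch the driver's
error for SQLSTATE 40001 (serialization_failure) or 40P01
(deadlock_detected) instead.

```python
class SerializationFailure(Exception):
    """Stands in for the driver error raised on SQLSTATE 40001/40P01."""

def run_transaction(txn, max_tries=5):
    """Retry txn from the beginning on a serialization/deadlock failure.

    Retries are hidden from the caller -- the transaction just looks
    like it took a bit longer.  Only the final, successful attempt
    would count toward the TPS rate.  max_tries is the optional retry
    cap Kevin mentions (he notes he never needed one in practice).
    """
    for attempt in range(1, max_tries + 1):
        try:
            return txn()
        except SerializationFailure:
            if attempt == max_tries:
                raise  # give up: report the failure to the caller
    # not reached: every path above either returns or raises

# Demo: a "transaction" that fails twice, then commits on the third try.
attempts = []
def flaky_txn():
    attempts.append(1)
    if len(attempts) < 3:
        raise SerializationFailure
    return "committed"

print(run_transaction(flaky_txn), len(attempts))  # -> committed 3
```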
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
Hi,

On 2017-06-14 11:48:25 +0300, Marina Polyakova wrote:
> Now in pgbench we can test only transactions with Read Committed
> isolation level because client sessions are disconnected forever on
> serialization failures. There were some proposals and discussions about
> it (see message here [1] and thread here [2]).

> I suggest a patch where pgbench client sessions are not disconnected
> because of serialization or deadlock failures and these failures are
> mentioned in reports.

I think that's a good idea and sorely needed.

In details:

> - if there were these failures during script execution this
> "transaction" is marked appropriately in logs;
> - numbers of "transactions" with these failures are printed in
> progress, in aggregation logs and in the end with other results (all
> and for each script);

I guess that'll include a "rolled-back %' or 'retried %' somewhere?

> Advanced options:
> - mostly for testing built-in scripts: you can set the default
> transaction isolation level by the appropriate benchmarking option (-I);

I'm less convinced of the need of that, you can already set arbitrary
connection options with
  PGOPTIONS='-c default_transaction_isolation=serializable' pgbench

> P.S. Does this use case (do not retry transaction with serialization or
> deadlock failure) is most interesting or failed transactions should be
> retried (and how much times if there seems to be no hope of success...)?

I can't quite parse that sentence, could you restate?

- Andres
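For reference, the PGOPTIONS approach Andres mentions would look like
this in practice (a sketch; the database name "bench" and the pgbench
flags are placeholders):

```shell
# Every libpq connection picks up PGOPTIONS, so all pgbench client
# sessions start with the requested default isolation level.
export PGOPTIONS='-c default_transaction_isolation=serializable'

# Hypothetical benchmark run against a database named "bench":
# pgbench -c 8 -j 4 -T 60 bench
```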
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
> Sounds like a good idea.

Thank you!

> Please add to the next CommitFest

Done: https://commitfest.postgresql.org/14/1170/

> and review somebody else's patch in exchange for having your own patch
> reviewed.

Of course, I remember about it.

-- 
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Re: [HACKERS] WIP Patch: Pgbench Serialization and deadlock errors
On Wed, Jun 14, 2017 at 4:48 AM, Marina Polyakova wrote:
> Now in pgbench we can test only transactions with Read Committed
> isolation level because client sessions are disconnected forever on
> serialization failures. There were some proposals and discussions about
> it (see message here [1] and thread here [2]).
>
> I suggest a patch where pgbench client sessions are not disconnected
> because of serialization or deadlock failures and these failures are
> mentioned in reports. In details:
> - transaction with one of these failures continue run normally, but its
> result is rolled back;
> - if there were these failures during script execution this
> "transaction" is marked appropriately in logs;
> - numbers of "transactions" with these failures are printed in
> progress, in aggregation logs and in the end with other results (all
> and for each script);
>
> Advanced options:
> - mostly for testing built-in scripts: you can set the default
> transaction isolation level by the appropriate benchmarking option (-I);
> - for more detailed reports: to know per-statement serialization and
> deadlock failures you can use the appropriate benchmarking option
> (--report-failures).
>
> Also: TAP tests for new functionality and changed documentation with
> new examples.
>
> Patches are attached. Any suggestions are welcome!

Sounds like a good idea.  Please add to the next CommitFest and review
somebody else's patch in exchange for having your own patch reviewed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company