On 03-01-2018 19:28, Fabien COELHO wrote:
Hello Marina,

Hello!

Some comment about WIP pgbench error handling v4.

Many thanks!

Patch applies, compiles, global & local "make check" are ok. doc compiles.

I'm generally okay with having such a feature, but I'd like it to be
*MUCH* simpler, otherwise it is going to make pgbench
unmaintainable:-(

I understand your concern about this, but what exactly do you want to simplify (except for compound commands, the option --debug-fails and processing the end of the failed transaction block, because you discuss them later)?

Also, ISTM that a new patch should address somehow (in the code or
with explanation about why you disagree) all comments from the
previous review, which is not really the case here, as I have to
repeat some of the comments I did on v3. You should answer to the
previous comment mail and tag all comments with "ok", "no because this
or that", "done", "postponed because this or that...".

1) You can read my answer to your previous review [1]:
- if there's a short answer like "Ok!", "Thanks!", "Sorry, thanks", "Thanks, I agree" or "Thanks, I'll fix it", it means "done";
- if there's an open question, it means that the comment is postponed because I don't know exactly how to address it.

2) The second note of the --max-tries documentation. I wrote you a suggestion in [1] and included it in the patch because you did not answer me about it:

I do not understand the second note of the --max-tries documentation. It seems to suggest that some scripts may not end their own transactions... which should be an error in my opinion? Some explanations would be welcome.

As you told me here [1], "I disagree about exit in ParseScript if the transaction block is not completed <...> and would break an existing feature." Maybe it would be better to say this:

In pgbench you can use scripts in which transaction blocks do not end. Be careful in this case because transactions that span more than one script are not rolled back and will not be retried in case of an error. In such cases, the script in which the error occurred is reported as failed.

?

3) About the tests. I wrote to you why they are so heavy in [1]:

The tap tests seem over-complicated and heavy, with two pgbench runs in parallel... I'm not sure we really want all that complexity for this somewhat small feature. Moreover pgbench can run several scripts, so I'm not sure why two pgbench instances would need to be invoked. Could something much simpler and lighter be proposed instead to test the feature?

Firstly, two pgbench instances need to be invoked because we don't know which of them will get a deadlock failure. Secondly, I tried much simpler tests but all of them sometimes failed although everything was ok:
- tests in which pgbench runs 5 clients and 10 transactions per client
for a serialization/deadlock failure on any client (sometimes there are
no failures when it is expected that they will be)
- tests in which pgbench runs 30 clients and 400 transactions per client
for a serialization/deadlock failure on any client (sometimes there are
no failures when it is expected that they will be)
- tests in which the psql session starts concurrently and sleep commands are used to make pgbench wait for 10 seconds (sometimes it does not work)
Only advisory locks helped me avoid such errors in the tests :(

+ as I wrote in [2], they became lighter (28 simple tests instead of 57, 8 runs of pgbench instead of 17).

About the documentation:

Again, as already said in the previous review, please take into account comments or justify why you do not do so. I do not think that this feature should be given any pre-eminence: most of the time performance testing is about all-is-well transactions which do not display any error. I do not claim that it is not a useful feature; on the contrary, I do think that testing under error conditions is a good capability, but I insist that it is an on-the-side feature and should not be put forward. As a consequence, the "maximum number of transaction tries" should certainly not be the default first output of a summary run.

I agree with you that there should be no pre-eminence for my feature. Here are my reasons for the order in the reports (except for the "maximum number of transaction tries", because you discuss it later):
1) Per-statement report: errors and retries are printed at the end.
2) Other reports (main report, per-script report, progress report, transaction logging, aggregation logging):
- errors are printed before optional results because you don't need any options to get errors;
- retries are printed at the end because they need the option --max-tries.

I'm unclear about the example added in the documentation. There are 71% errors, but 100% of transactions are reported as processed. If there were errors, then it is not a success, so the transactions were not processed? To me it looks inconsistent.

About the feature:

Also, while testing, it seems that failed transactions are counted in tps, which I think is not appropriate:

 sh> PGOPTIONS='-c default_transaction_isolation=serializable' \
       ./pgbench -P 1 -T 3 -r -M prepared -j 2 -c 4
 starting vacuum...end.
 progress: 1.0 s, 10845.8 tps, lat 0.091 ms stddev 0.491, 10474 failed
 # NOT 10845.8 TPS...
 progress: 2.0 s, 10534.6 tps, lat 0.094 ms stddev 0.658, 10203 failed
 progress: 3.0 s, 10643.4 tps, lat 0.096 ms stddev 0.568, 10290 failed
 ...
 number of transactions actually processed: 32028 # NO!
 number of errors: 30969 (96.694 %)
 latency average = 2.833 ms
 latency stddev = 1.508 ms
 tps = 10666.720870 (including connections establishing) # NO
 tps = 10683.034369 (excluding connections establishing) # NO
 ...

For me this is all wrong. I think that the tps report is about transactions that succeeded, not mere attempts. I cannot say that a transaction which aborted
was "actually processed"... as it was not.

Thank you, I agree and I will fix it.

The order of reported elements is not logical:

 maximum number of transaction tries: 100
 scaling factor: 10
 query mode: prepared
 number of clients: 4
 number of threads: 2
 duration: 3 s
 number of transactions actually processed: 967
 number of errors: 152 (15.719 %)
 latency average = 9.630 ms
 latency stddev = 13.366 ms
 number of transactions retried: 623 (64.426 %)
 number of retries: 32272

I would suggest to group everything about error handling in one block,
eg something like:

 scaling factor: 10
 query mode: prepared
 number of clients: 4
 number of threads: 2
 duration: 3 s
 number of transactions actually processed: 967
 number of errors: 152 (15.719 %)
 number of transactions retried: 623 (64.426 %)
 number of retries: 32272
 maximum number of transaction tries: 100
 latency average = 9.630 ms
 latency stddev = 13.366 ms

Do you think this is ok when only errors and the maximum number of transaction tries are printed (because retries are printed only if they are non-zero)? We can get something like this:

scaling factor: 10
query mode: prepared
number of clients: 4
number of threads: 2
duration: 3 s
number of transactions actually processed: 967
number of errors: 152 (15.719 %)
maximum number of transaction tries: 1 # default value
latency average = 9.630 ms
latency stddev = 13.366 ms

or this:

scaling factor: 10
query mode: prepared
number of clients: 4
number of threads: 2
duration: 3 s
number of transactions actually processed: 967
number of errors: 152 (15.719 %)
maximum number of transaction tries: 3 # not the default value
latency average = 9.630 ms
latency stddev = 13.366 ms

Also, the percent character should be stuck to its number: 15.719% to have the style more homogeneous (although there seem to be pre-existing inhomogeneities).

I would replace "transaction tries/retried" by "tries/retried", everything
is about transactions in the report anyway.

Without reading the documentation, the overall report semantics is unclear, especially given the absurd tps results I got with my first attempt, as failing transactions are counted as "processed".

I agree and I will fix it.

For the detailed report, ISTM that I already said in the previous review that
the 22 columns (?) indentation is much too large:

 0.078                   149                 30886  UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;

You did not answer me in [1] about this:

Moreover the 25 characters
alignment is ugly, better use a much smaller alignment.

The variables for the numbers of failures and retries are of type int64 since the variable for the total number of transactions has the same type. That's why the alignment is so large (as I understand it now, 20 characters would be enough). Do you prefer floating alignments, depending on the maximum number of failures/retries for any command in any script?

When giving the number of retries, I'd like to also have the average number of retries per attempted transaction, whether it succeeded or failed in the end.

Is this not the number of retries divided by the number of attempted transactions (this is now printed in the main report)? Or do you mean the average number of retried transactions per script run?

About the code:

I'm at a loss with the 7 states added to the automaton, where I would have hoped that only 2 (eg RETRY & FAIL, or even fewer) would be enough.

Well, I will try to simplify it. As I see it now, there should be a separate code path when we end a failed transaction block:
- if we try to end a failed transaction block and a failure occurs, this failure cannot be retried in any case;
- if we cannot retry a failure and there was a failed transaction block, we must go to the next command after that block, and not to the next command after the failure.

I would hope for something like "if (some error) state = ERROR",
then in "ERROR" do the stats, check whether it should be retried, and if
so state = START... and we are done.
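
For illustration, that flow might look roughly like the following (a sketch only: accumulate_error_stats, can_retry, tries and retried_command are invented names, not the patch's code; CSTATE_START_COMMAND and CSTATE_END_TX are existing pgbench states):

  /* hypothetical single error state: count the failure, then either
   * retry the failed command or give up on this script run */
  case CSTATE_ERROR:
      accumulate_error_stats(thread, st);
      if (can_retry(st) && st->tries < max_tries)
      {
          st->tries++;                          /* one more try is allowed */
          st->command = st->retried_command;    /* jump back to the failed command */
          st->state = CSTATE_START_COMMAND;
      }
      else
          st->state = CSTATE_END_TX;            /* no tries left: finish the script */
      break;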

We discussed this question in [3] and [4]:

I'm wondering whether the RETRY & FAILURE states could/should be merged:

  on RETRY:
    -> count retry
    -> actually retry if < max_tries (reset client state, jump to
command)
    -> else count failure and skip to end of script

The start and end of transaction detection seem expensive (malloc, ...) and assume one statement per command (what about "BEGIN \; ... \; COMMIT;"?), which is not necessarily the case; this limitation should be documented. ISTM that the space normalization should be avoided, and something simpler/lighter should be devised. Possibly it should consider handling SAVEPOINT.

I divided these states because if there's a failed transaction block you should end it before retrying. This means going through the states CSTATE_START_COMMAND -> CSTATE_WAIT_RESULT -> CSTATE_END_COMMAND with the appropriate command. How do you propose not to go to these states?

Hmmm. Maybe I'm wrong. I'll think about it.

I'm wondering whether the whole feature could be simplified by
considering that one script is one "transaction" (it is from the
report point of view at least), and that any retry is for the full
script only, from its beginning. That would remove the need to guess at transaction begin or end, avoid scanning manually for subcommands, and so on.
 - Would it make sense?
 - Would it be ok for your use case?

I'm not sure this makes sense: if we retry the script from the very beginning, including transactions that were already successful, there may be errors...

The proposed version of the code looks unmaintainable to me. There are
3 levels of nested "switch/case" with state changes at the deepest level.
I cannot even see it on my screen which is not wide enough.

I understand your concern, although the 80-column limit is broken only by long strings passed to printf functions. I will try to simplify this by moving the 2nd and 3rd levels of nested "switch/case" into a separate function.

There should be a typedef for "random_state", eg something like:

  typedef struct { unsigned short data[3]; } RandomState;
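
For instance, such a struct could be passed to the pg_erand48() call that pgbench already uses; this is only an illustration (getrand_rs and the field name are invented, not from the patch):

  typedef struct
  {
      unsigned short data[3];      /* the xseed array expected by pg_erand48() */
  } RandomState;

  /* uniform random int64 in [min, max], mirroring pgbench's getrand() */
  static int64
  getrand_rs(RandomState *rs, int64 min, int64 max)
  {
      return min + (int64) ((max - min + 1) * pg_erand48(rs->data));
  }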

Please keep "const" declarations, eg "commandFailed".

I think that choosing script should depend on the thread random state, not the client random state, so that a run would generate the same pattern per
thread, independently of which client finishes first.

I'm sceptical of the "--debug-fails" options. ISTM that --debug is already there
and should just be reused.

Thanks, I agree with you and I'll fix this.
(Therefore I assume that both the thread state and the client state will have a random state.)
How do you like the idea of creating several levels for debugging?
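
One possible shape for such levels, just to make the question concrete (the names are invented, not taken from the patch):

  typedef enum DebugLevel
  {
      DEBUG_NONE,      /* no debug output (the default) */
      DEBUG_FAILS,     /* report only failures and retries */
      DEBUG_ALL        /* everything the current --debug prints */
  } DebugLevel;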

Also, you have changed some existing unrelated error messages with this option, especially in evalFunc, which is clearly not appropriate:

 -    fprintf(stderr, "random range is too large\n");
 +    if (debug_fails)
 +        fprintf(stderr, "random range is too large\n");

Please leave unrelated code as is.

I suppose this is related code. If we do not change this, a program run that does not use debugging options can receive many error messages without any clients being aborted. Also, it will be very difficult to find the progress reports among all this output.

I agree that the function naming style is already a mess, but I think that new functions you add should use a common style, eg "is_compound" vs "canRetry".

Translating error strings to their enum should be put in a function.

Ok, I will fix it.

I do not believe in the normalize_whitespace function: ISTM that it
would turn  "SELECT LENGTH('    ');" to "SELECT LENGTH(' ');", which
is not desirable. I do not think that it should be needed.
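
For illustration, a lighter check along these lines might avoid rewriting the string (a rough sketch with an invented name; real code would also have to handle comments, quoting and e.g. ROLLBACK TO SAVEPOINT):

  /* does the command (appear to) end a transaction block? */
  static bool
  ends_tx_block(const char *sql)
  {
      while (isspace((unsigned char) *sql))
          sql++;                               /* skip leading whitespace in place */
      return pg_strncasecmp(sql, "END", 3) == 0 ||
             pg_strncasecmp(sql, "COMMIT", 6) == 0 ||
             pg_strncasecmp(sql, "ROLLBACK", 8) == 0;
  }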

This function does not change the commands that are passed to the server; this just simplifies the function is_tx_block_end and makes the latter more understandable.

I do not understand the "get subcommands of a compound command" strange
re-parsing phase. There should be comments explaining what you want to
achieve.

Well, I will write these comments. My reasons are as follows:
1) I need to know the number of non-empty subcommands for processing the command in CSTATE_WAIT_RESULT;
2) I need this parser to correctly distinguish subcommands in cases such as "SELECT ';';" (see the sketch after this list);
3) for any non-empty subcommand I need to know its initial sequence number, which is used to report errors/failures;
4) if the compound command contains only one non-empty subcommand (for example, the compound command ';;END;;'), it can be retried as a usual command;
5) retrying of failures in the compound command depends on which of its subcommands start or complete the transaction block.
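
A sketch of the quote-awareness needed for reason 2 (an illustration only, with an invented name; the real parser additionally records the sequence numbers from reason 3 and the transaction-block commands from reason 5):

  /* count non-empty subcommands, ignoring semicolons inside string literals;
   * dollar quoting, escapes and comments are left out of this sketch */
  static int
  count_subcommands(const char *sql)
  {
      int    n = 0;
      bool   in_string = false;
      bool   seen_nonblank = false;

      for (; *sql; sql++)
      {
          if (*sql == '\'')
          {
              in_string = !in_string;
              seen_nonblank = true;
          }
          else if (*sql == ';' && !in_string)
          {
              if (seen_nonblank)
                  n++;                /* close the current subcommand */
              seen_nonblank = false;
          }
          else if (!isspace((unsigned char) *sql))
              seen_nonblank = true;
      }
      if (seen_nonblank)
          n++;                        /* trailing subcommand without a ';' */
      return n;
  }

With this, "SELECT ';';" counts as one subcommand and ';;END;;' as one non-empty subcommand, matching reasons 2 and 4 above.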

I'm not sure about what happens if several begin/end appears
on the line.

Maybe this section of the documentation [5] will help you?

I'm not sure this whole thing should be done anyway.

Well, I will simplify it so that compound commands will not be retried in any case; but even now the parser is needed for reasons 1, 2 and 3 in the list above ("My reasons are as follows: ...").

About the tests:

The 828-line Perl script seems over-complicated, with pgbench & psql interacting together... Is all that really necessary? Isn't some (much) simpler test possible that would be sufficient?

I wrote to you about this at the beginning of this letter ("3) About the tests. ...").

The "node" is started but never stopped.

Thank you, I will fix it.

For file contents, maybe the << 'EOF' here-document syntax would help instead
of using concatenated backslashed strings everywhere.

Thanks, but I'm not sure that it is better to open filehandles for printing into variables...

[1] https://www.postgresql.org/message-id/e4c5e8cefa4a8e88f1273b0f1ee29e56%40postgrespro.ru
[2] https://www.postgresql.org/message-id/894ac29c18f12a4fd6f8c97c9123a152%40postgrespro.ru
[3] https://www.postgresql.org/message-id/6ce8613f061001105654673506348c13%40postgrespro.ru
[4] https://www.postgresql.org/message-id/alpine.DEB.2.20.1707131818200.20175%40lancre
[5] https://www.postgresql.org/docs/devel/static/protocol-flow.html#PROTOCOL-FLOW-MULTI-STATEMENT

--
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
