Re: 2018-03 Commitfest Summary (Andres #1)

Fabien COELHO Fri, 02 Mar 2018 01:48:07 -0800


Hello Andres & Tom,

 A bit concerned that we're turning pgbench into a kitchen sink.


I do not understand the "kitchen sink" expression in this context, nor your
general concerns about pgbench in various comments in your message.


We're adding a lot of stuff to pgbench that only a few people
use. There's a lot of duplication with similar parts of code in other
parts of the codebase. pgbench in my opinion is a tool to facilitate
postgres development, not a goal in itself.


I disagree.

I think that pgbench should *also* allow testing postgres performance in realistic scenarios that make it possible to communicate about performance and reassure users about their use case, not just a simplified TPC-B.

Even if you wanted to restrict it to internal postgres development, which I would see as a shame, recently added features are still useful.

For instance, I made extensive use of tps throttling and of latency and timeout measures when developing and testing the checkpointer sorting & throttling patch.

Some people are just proposing a new storage engine which changes the cost of basic operations (improves committed transactions, makes rolling back more expensive). What is the actual impact depending on the rollback rate? How do you plan to measure that? Pgbench needs capabilities to be useful there, and the good news is that some recently added ones would come in handy.

It's a bad measure, but the code growth shows my concerns somewhat:
master:        5660 +817
REL_10_STABLE: 4843 +266
REL9_6_STABLE: 4577 +424
REL9_5_STABLE: 4153 +464
REL9_4_STABLE: 3689 +562
REL9_3_STABLE: 3127 +338
REL9_2_STABLE: 2789 +96
REL9_1_STABLE: 2693

A significant part of this growth is the expression engine, which is mostly trivial code, although alas not necessarily devoid of bugs. If moved to fe-utils, the pgbench code footprint will be reduced by about 2000 lines.

Also, code has been removed (eg the fork-based implementation) and there has been significant restructuring which has greatly improved code maintenance, even if the number of lines has possibly increased in passing.

So this setting-variable-from-query patch goes with having boolean
expressions (already committed), having conditions (\if in the queue),
improving the available functions (eg hashes, in the queue)... so that
existing, data-dependent, realistic benchmarks can be implemented, and
benefit from the great performance data collection provided by the tool.


I agree that they're useful in a few cases, but they have to consider
that they need to be reviewed and maintained and the project is quite
resource constrained in that regard.

Currently I do most of the reviewing & maintenance of pgbench, apart from the patches I submit.

I can stop doing both if the project decides that improving pgbench capabilities is against its interest.


Tom said:

FWIW, I share Andres' concern that pgbench is being extended far past
what anyone has shown a need for. If we had infinite resources
this wouldn't be a big problem, but it's eating into limited
committer hours and I'm not really convinced that we're getting
adequate return.

As pgbench patches can stay ready-for-committer for half a dozen CFs, I'm not sure the strain on committer time is that heavy :-) There are not so many of them, and most of them are trivial. If you drop them on the grounds that you do not want them, it will not change anything about the lack of reviewing resources and the project's inability to process the submitted patches, which in my opinion is a wider issue, not related to the few pgbench-related submissions.

On the "adequate return" point, my opinion is that currently pgbench is just below the feature set needed to be generally usable, so not improving it is a self-fulfilling assurance that it will not be used further. Once the "right" feature set is reached (for me, at least extracting query output into variables, having conditionals, possibly a few more functions if some benches use them), whether it would actually be more widely used by both developers and users is an open question.

Now, as I said, if pgbench improvements are not seen as desirable, I can mark submissions as "rejected" and do other things with my little available time than try to contribute to postgres.

- pgbench - test whether a variable exists


As already said, the motivation is that it is a preparation for a (much)
larger patch which would move pgbench expressions to fe-utils and use them
in "psql".


You could submit it together with that.

Sure, I could. My previous experience is that maintaining a set of dependent patches is tiresome, and it does not help much with testing and reviewing either. So I'm doing things one (slow) step at a time, especially as each time I've submitted patches which were doing more than one thing I was asked to disentangle features and restructuring.

But I don't see in the first place why we need to add the feature with duplicate code, just so we can unify.

It is not duplicate code. In psql the variable-exists-test is currently performed on the fly by the lexer. With the expression engine, it needs to be lexed, parsed and finally evaluated, so this is necessarily new code.

We can gain it via the unification, no?

Well, this would be a re-implementation anyway. I'm not sure the old one would disappear completely, because it depends on backslash commands which have different lexing assumptions (eg currently the variable-exists-test is performed from both "psqlscan.l" and "psqlscanslash.l" independently).


--
Fabien.
