Re: [HACKERS] pgbench - allow to store select results into variables

Fabien COELHO Sun, 29 Jan 2017 00:16:51 -0800


<APOLOGY>
  Please pardon the redundancy: this is a slightly edited repost
  from another thread where motivation for this patch was discussed, so
  that it appears in the relevant thread.
</APOLOGY>

Tom> [...] there was immediately objection as to whether his idea of TPC-B
Tom> compliance was actually right.

From my point of view TPC-* are simply objective examples of typical
benchmark requirements to show which features are needed in a tool for
doing this activity. Once features are available, I think that pgbench
should also be a show-case for their usage. Currently a few functions (for
implementing the bench as specified) and actually extracting results into
variables (for suspicious auditors and bench relevance, see below) are
missing.

Tom> I remember complaining that he had a totally artificial idea of what
Tom> "fetching a data value" requires.


Yep.

I think that the key misunderstanding is that you are honest and assume
that other people are honest too. This is naïve: there is a long history
of vendors creatively "cheating" to get better benchmark results than they
deserve. Benchmark specifications try to prevent such behaviors by laying
out careful requirements and procedures.

In this instance, you "know" that when pg has returned the result of the
query the data is actually on the client side, so you consider it
fetched. That is fine for you, but from a benchmarking perspective with
external auditors your belief/knowledge is not good enough.

For instance, the vendor could implement a new version of the protocol
where the data are only transferred on demand, and the result just tells
that the data is indeed somewhere on the server (e.g. on "SELECT abalance"
it could just check that the key exists, no need to actually fetch the
data from the table, so no need to read the table, the index is
enough...). That would be pretty stupid for real application performance,
but the benchmark would get better tps by doing so.

Even without intentionally cheating, this could be part of a useful
"streaming mode" protocol option which makes sense for very large results
but would be activated for a small result.

Another point is that decoding the message may be a little expensive, so
that by not actually extracting the data into the client but just keeping
it in the connection/OS one gets better performance.
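
To make the on-demand argument concrete, here is a minimal Python sketch
of such a hypothetical protocol. Everything in it (ToyServer, LazyResult,
run_driver) is invented for illustration and is not part of libpq, pgbench
or the patch; it only assumes a server that answers a query from the index
alone and reads the table when the client asks for the value.

```python
from dataclasses import dataclass


@dataclass
class ToyServer:
    """Hypothetical server that counts index probes and table reads."""
    accounts: dict
    index_probes: int = 0
    table_reads: int = 0

    def key_exists(self, aid):
        # Cheap path: only the index is consulted.
        self.index_probes += 1
        return aid in self.accounts

    def read_balance(self, aid):
        # Expensive path: the table row is actually read.
        self.table_reads += 1
        return self.accounts[aid]


class LazyResult:
    """Result handle of an assumed on-demand protocol."""

    def __init__(self, server, aid):
        self._server = server
        self._aid = aid
        # The row count is known from the index probe alone.
        self.ntuples = 1 if server.key_exists(aid) else 0

    def getvalue(self):
        # The value is transferred only when the client asks for it.
        return self._server.read_balance(self._aid)


def run_driver(server, aids, retrieve):
    """Run one 'SELECT abalance' per account id.

    With retrieve=False the result is dropped unread, which is the
    analogue of clearing the result without looking at its content.
    """
    balances = []
    for aid in aids:
        res = LazyResult(server, aid)
        if retrieve and res.ntuples == 1:
            balances.append(res.getvalue())
    return balances
```

With retrieve=False the toy server never reads the table, although every
query "succeeded"; only a driver that extracts the value forces the work
to be done, which is what an auditor would want to see.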


Thus, the TPC-B 2.0.0 benchmark specification says:

"1.3.2 Each transaction shall return to the driver the Account_Balance
resulting from successful commit of the transaction.

Comment: It is the intent of this clause that the account balance in the
database be returned to the driver, i.e., that the application retrieve
the account balance."

For me the correct interpretation of "the APPLICATION retrieve the account
balance" is that the client application code, pgbench in this context, did
indeed get the value from the vendor code, here "libpq" which is handling
the connection.

Having the value discarded from libpq by calling PQclear instead of
PQntuples/PQgetvalue/... skips a key part of the client code that no real
application would skip. This looks strange and is not representative of
real client code: as a potential auditor, because of this doubt about the
performance impact and the lack of relevance, I would not check the
corresponding item in the audit check list:


  "11.3.1.2 Verify that transaction inputs and outputs satisfy Clause 1.3."

So the benchmark implementation would not be validated.

Another trivial reason to be able to actually retrieve data is that for
benchmarking purposes it is very easy to want to test a scenario where you
do different things based on data received, which implies that the data can
be manipulated somehow on the benchmarking client side, which is currently
not possible.


--
Fabien.
