Re: [HACKERS] pgbench regression test failure

2017-11-21 Thread Tom Lane
Robert Haas  writes:
> On Tue, Nov 21, 2017 at 3:28 PM, Tom Lane  wrote:
>> or just
>> ... latency limit: 33/33 (100.000 %)

> Oh, yeah.  That last one sounds good; no reason to print the same
> value more than once.

Sold; I'll go make it so.

regards, tom lane



Re: [HACKERS] pgbench regression test failure

2017-11-21 Thread Robert Haas
On Tue, Nov 21, 2017 at 3:28 PM, Tom Lane  wrote:
> Seems like a good idea, but the way you've written it is inconsistent
> with the "n/m" notation used just above.  I'd suggest
>
> ... latency limit: 33 (33/33, 100.000 %)
>
> or just
>
> ... latency limit: 33/33 (100.000 %)

Oh, yeah.  That last one sounds good; no reason to print the same
value more than once.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] pgbench regression test failure

2017-11-21 Thread Tom Lane
Robert Haas  writes:
> On Mon, Nov 20, 2017 at 1:40 PM, Tom Lane  wrote:
>> I dunno, it just looks odd to me that when we've set up a test case in
>> which every one of the transactions is guaranteed to exceed the latency
>> limit, that it doesn't say that they all did.  I don't particularly buy
>> your assumption that the percentages should sum.  Anybody else have an
>> opinion there?

> I agree with you, but I don't think either approach is free from
> possible confusion.  I think it would help to show the numerator and
> the denominator explicitly, e.g.:

> number of clients: 1
> number of threads: 1
> number of transactions per client: 100
> number of transactions actually processed: 33/100
> number of transactions skipped: 67 (67.000 %)
> number of transactions above the 1.0 ms latency limit: 33 (33 of 33, 100.000 %)

> (My proposed change is in the last line.)

Seems like a good idea, but the way you've written it is inconsistent
with the "n/m" notation used just above.  I'd suggest

... latency limit: 33 (33/33, 100.000 %)

or just

... latency limit: 33/33 (100.000 %)

regards, tom lane



Re: [HACKERS] pgbench regression test failure

2017-11-21 Thread Robert Haas
On Mon, Nov 20, 2017 at 1:40 PM, Tom Lane  wrote:
> I dunno, it just looks odd to me that when we've set up a test case in
> which every one of the transactions is guaranteed to exceed the latency
> limit, that it doesn't say that they all did.  I don't particularly buy
> your assumption that the percentages should sum.  Anybody else have an
> opinion there?

I agree with you, but I don't think either approach is free from
possible confusion.  I think it would help to show the numerator and
the denominator explicitly, e.g.:

number of clients: 1
number of threads: 1
number of transactions per client: 100
number of transactions actually processed: 33/100
number of transactions skipped: 67 (67.000 %)
number of transactions above the 1.0 ms latency limit: 33 (33 of 33, 100.000 %)

(My proposed change is in the last line.)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] pgbench regression test failure

2017-11-20 Thread Fabien COELHO


Hello Tom,


>>> 2. ISTM that we should report that 100% of the transactions were
>>> above the latency limit, not 33%; that is, the appropriate base
>>> for the "number of transactions above the latency limit" percentage
>>> is the number of actual transactions not the number of scheduled
>>> transactions.



>> Hmmm. Allow me to disagree.


> I dunno, it just looks odd to me that when we've set up a test case in
> which every one of the transactions is guaranteed to exceed the latency
> limit, that it doesn't say that they all did.  I don't particularly buy
> your assumption that the percentages should sum.


This is a side effect. The reason for me is that the user asked for some 
transactions, and the results should be given relative to what was asked.



> Anybody else have an opinion there?


Good question.


>>> I also noticed that if I specify "-f sleep-100.sql" more than once,
>>> the per-script TPS reports are out of line.  This is evidently because
>>> that calculation isn't excluding skipped xacts; but if we're going to
>>> define tps as excluding skipped xacts, surely we should do so there too.



>> I do not think that we should exclude skipped xacts.


> Uh ... why not?


Because I totally misinterpreted your sentence.

Indeed, the skipped transactions need to be subtracted from the count.
This is yet another bug.


>>> but I find that unduly optimistic.  It should really read more like
>>> "if no transactions were executed, at best we'll get some platform-
>>> dependent spelling of NaN.  At worst we'll get a SIGFPE."



>> Hmmm. Alas, you must be right about the spelling. There has been no report
>> of a SIGFPE issue, so I would not bother with that.


> The core issue here really is that you're assuming IEEE float arithmetic.
> We have not gone as far as deciding that Postgres will only run on IEEE
> hardware, and I don't want to start in pgbench, especially not in
> seldom-exercised corner cases.


Hmmm. It has already been this way for some years without complaint. Do as
you feel about NaN. I can only say that I do not much like having zero stand
for undefined.


--
Fabien.



Re: [HACKERS] pgbench regression test failure

2017-11-20 Thread Tom Lane
Fabien COELHO  writes:
>> 1. The per-script stats shouldn't be printed at all if there's
>> only one script.  They're redundant with the overall stats.

> Indeed.
> I think the output should tend to be the same for possible automatic 
> processing, whether there is one script or more, even at the price of some 
> redundancy.
> Obviously this is highly debatable.

I think that ship has already sailed.  It's certainly silly that the
code prints *only* per-script latency stats, and not all the per-script
stats, when there's just one script.  To me the answer is to make the
latency stats conform to the rest, not make the printout even more
redundant.  None of this output was designed for machine-friendliness.

(Maybe there is an argument for a "--machine-readable-output" switch
that would dump the data in some more-machine-friendly format.  Though
I'm sure we'd have lots of debates about exactly what that is...)

>> 2. ISTM that we should report that 100% of the transactions were
>> above the latency limit, not 33%; that is, the appropriate base
>> for the "number of transactions above the latency limit" percentage
>> is the number of actual transactions not the number of scheduled
>> transactions.

> Hmmm. Allow me to disagree.

I dunno, it just looks odd to me that when we've set up a test case in
which every one of the transactions is guaranteed to exceed the latency
limit, that it doesn't say that they all did.  I don't particularly buy
your assumption that the percentages should sum.  Anybody else have an
opinion there?

>> I also noticed that if I specify "-f sleep-100.sql" more than once,
>> the per-script TPS reports are out of line.  This is evidently because
>> that calculation isn't excluding skipped xacts; but if we're going to
>> define tps as excluding skipped xacts, surely we should do so there too.

> I do not think that we should exclude skipped xacts.

Uh ... why not?

>> I'm also still exceedingly unhappy about the NaN business.
>> You have this comment in printSimpleStats:
>> /* print NaN if no transactions where executed */
>> but I find that unduly optimistic.  It should really read more like
>> "if no transactions were executed, at best we'll get some platform-
>> dependent spelling of NaN.  At worst we'll get a SIGFPE."

> Hmmm. Alas, you must be right about the spelling. There has been no report
> of a SIGFPE issue, so I would not bother with that.

The core issue here really is that you're assuming IEEE float arithmetic.
We have not gone as far as deciding that Postgres will only run on IEEE
hardware, and I don't want to start in pgbench, especially not in
seldom-exercised corner cases.

regards, tom lane



Re: [HACKERS] pgbench regression test failure

2017-11-20 Thread Fabien COELHO


Hello Tom,

Thanks for having a look at this bug fix.


> So we fixed the reported TPS rate, which was nowhere near reality,
> and the per-script stats are sane now.  Good so far, but this
> still has two problems IMO:
>
> 1. The per-script stats shouldn't be printed at all if there's
> only one script.  They're redundant with the overall stats.


Indeed.

I think the output should stay the same whether there is one script or more,
so that it can be processed automatically, even at the price of some
redundancy.


Obviously this is highly debatable.


> 2. ISTM that we should report that 100% of the transactions were
> above the latency limit, not 33%; that is, the appropriate base
> for the "number of transactions above the latency limit" percentage
> is the number of actual transactions not the number of scheduled
> transactions.


Hmmm. Allow me to disagree.

If the user asked for something, then that is the reference to take, and the
whole report should be relative to it. Also, changing this makes the report
figures not add up:


before:
  number of transactions skipped: 417 (90.260 %)
  number of transactions above the 1.0 ms latency limit: 45 (9.740 %)

Both percentages above use the same base, fine. They can be added: 100 % of
the transactions we requested had issues.


after:
  number of transactions skipped: 457 (90.855 %)
  number of transactions above the 1.0 ms latency limit: 46 (100.000 %)

The percentages no longer add up. I do not think that this is an
improvement.



> I also noticed that if I specify "-f sleep-100.sql" more than once,
> the per-script TPS reports are out of line.  This is evidently because
> that calculation isn't excluding skipped xacts; but if we're going to
> define tps as excluding skipped xacts, surely we should do so there too.


I do not think that we should exclude skipped xacts.


> I'm also still exceedingly unhappy about the NaN business.
> You have this comment in printSimpleStats:
> /* print NaN if no transactions where executed */
> but I find that unduly optimistic.  It should really read more like
> "if no transactions were executed, at best we'll get some platform-
> dependent spelling of NaN.  At worst we'll get a SIGFPE."


Hmmm. Alas, you must be right about the spelling. There has been no report
of a SIGFPE issue, so I would not bother with that.


> I do not think that a pedantic argument about NaN being the "correct"
> answer justifies taking such portability risks, even if I bought that
> argument, which I don't particularly.


ISTM that the portability risk, i.e. the SIGFPE issue, would require a
non-IEEE-754-conforming box, and there has been no complaint yet about
that.



> It's also inconsistent with this basic decision in printResults:
>
> /* Remaining stats are nonsensical if we failed to execute any xacts */
> if (total->cnt <= 0)
> return;


According to git blame, this decision was by Tom Lane. It is about the whole
run not executing anything, which may be seen as a special error case,
whereas a progress report with no xact executed within some period can be
perfectly normal.



> If we thought NaN was the "correct" answer for no xacts then we'd just
> bull ahead and print NaNs all over the place.


Why not, but for me the issue wrt the final report is distinct.

> I think we would be best advised to deal with this by printing zeroes in
> the progress-report case, and by skipping the output altogether in
> printSimpleStats.  (Or we could make printSimpleStats print zeroes; I'm
> indifferent to that choice.)


I disagree with printing a zero if the measure is undefined. Zero does not 
mean undefined. "Zero means zero":-)


Maybe we could replace the platform-specific printing of "NaN" with
something else, either a deterministic "NaN" or some other string, e.g. "-"
or "?" or whatever. The benefit of sticking to "NaN", even a
platform-specific spelling, is that tools processing the output can still
parse it as a float.


So basically I would have stayed with the current version, which seems
consistent to me, but I agree that it is debatable. I'm not sure it is worth
being "exceedingly unhappy" about, because these are special corner cases
anyway, not likely to be seen without somehow looking for them or having
pretty strange pgbench runs.



> Attached is an updated patch addressing these points.  Comments?


I somehow disagree with your points above, for various reasons.

However, you are a committer, and I think it best that the bug is fixed even
if I happen to disagree with unrelated changes. As an academic, I probably
have a number of silly ideas about a lot of things. This is part of the job
description :-)


--
Fabien.



Re: [HACKERS] pgbench regression test failure

2017-11-14 Thread Fabien COELHO


Hello Steve,


>> Here is the hopefully right version, which passes tests here.
>
> This version seems fine.
>
> I think it is ready for a committer.


Ok, thanks for the debug!

--
Fabien.



Re: [HACKERS] pgbench regression test failure

2017-11-14 Thread Steve Singer

On Mon, 13 Nov 2017, Fabien COELHO wrote:



> Hello Steve,
>
>>>    printf("number of transactions actually processed: " INT64_FORMAT "/%d\n",
>>> -  total->cnt - total->skipped, nxacts * nclients);
>>> +  total->cnt, nxacts * nclients);
>>
>> I think you want ntx instead of total->cnt here.
>
> Indeed... and this is also what my git branch contains... I just sent the
> wrong version, sorry:-( The same fix is also needed in the else branch.
>
> Here is the hopefully right version, which passes tests here.

This version seems fine.

I think it is ready for a committer.




--
Fabien.





Re: [HACKERS] pgbench regression test failure

2017-11-13 Thread Fabien COELHO


Hello Steve,


>>    printf("number of transactions actually processed: " INT64_FORMAT "/%d\n",
>> -  total->cnt - total->skipped, nxacts * nclients);
>> +  total->cnt, nxacts * nclients);
>
> I think you want ntx instead of total->cnt here.


Indeed... and this is also what my git branch contains... I just sent the 
wrong version, sorry:-( The same fix is also needed in the else branch.


Here is the hopefully right version, which passes tests here.

--
Fabien.

diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index ec56a74..1a5c04c 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -2613,7 +2613,7 @@ processXactStats(TState *thread, CState *st, instr_time *now,
 		doLog(thread, st, agg, skipped, latency, lag);
 
 	/* XXX could use a mutex here, but we choose not to */
-	if (per_script_stats)
+	if (per_script_stats || latency_limit)
 		accumStats(&sql_script[st->use_file].stats, skipped, latency, lag);
 }
 
@@ -3556,11 +3556,14 @@ printResults(TState *threads, StatsData *total, instr_time total_time,
 	double		time_include,
 tps_include,
 tps_exclude;
+	int64		ntx = total->cnt - total->skipped;
 
 	time_include = INSTR_TIME_GET_DOUBLE(total_time);
-	tps_include = total->cnt / time_include;
-	tps_exclude = total->cnt / (time_include -
-(INSTR_TIME_GET_DOUBLE(conn_total_time) / nclients));
+
+	/* tps is about actually executed transactions */
+	tps_include = ntx / time_include;
+	tps_exclude = ntx /
+		(time_include - (INSTR_TIME_GET_DOUBLE(conn_total_time) / nclients));
 
 	/* Report test parameters. */
 	printf("transaction type: %s\n",
@@ -3573,13 +3576,13 @@ printResults(TState *threads, StatsData *total, instr_time total_time,
 	{
 		printf("number of transactions per client: %d\n", nxacts);
 		printf("number of transactions actually processed: " INT64_FORMAT "/%d\n",
-			   total->cnt - total->skipped, nxacts * nclients);
+			   ntx, nxacts * nclients);
 	}
 	else
 	{
 		printf("duration: %d s\n", duration);
 		printf("number of transactions actually processed: " INT64_FORMAT "\n",
-			   total->cnt);
+			   ntx);
 	}
 
 	/* Remaining stats are nonsensical if we failed to execute any xacts */
@@ -4706,7 +4709,8 @@ threadRun(void *arg)
 			{
 /* generate and show report */
 StatsData	cur;
-int64		run = now - last_report;
+int64		run = now - last_report,
+			ntx;
 double		tps,
 			total_run,
 			latency,
@@ -4721,7 +4725,7 @@ threadRun(void *arg)
  * XXX: No locking. There is no guarantee that we get an
  * atomic snapshot of the transaction count and latencies, so
  * these figures can well be off by a small amount. The
- * progress is report's purpose is to give a quick overview of
+ * progress report's purpose is to give a quick overview of
  * how the test is going, so that shouldn't matter too much.
  * (If a read from a 64-bit integer is not atomic, you might
  * get a "torn" read and completely bogus latencies though!)
@@ -4735,15 +4739,14 @@ threadRun(void *arg)
 	cur.skipped += thread[i].stats.skipped;
 }
 
+/* we count only actually executed transactions */
+ntx = (cur.cnt - cur.skipped) - (last.cnt - last.skipped);
total_run = (now - thread_start) / 1000000.0;
-tps = 1000000.0 * (cur.cnt - last.cnt) / run;
-latency = 0.001 * (cur.latency.sum - last.latency.sum) /
-	(cur.cnt - last.cnt);
-sqlat = 1.0 * (cur.latency.sum2 - last.latency.sum2)
-	/ (cur.cnt - last.cnt);
+tps = 1000000.0 * ntx / run;
+latency = 0.001 * (cur.latency.sum - last.latency.sum) / ntx;
+sqlat = 1.0 * (cur.latency.sum2 - last.latency.sum2) / ntx;
stdev = 0.001 * sqrt(sqlat - 1000000.0 * latency * latency);
-lag = 0.001 * (cur.lag.sum - last.lag.sum) /
-	(cur.cnt - last.cnt);
+lag = 0.001 * (cur.lag.sum - last.lag.sum) / ntx;
 
 if (progress_timestamp)
 {
@@ -4760,6 +4763,7 @@ threadRun(void *arg)
 			 (long) tv.tv_sec, (long) (tv.tv_usec / 1000));
 }
 else
	/* round seconds are expected, but the thread may be late */
 	snprintf(tbuf, sizeof(tbuf), "%.1f s", total_run);
 
 fprintf(stderr,