Re: [PATCHES] New features for pgbench
On 2/12/2007 11:43 AM, Tom Lane wrote:
> Greg Smith [EMAIL PROTECTED] writes:
>> Right now when you run pgbench, the results vary considerably from run
>> to run even if you completely rebuild the database every time. I've
>> found that a lot of that variation comes from two things:
>
> This is a real issue, but I think your proposed patch does not fix it.
> A pgbench run will still be penalized according to the number of
> checkpoints or autovacuums that happen while it occurs. Guaranteeing
> that there's at least one is maybe a bit more fair than allowing the
> possibility of having none, but it's hardly a complete fix. Also, this
> approach means that short test runs will have artificially lower TPS
> results than longer ones, because the fixed part of the maintenance
> overhead is amortized over fewer transactions.

Anything that doesn't run exclusively on the server, and isn't given
enough data and enough time to populate the buffer cache similarly on
each run, WILL report more or less random TPS results. Real benchmarks
on considerably sized hardware have ramp-up times measured in hours if
not days, with the sole purpose of populating the cache and thus
smoothing out the transaction response profile.

I think this change is an entirely misleading approach to tackling the
problem at hand.

Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== [EMAIL PROTECTED] #

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at
http://www.postgresql.org/about/donate
Re: [PATCHES] New features for pgbench
On Tue, Feb 13, 2007 at 01:08:04AM -0500, Greg Smith wrote:
> On Tue, 13 Feb 2007, Takayuki Tsunakawa wrote:
>> The Win32 APIs that pgbench is using for gettimeofday() (in
>> src/port/gettimeofday.c) are much lower in resolution than Linux's.
>
> I wasn't aware of this issue, and it certainly makes the whole latency
> side of pgbench pretty useless on Win32. There is code in
> src/include/executor/instrument.h that uses a higher resolution Windows
> timer API than gettimeofday() does (as you point out, that one only
> resolves to one Windows tick, about 15ms). If I can get a Windows build
> environment set up, I'll see if I can borrow that solution for pgbench.

As long as you only need to measure time *differences*, those are pretty
easy to use. Different from Unix, but easy. If you need to keep a
counter that contains actual time it can still be done, but it's a bit
more tricky (not really hard, though).

//Magnus
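The split Magnus describes — a fine-grained counter good only for differences, plus an anchor trick to recover actual time — can be sketched with Python's analogous clock pair (the helper names are mine, not anything in pgbench; `time.perf_counter()` plays the role QueryPerformanceCounter plays on Windows):

```python
import time

def measure_latency(fn):
    """Time one operation with a high-resolution monotonic counter.

    Like QueryPerformanceCounter, perf_counter() has an arbitrary zero
    point, so it is only meaningful for *differences* -- but its
    resolution is far finer than a ~15ms Windows tick.
    """
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start

def wall_clock_with_offset(anchor_wall, anchor_perf):
    """The 'bit more tricky' part: recover actual time from the counter.

    Capture one (wall clock, counter) anchor pair once, then add counter
    deltas to the wall-clock anchor to get real timestamps with the
    counter's resolution.
    """
    return anchor_wall + (time.perf_counter() - anchor_perf)
```

The same two-step pattern applies with the Win32 APIs: one cheap counter read per transaction, and a single anchor captured at startup.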
Re: [PATCHES] New features for pgbench
Greg Smith [EMAIL PROTECTED] writes:
> Right now when you run pgbench, the results vary considerably from run
> to run even if you completely rebuild the database every time. I've
> found that a lot of that variation comes from two things:

This is a real issue, but I think your proposed patch does not fix it.
A pgbench run will still be penalized according to the number of
checkpoints or autovacuums that happen while it occurs. Guaranteeing
that there's at least one is maybe a bit more fair than allowing the
possibility of having none, but it's hardly a complete fix. Also, this
approach means that short test runs will have artificially lower TPS
results than longer ones, because the fixed part of the maintenance
overhead is amortized over fewer transactions.

I believe it's a feature, not a bug, that Postgres shoves a lot of
maintenance out of the main transaction pathways and into background
tasks. That allows us to handle higher peak transaction rates than we
otherwise could. Maybe the right way to think about this issue is to try
to estimate both a peak TPS (what we can achieve when no maintenance
processing is happening) and a long-term average TPS (net throughput
allowing for maintenance processing). I don't have a specific suggestion
about how to modify pgbench to account for this, but I do think we need
something more than a single TPS number if we want to describe the
system's behavior well.

			regards, tom lane
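A minimal sketch of the two-number report Tom suggests, assuming per-second transaction counts are available from a run (nothing like this exists in pgbench today; the function and its name are illustrative only):

```python
def tps_summary(counts_per_second):
    """Summarize a run as (peak TPS, long-term average TPS).

    counts_per_second: transactions completed in each one-second
    interval of the run. The best single interval approximates
    throughput while no checkpoint/autovacuum is running; the long-term
    average folds the maintenance-depressed intervals back in.
    """
    peak = max(counts_per_second)
    average = sum(counts_per_second) / len(counts_per_second)
    return peak, average
```

For a run that hits a checkpoint, e.g. `[100, 98, 40, 97, 101]`, the two numbers diverge sharply, which is exactly the information a single averaged TPS figure hides.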
Re: [PATCHES] New features for pgbench
On Mon, 12 Feb 2007, Tom Lane wrote:
> This is a real issue, but I think your proposed patch does not fix it.

I certainly wouldn't claim that my patch _fixes_ the problem in the
general case; it provides one way to measure it. Currently it's not
obvious to new pgbench users that the problem even exists at all. I feel
it's important to draw attention to the fact that it's something you
should be aware of, even if an automatic resolution to the problem isn't
obvious yet.

In the context I run pgbench in, it is also a workable fix. I don't even
pay attention to pgbench results unless I'm popping 10,000 (desktop) to
100,000 (server) transactions through it. In that context, I believe it
fairly penalizes the transactions for the data they leave behind for
maintenance. I completely agree that people doing short runs shouldn't
use this switch.

Anyway, I like your idea of describing the lower TPS number as including
maintenance; that matches the terminology used within the documentation
better. I will reformat the output to use that term.

Here's what I'm gonna do. The patch I submitted was prepared with the
goal of possibly being implemented in 8.2. I thought a change to
contrib/ that added a feature turned off by default might have a shot at
a backport, and I wanted something people could use on the current
release to be available. Now that I know it's never going into an
official 8.2, I will prepare a slightly different patch aimed at
8.3--incorporating all the feedback I've gotten here as either code
changes or additional documentation--and resubmit in another week or so.
Thanks for the feedback.

--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD
Re: [PATCHES] New features for pgbench
Hi,

> Right now when you run pgbench, the results vary considerably from run
> to run even if you completely rebuild the database every time. I've
> found that a lot of that variation comes from two things:

The main purpose of pgbench runs is an apples-to-apples comparison of
two source bases: one pristine PostgreSQL source base, and the same
source patched with the proposed enhancements. As long as we use the
same postgresql.conf, the same hardware environment, and exactly the
same pgbench parameters, the difference in the TPS values observed
between the two sources should be a good enough indicator as to the
viability of the new code, don't you think?

E.g. autovacuum will trigger on certain tables only if the threshold is
over the limit, so that gets tied to the update rate. The shared_buffers
will become a bottleneck only if the code and the run are I/O intensive
enough, etc. IMHO, as long as the same environment holds true for both
source base runs, we should not see unexplained variations in the
observed TPS values for the reasons you have mentioned.

Regards,
Nikhils
--
EnterpriseDB http://www.enterprisedb.com
Re: [PATCHES] New features for pgbench
NikhilS [EMAIL PROTECTED] writes:
> As long as we use the same postgresql.conf, same hardware environment
> and exactly same parameter pgbench runs, the difference in the TPS
> values observed between the 2 sources should be a good enough
> indicator as to the viability of the new code, dont you think?

pgbench has a long-standing, thoroughly earned reputation for producing
unrepeatable results. While I agree that we shouldn't whack it around
without good cause, there's definitely some problems there, and I think
Greg is on to at least one of 'em. The question is what's the best way
to fix it ...

			regards, tom lane
[PATCHES] New features for pgbench
The attached adds two new command line switches to pgbench:

-x: Generate extended detail in the latency log, including a timestamp
    for each transaction

-X: Do extra cleanup after the run (vacuum on all tables, checkpoint)
    before stopping the clock. This gives substantially more consistency
    in results between runs. Most pgbench results I see people present
    are so short that they're skewed considerably by whether there was a
    checkpoint in the middle of the run. This also allows fairly testing
    situations with various autovacuum settings.

There's an update to the README describing the features, as well as
correcting/extending some of the existing documentation.

I generated the patch from the 8.2.3 release. Since pgbench runs the
same way unless you pass it one of the new flags, I was hoping this
would be considered for the next 8.2 update. I have a series of
additional scripts I'll be releasing shortly that do interesting
analysis of this extended latency data from pgbench (graphs of TPS and
latency, that sort of thing), and I'd hate for that to only be available
on 8.3.

--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

Index: contrib/pgbench/README.pgbench
===================================================================
RCS file: /var/lib/pgsql/cvs/postgresql/contrib/pgbench/README.pgbench,v
retrieving revision 1.1.1.1
diff -u -r1.1.1.1 README.pgbench
--- contrib/pgbench/README.pgbench	11 Feb 2007 23:57:37 -	1.1.1.1
+++ contrib/pgbench/README.pgbench	12 Feb 2007 00:07:35 -
@@ -94,10 +94,14 @@
 		default is 1.  NOTE: scaling factor should be at least
 		as large as the largest number of clients you intend
 		to test; else you'll mostly be measuring update contention.
+		Regular (not initializing) runs using one of the
+		built-in tests will detect scale based on the number of
+		branches in the database.  For custom (-f) runs this can
+		be manually specified.
 
 	-D varname=value
-		Define a variable.  It can be refereed to by a script
-		provided by using -f option.  Multile -D options are allowed.
+		Define a variable.  It can be referred to by a script
+		provided by using -f option.  Multiple -D options are allowed.
 
 	-U login
 		Specify db user's login name if it is different from
@@ -143,6 +147,24 @@
 		where time is measured in microseconds.
 
+	-x
+		If a logfile is being generated with -l, use an
+		extended format for the log that is:
+
+		client_id transaction_no time file_no time-epoch time-us
+
+		where time is measured in microseconds, the file_no is
+		which test file was used (useful when multiple were
+		specified with -f), and time-epoch/time-us are a
+		UNIX epoch format timestamp followed by an offset
+		in microseconds (suitable for creating an ISO 8601
+		timestamp with a fraction of a second) of when
+		the transaction completed.
+
+	-X	Perform extra cleanup after the test has completed
+		(vacuums and a checkpoint) to make test results
+		more uniform
+
 	-d
 		debug option.

Index: contrib/pgbench/pgbench.c
===================================================================
RCS file: /var/lib/pgsql/cvs/postgresql/contrib/pgbench/pgbench.c,v
retrieving revision 1.1.1.1
diff -u -r1.1.1.1 pgbench.c
--- contrib/pgbench/pgbench.c	11 Feb 2007 23:57:37 -	1.1.1.1
+++ contrib/pgbench/pgbench.c	12 Feb 2007 00:07:28 -
@@ -76,6 +76,10 @@
 bool	use_log;		/* log transaction latencies to a file */
 
+bool	extended_log;	/* provide extended detail in latency log */
+
+bool	extra_cleanup;	/* do database cleanup after test is over */
+
 int		remains;		/* number of remaining clients */
 
 int		is_connect;		/* establish connection for each transaction */
@@ -455,7 +459,13 @@
 			diff = (int) (now.tv_sec - st->txn_begin.tv_sec) * 1000000.0 +
 				(int) (now.tv_usec - st->txn_begin.tv_usec);
 
-			fprintf(LOGFILE, "%d %d %.0f\n", st->id, st->cnt, diff);
+			if (extended_log)
+				fprintf(LOGFILE, "%d %d %.0f %d %ld %ld\n",
+					st->id, st->cnt, diff,
+					st->use_file, now.tv_sec, now.tv_usec);
+			else
+				fprintf(LOGFILE, "%d %d %.0f\n",
+					st->id, st->cnt, diff);
 		}
 
 		if
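As a sketch of the kind of analysis script Greg mentions, here is one way a line of the extended latency log could be parsed, with the epoch/microsecond pair turned into the ISO 8601 timestamp the README describes (field order taken from the README above; the helper name and sample values are mine):

```python
from datetime import datetime, timezone

def parse_extended_log_line(line):
    """Parse one line of the -x extended latency log:

        client_id transaction_no time file_no time-epoch time-us

    'time' is the transaction latency in microseconds; the epoch
    seconds plus microsecond offset give the completion timestamp.
    """
    client_id, txn_no, latency_us, file_no, epoch, usec = line.split()
    completed = datetime.fromtimestamp(
        int(epoch), tz=timezone.utc
    ).replace(microsecond=int(usec))
    return {
        "client_id": int(client_id),
        "transaction_no": int(txn_no),
        "latency_us": float(latency_us),
        "file_no": int(file_no),
        "completed_iso": completed.isoformat(),
    }
```

Grouping the parsed records by completion second would then give the per-second TPS series needed for throughput-over-time graphs.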
Re: [PATCHES] New features for pgbench
On Sun, 2007-02-11 at 20:32 -0500, Greg Smith wrote:
> The attached adds two new command line switches to pgbench:

FYI, context diffs are preferred.

> -x: Generate extended detail in the latency log, including a timestamp
>     for each transaction

I wonder if it's worth just making this the default.

> I generated the patch from the 8.2.3 release. Since pgbench runs the
> same way unless you pass it one of the new flags, I was hoping this
> would be considered for the next 8.2 update.

Feature additions are usually severely frowned upon in stable release
branches, but the standard for contrib/ changes is lower, and as you
say, there is no change in behavior if the options aren't used. I'm okay
with backporting it: if no one else objects, I'll apply this in a few
days.

-Neil
Re: [PATCHES] New features for pgbench
Neil Conway [EMAIL PROTECTED] writes:
> On Sun, 2007-02-11 at 20:32 -0500, Greg Smith wrote:
>> -x: Generate extended detail in the latency log, including a
>> timestamp for each transaction
>
> I wonder if it's worth just making this the default.

Does this have any impact on the reported results (by slowing pgbench
itself)? If not, then doing it always would be OK, but I'm not convinced
about that ...

> Feature additions are usually severely frowned upon in stable release
> branches, but the standard for contrib/ changes is lower,

No, it isn't. This is *not* a candidate for back-porting.

			regards, tom lane
Re: [PATCHES] New features for pgbench
On Sun, 2007-02-11 at 23:12 -0500, Tom Lane wrote:
> No, it isn't. This is *not* a candidate for back-porting.

Why is that? It seems to me that the potential downside is essentially
zero. This is a developer-oriented benchmark tool, after all.

-Neil
Re: [PATCHES] New features for pgbench
On Sun, 11 Feb 2007, Tom Lane wrote:
> Does this have any impact on the reported results (by slowing pgbench
> itself)?

I didn't put more code than I had to in the transaction path, to avoid
any slowdown. I didn't convert the timestamp to human readable format or
do anything intensive like that that might impact the pgbench results.
It's just dumping some data that was already sitting there.

There is an extra if statement for each transaction, and a slightly
longer fprintf when running with the extra latency output in place.
That's it. The file gets "%d %d %.0f %d %ld %ld\n" instead of
"%d %d %.0f\n".

The main drawback to logging more as the default is about twice as much
disk I/O for writing the latency log out. That's small change compared
with the WAL/database writes that must be going on to generate that
transaction, and I sure haven't been able to measure any change in
results.

--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD
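To put rough numbers on "about twice as much disk I/O", here is a toy comparison of the bytes emitted per transaction by the two fprintf formats Greg quotes (the sample field values are made up but typical; Python's `%` formatting accepts the same conversion specifiers):

```python
def log_line_sizes(client_id=0, txn_no=1000, latency_us=2341.0,
                   file_no=0, epoch=1171238400, usec=123456):
    """Bytes written per transaction by the basic vs. extended log format.

    Mirrors the two format strings from pgbench's latency log; the
    extended line adds the file number plus a ten-digit epoch and a
    microsecond offset, roughly doubling (or more) the line length.
    """
    basic = "%d %d %.0f\n" % (client_id, txn_no, latency_us)
    extended = "%d %d %.0f %d %ld %ld\n" % (
        client_id, txn_no, latency_us, file_no, epoch, usec)
    return len(basic), len(extended)
```

Either way, a few dozen bytes per transaction is indeed negligible next to the WAL and heap writes the transaction itself generates.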
Re: [PATCHES] New features for pgbench
Hi,

On 2/12/07, Greg Smith [EMAIL PROTECTED] wrote:
> The attached adds two new command line switches to pgbench:
>
> -x: Generate extended detail in the latency log, including a timestamp
>     for each transaction

From your patch I see that it augments the -l flag. IMHO it does not
make sense to add another flag; we could save the if check and log the
extended contents as part of -l itself.

> -X: Do extra cleanup after the run (vacuum on all tables, checkpoint)
>     before stopping the clock. This gives substantially more
>     consistency in results between runs.

I am sorry, but I do not understand the above. If I read it correctly,
are you suggesting that the same database with a prior pgbench run be
used for further pgbench runs? How is it useful? How can one guarantee
consistency of the observed TPS values with this in place?

Regards,
Nikhils
--
EnterpriseDB http://www.enterprisedb.com