Re: [HACKERS] PATCH: pgbench - merging transaction logs

2015-05-02 Thread Fabien COELHO
Hello, The counters are updated when the transaction is finished anyway? Yes, but the thread does not know it's time to write the results until it completes the first transaction after the interval ends ... Let's say the very first query in thread #1 takes a minute for some reason, while

Re: [HACKERS] PATCH: pgbench - merging transaction logs

2015-05-02 Thread Tomas Vondra
Hi, On 05/02/15 15:30, Fabien COELHO wrote: Hello, The counters are updated when the transaction is finished anyway? Yes, but the thread does not know it's time to write the results until it completes the first transaction after the interval ends ... Let's say the very first query in

Re: [HACKERS] PATCH: pgbench - merging transaction logs

2015-05-02 Thread Fabien COELHO
It might be done in the other direction, though - the writer thread might collect current results at the end of the interval. Yep, you can indeed accumulate per thread and sum at the end of the interval, but a lock is still needed if you want exact figures. ISTM that it is what is done for

Re: [HACKERS] PATCH: pgbench - merging transaction logs

2015-03-29 Thread Fabien COELHO
Hello Tomas, The results are as follows: * 1 thread, 33 runs, median tps (average is consistent): - no logging: 22062 - separate logging: 19360 (-12.2%) - merged logging: 19326 (-12.4%, not significantly different from the previous) Interesting. What hardware is this? Dell PowerEdge R720

Re: [HACKERS] PATCH: pgbench - merging transaction logs

2015-03-29 Thread Fabien COELHO
I also implemented a quick and dirty version for a merge log based on sharing a file handle (append mode + sprintf + fputs). I tried the append + per-thread 2KB buffered sprintf + fputs when full, with the same number of runs. The logs are out of order by chunks, the overhead seems higher
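The per-thread buffering idea in this message can be illustrated with a small sketch. It is written in Python rather than pgbench's C, and the class name `ChunkedLogWriter`, the explicit lock, and the `capacity` parameter are assumptions made for illustration, not code from the patch. It shows why such a merged log ends up out of order by chunks: each thread's lines reach the shared stream only when its buffer is flushed.

```python
import io
import threading


class ChunkedLogWriter:
    """Hypothetical per-thread buffer that appends to a shared stream when full."""

    def __init__(self, shared, lock, capacity=2048):
        self.shared = shared      # stream shared by all threads (append-only)
        self.lock = lock          # protects writes to the shared stream
        self.capacity = capacity  # buffer size in characters (2 KB in the message)
        self.pending = []         # lines not yet written to the shared stream
        self.size = 0

    def log(self, line):
        # Flush first if this line would overflow the buffer.
        if self.pending and self.size + len(line) > self.capacity:
            self.flush()
        self.pending.append(line)
        self.size += len(line)

    def flush(self):
        if not self.pending:
            return
        # One locked write per chunk instead of one per transaction.
        with self.lock:
            self.shared.write("".join(self.pending))
        self.pending.clear()
        self.size = 0


def demo():
    """Interleave two writers and return the resulting shared log."""
    shared = io.StringIO()
    lock = threading.Lock()
    first = ChunkedLogWriter(shared, lock)
    second = ChunkedLogWriter(shared, lock)
    first.log("a1\n")
    second.log("b1\n")
    first.log("a2\n")
    first.flush()
    second.flush()
    return shared.getvalue()
```

In `demo()`, the line `b1` was logged before `a2`, yet it lands after it in the shared stream, because whole chunks are appended in flush order.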

Re: [HACKERS] PATCH: pgbench - merging transaction logs

2015-03-28 Thread Tomas Vondra
On 28.3.2015 11:21, Fabien COELHO wrote: Hello Tomas, I do agree that fprintf is not cheap, actually when profiling pgbench it's often the #1 item, but the impact on the measurements is actually quite small. For example with a small database (scale 10) and read-only 30-second runs (single

Re: [HACKERS] PATCH: pgbench - merging transaction logs

2015-03-28 Thread Fabien COELHO
Hello Tomas, I do agree that fprintf is not cheap, actually when profiling pgbench it's often the #1 item, but the impact on the measurements is actually quite small. For example with a small database (scale 10) and read-only 30-second runs (single client), I get this: no logging: 18672

Re: [HACKERS] PATCH: pgbench - merging transaction logs

2015-03-23 Thread didier
Hi, On Sat, Mar 21, 2015 at 8:42 PM, Fabien COELHO coe...@cri.ensmp.fr wrote: Hello Didier, If fprintf takes p = 0.025 (1/40) of the time, then with 2 threads the collision probability would be about 1/40 and the delayed thread would be waiting for half this time on average, so the

Re: [HACKERS] PATCH: pgbench - merging transaction logs

2015-03-23 Thread Andres Freund
On March 23, 2015 8:00:04 PM GMT+01:00, Fabien COELHO coe...@cri.ensmp.fr wrote: Guys, I don't see this theoretical discussion going anywhere. I think it's time to simply implement this and evaluate it on a bigger machine. Sure. I was kind of hoping that someone else would implement it,

Re: [HACKERS] PATCH: pgbench - merging transaction logs

2015-03-23 Thread Andres Freund
Hi, Guys, I don't see this theoretical discussion going anywhere. I think it's time to simply implement this and evaluate it on a bigger machine. It can't take very long to implement simply writing to one file instead of the multiple files as now. The POSIX-guaranteed fprintf locking should

Re: [HACKERS] PATCH: pgbench - merging transaction logs

2015-03-23 Thread Fabien COELHO
Hello, Yes but for a third thread (each on a physical core) it will be 1/40 + 1/40 and so on up to roughly 40/40 for 40 cores. That is why I proposed a formula which depends on the number of threads. [...] But they aren't constant only close. It may or not show up in this case but I've

Re: [HACKERS] PATCH: pgbench - merging transaction logs

2015-03-23 Thread Fabien COELHO
Guys, I don't see this theoretical discussion going anywhere. I think it's time to simply implement this and evaluate it on a bigger machine. Sure. I was kind of hoping that someone else would implement it, because I'm a reviewer on this one, and I do not have the bigger machine at hand

Re: [HACKERS] PATCH: pgbench - merging transaction logs

2015-03-21 Thread Andres Freund
On 2015-03-17 11:50:28 -0400, Robert Haas wrote: On Tue, Mar 17, 2015 at 11:27 AM, Fabien COELHO coe...@cri.ensmp.fr wrote: The fprintf we are talking about occurs at most once per pgbench transaction, possibly much less when aggregation is activated, and this transaction involves networks

Re: [HACKERS] PATCH: pgbench - merging transaction logs

2015-03-21 Thread Fabien COELHO
Hello Tomas, My point is that if there are many threads and tremendous TPS, the *detailed* per-transaction log (aka simple log) is probably a bad choice anyway, and the aggregated version is the way to go. I disagree with this reasoning. Can you provide numbers supporting it? I'm not sure

Re: [HACKERS] PATCH: pgbench - merging transaction logs

2015-03-21 Thread Andres Freund
On 2015-03-21 10:37:05 +0100, Fabien COELHO wrote: Hello Tomas, Let us take this as a worst-case figure and try some maths. If fprintf takes p = 0.025 (1/40) of the time, then with 2 threads the collision probability would be about 1/40 and the delayed thread would be waiting for half

Re: [HACKERS] PATCH: pgbench - merging transaction logs

2015-03-21 Thread didier
Hi, On Sat, Mar 21, 2015 at 10:37 AM, Fabien COELHO coe...@cri.ensmp.fr wrote: no logging: 18672 18792 18667 18518 18613 18547 with logging: 18170 18093 18162 18273 18307 18234 So on average, that's 18634 vs. 18206, i.e. less than 2.5% difference. And with more expensive transactions
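The averages quoted in this message can be re-derived from the six runs listed for each configuration. The sketch below only repeats that arithmetic; the variable and function names are illustrative.

```python
from statistics import mean

# tps figures as quoted in the message above
NO_LOGGING = [18672, 18792, 18667, 18518, 18613, 18547]
WITH_LOGGING = [18170, 18093, 18162, 18273, 18307, 18234]


def logging_overhead(baseline, logged):
    """Return (mean baseline tps, mean logged tps, relative slowdown in percent)."""
    base_avg = mean(baseline)
    logged_avg = mean(logged)
    slowdown_pct = 100.0 * (base_avg - logged_avg) / base_avg
    return base_avg, logged_avg, slowdown_pct


base_avg, logged_avg, slowdown_pct = logging_overhead(NO_LOGGING, WITH_LOGGING)
print(f"{base_avg:.0f} vs {logged_avg:.0f} tps, {slowdown_pct:.2f}% slower with logging")
```

The means truncate to the 18634 and 18206 figures quoted, and the relative difference stays under the 2.5% bound stated in the message.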

Re: [HACKERS] PATCH: pgbench - merging transaction logs

2015-03-21 Thread Fabien COELHO
Hello Andres, With your worst-case figure and some rounding, it seems to look like:

  #threads   collision probability   performance impact
  2          1/40                    1/3200
  4          1/7                     1/533
  8          0.7                     0.01 (about
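The visible rows of this table are consistent with a simple pairwise estimate. That model is an assumption inferred from the numbers, since the message is truncated: the collision probability is taken as the number of thread pairs times p, and the impact as that probability times an average wait of p/2. A sketch under that reading, with p = 1/40 as in the thread:

```python
from fractions import Fraction
from math import comb


def collision_estimate(threads, p=Fraction(1, 40)):
    """Pairwise approximation (assumed model, not taken from the truncated message).

    Returns (collision probability, performance impact), where a delayed thread
    is assumed to wait half of the fprintf time, i.e. p/2, on average.
    """
    pairs = comb(threads, 2)       # number of thread pairs that may collide
    collision = pairs * p          # linear estimate, only meaningful while small
    impact = collision * p / 2
    return collision, impact


for n in (2, 4, 8):
    collision, impact = collision_estimate(n)
    print(n, collision, impact)
```

With exact fractions this gives 1/40 and 1/3200 for 2 threads, 3/20 (about 1/7) and 3/1600 (about 1/533) for 4 threads, and 7/10 and 7/800 (about 0.01) for 8 threads. The linear estimate stops being a probability once it approaches 1, so it should not be extrapolated to large thread counts.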

Re: [HACKERS] PATCH: pgbench - merging transaction logs

2015-03-21 Thread Fabien COELHO
Hello Didier, If fprintf takes p = 0.025 (1/40) of the time, then with 2 threads the collision probability would be about 1/40 and the delayed thread would be waiting for half this time on average, so the performance impact due to fprintf locking would be negligible (1/80 delay occurred in

Re: [HACKERS] PATCH: pgbench - merging transaction logs

2015-03-21 Thread Fabien COELHO
Well, fprintf() doesn't have to acquire the lock for the entirety of its operation - just for the access to the stream buffer. Yep. If it is implemented by appending stuff to the stream as the format is processed, this would still mean the whole time of its operation. Hence preprocessing

Re: [HACKERS] PATCH: pgbench - merging transaction logs

2015-03-20 Thread Tomas Vondra
Hi, On 20.3.2015 13:43, Fabien COELHO wrote: Hello Robert, The fprintf we are talking about occurs at most once per pgbench transaction, possibly much less when aggregation is activated, and this transaction involves network exchanges and possibly disk writes on the server. random()

Re: [HACKERS] PATCH: pgbench - merging transaction logs

2015-03-20 Thread Fabien COELHO
Hello Robert, The fprintf we are talking about occurs at most once per pgbench transaction, possibly much less when aggregation is activated, and this transaction involves network exchanges and possibly disk writes on the server. random() was occurring four times per transaction rather than

Re: [HACKERS] PATCH: pgbench - merging transaction logs

2015-03-17 Thread Robert Haas
On Sun, Mar 15, 2015 at 3:35 PM, Fabien COELHO coe...@cri.ensmp.fr wrote: Firstly, the fact that pgbench produces one file per thread is awkward. I agree, but I think it is due to the multi process thread emulation: if you have real threads, you can do a simple fprintf, possibly with some

Re: [HACKERS] PATCH: pgbench - merging transaction logs

2015-03-17 Thread Robert Haas
On Tue, Mar 17, 2015 at 11:27 AM, Fabien COELHO coe...@cri.ensmp.fr wrote: The fprintf we are talking about occurs at most once per pgbench transaction, possibly much less when aggregation is activated, and this transaction involves network exchanges and possibly disk writes on the server.

Re: [HACKERS] PATCH: pgbench - merging transaction logs

2015-03-17 Thread Fabien COELHO
Hello, I agree, but I think it is due to the multi process thread emulation: if you have real threads, you can do a simple fprintf, possibly with some mutex, and you're done. There is really nothing to do to implement this feature. I think that's probably not a good idea, because fprintf()

Re: [HACKERS] PATCH: pgbench - merging transaction logs

2015-03-15 Thread Fabien COELHO
Sun, 15 Mar 2015 11:22:01 +0100 (CET) Hello Tomas, attached is a patch implementing merging of pgbench logs. These logs are written by each thread, so with N threads you get N files with names pgbench_log.PID pgbench_log.PID.1 ... pgbench_log.PID.N Before analyzing these logs,

Re: [HACKERS] PATCH: pgbench - merging transaction logs

2015-03-15 Thread Fabien COELHO
Firstly, the fact that pgbench produces one file per thread is awkward. I agree, but I think it is due to the multi process thread emulation: if you have real threads, you can do a simple fprintf, possibly with some mutex, and you're done. There is really nothing to do to implement this

Re: [HACKERS] PATCH: pgbench - merging transaction logs

2015-03-15 Thread Tomas Vondra
On 15.3.2015 11:22, Fabien COELHO wrote: I've looked at the patch. Although I think that such a feature is somehow desirable... I have two issues with it: ISTM that (1) it does not belong to pgbench as such (2) even if, the implementation is not right About (1): I think that this

Re: [HACKERS] PATCH: pgbench - merging transaction logs

2015-03-15 Thread Tomas Vondra
On 15.3.2015 20:35, Fabien COELHO wrote: Firstly, the fact that pgbench produces one file per thread is awkward. I agree, but I think it is due to the multi process thread emulation: if you have real threads, you can do a simple fprintf, possibly with some mutex, and you're done. There is

[HACKERS] PATCH: pgbench - merging transaction logs

2015-03-08 Thread Tomas Vondra
Hi there, attached is a patch implementing merging of pgbench logs. These logs are written by each thread, so with N threads you get N files with names pgbench_log.PID pgbench_log.PID.1 ... pgbench_log.PID.N Before analyzing these logs, these files need to be combined. I usually
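For context, combining the per-thread files outside pgbench can be sketched as a k-way merge. This is not the patch itself, which is C code inside pgbench, but a minimal Python illustration under two assumptions: that each log line carries the epoch seconds and microseconds as its fifth and sixth whitespace-separated fields, and that each per-thread file is already in timestamp order. The function names are hypothetical.

```python
import heapq
from contextlib import ExitStack
from typing import Iterable, Iterator, Sequence, TextIO, Tuple


def timestamp_key(line: str) -> Tuple[int, int]:
    """Sort key: (epoch seconds, microseconds), assumed to be fields 5 and 6."""
    fields = line.split()
    return int(fields[4]), int(fields[5])


def merge_logs(streams: Iterable[Iterable[str]]) -> Iterator[str]:
    """Lazily merge per-thread logs, each assumed sorted by timestamp."""
    return heapq.merge(*streams, key=timestamp_key)


def merge_files(paths: Sequence[str], out: TextIO) -> None:
    """Merge the files named in `paths` into the already open stream `out`."""
    with ExitStack() as stack:
        files = [stack.enter_context(open(path)) for path in paths]
        out.writelines(merge_logs(files))
```

A call such as `merge_files(sorted(glob.glob("pgbench_log.*")), sys.stdout)` would cover the file names listed above. If a per-thread file is not guaranteed to be ordered, sorting all lines with `sorted(lines, key=timestamp_key)` is the safer, more memory-hungry alternative.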