On 7/11/13 8:29 AM, KONDO Mitsumasa wrote:
> I use linear combination method for considering about total checkpoint schedule
> which are write phase and fsync phase. V3 patch was considered about only fsync
> phase, V4 patch was considered about write phase and fsync phase, and v5 patch
> was considered about only fsync phase.

Your v5 now looks like my "Self-tuning checkpoint sync spread" series: https://commitfest.postgresql.org/action/patch_view?id=514 which I did after deciding write phase delays didn't help. It looks to me like some, maybe all, of your gain is coming from how any added delays spread out the checkpoints. The "self-tuning" part I aimed at was trying to stay on exactly the same checkpoint end time even with the delays in place. I got that part to work, but the performance gain went away once the schedule was a fair comparison. You are trying to solve a very hard problem.

How long are you running your dbt-2 tests for? I didn't see that listed anywhere.

> ** Average checkpoint duration (sec) (Not include during loading time)
>                       | write_duration | sync_duration | total
> fsync v3-0.7          | 296.6          | 251.8898      | 548.48 | OK
> fsync v3-0.9          | 292.086        | 276.4525      | 568.53 | OK
> fsync v3-0.7_disabled | 303.5706       | 155.6116      | 459.18 | OK
> fsync v4-0.7          | 273.8338       | 355.6224      | 629.45 | OK
> fsync v4-0.9          | 329.0522       | 231.77        | 560.82 | OK

I graphed the total times against the resulting NOTPM values and attached that. I expect transaction rate to increase along with the time between checkpoints, and that's what I see here. The fsync v4-0.7 result is worse than the rest for some reason, but all the rest line up nicely.

Notice how fsync v3-0.7_disabled has the lowest total time between checkpoints, at 459.18. That is why it has the most I/O and therefore runs more slowly than the rest. If you take your fsync v3-0.7_disabled and increase checkpoint_segments and/or checkpoint_timeout until that test is averaging about 550 seconds between checkpoints, NOTPM should also increase. That's interesting to know, but you don't need any change to Postgres for that. That's what always happens when you have fewer checkpoints per run.

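A minimal sketch of checking whether the runs in the quoted table share a checkpoint schedule. The durations are copied from the table; the function names, the 20-second tolerance, and the use of the median as the reference point are assumptions chosen for illustration, not part of any patch.

```python
from statistics import median

# (write_duration, sync_duration) in seconds, copied from the quoted table.
runs = {
    "fsync v3-0.7": (296.6, 251.8898),
    "fsync v3-0.9": (292.086, 276.4525),
    "fsync v3-0.7_disabled": (303.5706, 155.6116),
    "fsync v4-0.7": (273.8338, 355.6224),
    "fsync v4-0.9": (329.0522, 231.77),
}


def checkpoint_totals(runs):
    """Return the total checkpoint duration (write + sync) for each run."""
    return {name: write + sync for name, (write, sync) in runs.items()}


def off_schedule(totals, tolerance=20.0):
    """Return runs whose total differs from the median total by more than
    `tolerance` seconds, mapped to their signed deviation.

    Both the tolerance and the median reference are illustrative assumptions.
    """
    mid = median(totals.values())
    return {
        name: total - mid
        for name, total in totals.items()
        if abs(total - mid) > tolerance
    }


totals = checkpoint_totals(runs)
for name, total in sorted(totals.items(), key=lambda kv: kv[1]):
    print(f"{name:24s}{total:8.2f}")
print("off schedule:", sorted(off_schedule(totals)))
```

With these numbers the two runs flagged are fsync v3-0.7_disabled and fsync v4-0.7, the shortest and longest totals in the table, so NOTPM for those two is not directly comparable with the other three.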
If you get a checkpoint time table like this where the total duration is very close--within +/-20 seconds is the sort of noise I would expect there--at that point I would say you have all your patches on the same checkpoint schedule. And then you can compare the NOTPM numbers usefully. When the checkpoint times are in a large range like 459.18 to 629.45 in this table, as my graph shows, the associated NOTPM numbers are going to be based on that time.

I would recommend taking a snapshot of pg_stat_bgwriter before and after the test runs, and then showing the difference between all of those numbers too. If the test runs for a while--say 30 minutes--the total number of checkpoints should be very close too.

> * Test Server
> Server: HP Proliant DL360 G7
> CPU: Xeon E5640 2.66GHz (1P/4C)
> Memory: 18GB(PC3-10600R-9)
> Disk: 146GB(15k)*4 RAID1+0
> RAID controller: P410i/256MB
> (Add) Set off energy efficient function in BIOS and OS.

Excellent, here I have a DL160 G6 with 2 processors, 72GB of RAM, and that same P410 controller + 4 disks. I've been meaning to get DBT-2 running on there usefully, and your research gives me a reason to do that.

You seem to be in a rush due to the commitfest schedule. I have some bad news for you there. You're not going to see a change here committed in this CF based on where it's at, so you might as well think about the best longer term plan. I would be shocked if anything came out of this in less than 3 months really. That's the shortest amount of time I've ever done something useful in this area. Each useful benchmarking run takes me about 3 days of computer time, so it's not a very fast development cycle.

Even if all of your results were great, we'd need to get someone to duplicate them on another server, and we'd need to make sure they didn't make other workloads worse. DBT-2 is very useful, but no one is going to get a major change to the write logic in the database committed based on one benchmark.
Past changes like this have used both DBT-2 and a large number of pgbench tests to get enough evidence of improvement to commit. I can help with that part when you get to something I haven't tried already. I am very interested in improving this area, it just takes a lot of work to do it.

--
Greg Smith   2ndQuadrant US    g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
<<attachment: NOTPM-Checkpoints.png>>
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers