Re: [HACKERS] 9.2.3 crashes during archive recovery

2013-03-05 Thread KONDO Mitsumasa
Hi, Horiguchi's patch does not seem to record minRecoveryPoint in ReadRecord(); Attempt patch records minRecoveryPoint. [crash recovery - record minRecoveryPoint in control file - archive recovery] I think that this is the original intention of Heikki's patch. I also found a bug in latest

Re: [HACKERS] 9.2.3 crashes during archive recovery

2013-03-07 Thread KONDO Mitsumasa
(2013/03/06 16:50), Heikki Linnakangas wrote: Hi, Horiguchi's patch does not seem to record minRecoveryPoint in ReadRecord(); Attempt patch records minRecoveryPoint. [crash recovery - record minRecoveryPoint in control file - archive recovery] I think that this is the original intention of

Re: [HACKERS] 9.2.3 crashes during archive recovery

2013-03-07 Thread KONDO Mitsumasa
(2013/03/07 19:41), Heikki Linnakangas wrote: On 07.03.2013 10:05, KONDO Mitsumasa wrote: (2013/03/06 16:50), Heikki Linnakangas wrote: Yeah. That fix isn't right, though; XLogPageRead() is supposed to return true on success, and false on error, and the patch makes it return 'true' on error

Re: [HACKERS] Failing start-up archive recovery at Standby mode in PG9.2.4

2013-04-24 Thread KONDO Mitsumasa
Hi, I found a problem: start-up archive recovery fails in standby mode with PG9.2.4 streaming replication. I tested the same case on PG9.2.3, but it did not occur... cp: cannot stat `../arc/00030013': No such file or directory [Standby] 2013-04-22 01:27:25 EDTLOG: 0:

Re: [HACKERS] Failing start-up archive recovery at Standby mode in PG9.2.4

2013-04-26 Thread KONDO Mitsumasa
Hi, I discovered the cause of the problem. I think that Horiguchi's discovery is another problem... The problem is in CreateRestartPoint. In recovery mode, PG should not write WAL records, because PG does not know the latest WAL file location. But in this case, PG (standby) writes a WAL file at RestartPoint in

[HACKERS] Archive Recovery and SR promote command is failed by “contrecord is requested” in ReadRecord()

2013-05-01 Thread KONDO Mitsumasa
Hi, I found that archive recovery and the SR promote command fail with "contrecord is requested by 0/420" in ReadRecord(). I investigated contrecord; it means that a record crosses a page boundary. I think this is not an irregular page, and the next page should be read in this case. But in

Re: [HACKERS] Failing start-up archive recovery at Standby mode in PG9.2.4

2013-05-07 Thread KONDO Mitsumasa
(2013/05/07 22:40), Heikki Linnakangas wrote: On 26.04.2013 11:51, KONDO Mitsumasa wrote: So I fix CreateRestartPoint at branching point of executing MinRecoveryPoint. It seems to fix this problem, but attached patch is plain. I didn't understand this. I committed a fix for the issue where

Re: [HACKERS] 2nd Level Buffer Cache

2011-03-22 Thread KONDO Mitsumasa
://mysql.lamphost.net/sources/doxygen/mysql-5.1/structPgman_1_1Page__entry.html - Song Jiang HP: http://www.ece.eng.wayne.edu/~sjiang/ -- Kondo Mitsumasa NTT Corporation, NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription

[HACKERS] Improvement of checkpoint IO scheduler for stable transaction responses

2013-06-10 Thread KONDO Mitsumasa
Hi, I created a patch that improves the checkpoint IO scheduler for stable transaction responses. * Problem in checkpoint IO schedule in heavy transaction case: Under heavy transaction load, I think the PostgreSQL checkpoint scheduler has two problems, at the start and at the end of a checkpoint.

Re: [HACKERS] Improvement of checkpoint IO scheduler for stable transaction responses

2013-06-14 Thread KONDO Mitsumasa
(2013/06/12 23:07), Robert Haas wrote: On Mon, Jun 10, 2013 at 3:48 PM, Simon Riggs si...@2ndquadrant.com wrote: On 10 June 2013 11:51, KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp wrote: I created a patch that improves the checkpoint IO scheduler for stable transaction responses. Looks

Re: [HACKERS] Improvement of checkpoint IO scheduler for stable transaction responses

2013-06-17 Thread KONDO Mitsumasa
Thank you for your comments and for reviewing my patch! (2013/06/16 23:27), Heikki Linnakangas wrote: On 10.06.2013 13:51, KONDO Mitsumasa wrote: I created a patch that improves the checkpoint IO scheduler for stable transaction responses. * Problem in checkpoint IO schedule in heavy

Re: [HACKERS] Improvement of checkpoint IO scheduler for stable transaction responses

2013-06-17 Thread KONDO Mitsumasa
(2013/06/17 5:48), Andres Freund wrote: On 2013-06-16 17:27:56 +0300, Heikki Linnakangas wrote: If we don't mind scanning the buffer cache several times, we don't necessarily even need to sort the writes for that. Just scan the buffer cache for all buffers belonging to relation A, then fsync

Re: [HACKERS] [PATCH] add --progress option to pgbench (submission 3)

2013-06-21 Thread KONDO Mitsumasa
Hi Fabien, I am sending you my review result and a refactoring patch. I think that your patch has a good function and many people will surely want to use it! I hope that my review comments will be good for your patch. * 1. Complete words and variable in source code and sgml document. It is readable for user

Re: [HACKERS] Improvement of checkpoint IO scheduler for stable transaction responses

2013-06-21 Thread KONDO Mitsumasa
Hi, I took results of my separate patches and original PG. * Result of DBT-2
             | TPS      90%tile    Average  Maximum
-------------+----------------------------------------
original_0.7 | 3474.62  18.348328  5.739    36.977713
original_1.0 | 3469.03  18.637865  5.842    41.754421

Re: [HACKERS] Improvement of checkpoint IO scheduler for stable transaction responses

2013-06-26 Thread KONDO Mitsumasa
Thank you for comments! On Tue, Jun 25, 2013 at 1:15 PM, Heikki Linnakangas Hmm, so the write patch doesn't do much, but the fsync patch makes the response times somewhat smoother. I'd suggest that we drop the write patch for now, and focus on the fsyncs. Write patch is effective in TPS! I

Re: [HACKERS] [PATCH] add --progress option to pgbench (submission 3)

2013-06-26 Thread KONDO Mitsumasa
Hello Fabien, Thank you for your fast work and reply. I will try to test your new patch by next week. (2013/06/26 20:16), Fabien COELHO wrote: Here is a v4 that takes into account most of your points: The report is performed for all threads by thread 0, however --progress is not supported

Re: [HACKERS] Improvement of checkpoint IO scheduler for stable transaction responses

2013-06-26 Thread KONDO Mitsumasa
(2013/06/26 20:15), Heikki Linnakangas wrote: On 26.06.2013 11:37, KONDO Mitsumasa wrote: On Tue, Jun 25, 2013 at 1:15 PM, Heikki Linnakangas Hmm, so the write patch doesn't do much, but the fsync patch makes the response times somewhat smoother. I'd suggest that we drop the write patch

Re: [HACKERS] [PATCH] add --progress option to pgbench (submission 3)

2013-06-27 Thread KONDO Mitsumasa
Dear Fabien, (2013/06/27 14:39), Fabien COELHO wrote: If I show a latency at full load, that would be nclients/tps, not 1/tps. However, I'm hoping to pass the throttling patch to pgbench, in which case the latency to show is a little bit different because the nclients/tps would include sleep
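
As a rough illustration of the quoted point (a sketch only, not code from pgbench or from the patch under review): if every client is busy all the time, each client completes tps/nclients transactions per second, so the average latency per transaction is nclients/tps rather than 1/tps. The function name and the numbers below are hypothetical.

```python
def full_load_latency_ms(nclients: int, tps: float) -> float:
    """Average per-transaction latency in milliseconds, assuming all
    clients are continuously busy (no throttling sleep, no think time)."""
    if nclients <= 0 or tps <= 0:
        raise ValueError("nclients and tps must be positive")
    return 1000.0 * nclients / tps


# Illustrative numbers only: 8 clients sustaining 400 tps in total.
print(full_load_latency_ms(8, 400.0))  # 20.0 ms per transaction
print(1000.0 / 400.0)                  # 2.5 ms, the misleading 1/tps figure
```

With throttling, as the quoted text notes, the time spent sleeping would have to be excluded before this figure reflects server-side latency.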

Re: [HACKERS] Improvement of checkpoint IO scheduler for stable transaction responses

2013-06-28 Thread KONDO Mitsumasa
(2013/06/28 0:08), Robert Haas wrote: On Tue, Jun 25, 2013 at 4:28 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: I'm pretty sure Greg Smith tried it the fixed-sleep thing before and it didn't work that well. I have also tried it and the resulting behavior was unimpressive. It makes

Re: [HACKERS] [PATCH] add --progress option to pgbench (submission 3)

2013-07-01 Thread KONDO Mitsumasa
(2013/06/28 3:17), Fabien COELHO wrote: Attached is patch version 5. It includes this solution for fork emulation, one report per thread instead of a global report. Some code duplication for that. It's good coding. I tested with the configure option --disable-thread-safety and without it. My test results

Re: [HACKERS] [PATCH] add --progress option to pgbench (submission 3)

2013-07-01 Thread KONDO Mitsumasa
Hi Fabien, Thanks for your fast response and fix! I have set your patch to ready for committer now. (2013/07/01 19:49), Fabien COELHO wrote: I have small comments. I think that 'lat' is not generally abbreviation of 'latency'. But I don't know good abbreviation. If you have any good abbreviation,

Re: [HACKERS] Improvement of checkpoint IO scheduler for stable transaction responses

2013-07-03 Thread KONDO Mitsumasa
Hi, I tested with segsize changed to 0.25GB (./configure --with-segsize=0.25). segsize is the maximum size of a table segment file, and the default setting is 1GB. I thought that a small segsize would be good for the fsync phase and for background disk writes by the OS during a checkpoint. I got significant

Re: [HACKERS] Improvement of checkpoint IO scheduler for stable transaction responses

2013-07-04 Thread KONDO Mitsumasa
(2013/07/03 22:31), Robert Haas wrote: On Wed, Jul 3, 2013 at 4:18 AM, KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp wrote: I tested and changed segsize=0.25GB which is max partitioned table file size and default setting is 1GB in configure option (./configure --with-segsize=0.25). Because I

Re: [HACKERS] Improvement of checkpoint IO scheduler for stable transaction responses

2013-07-05 Thread KONDO Mitsumasa
(2013/07/05 0:35), Joshua D. Drake wrote: On 07/04/2013 06:05 AM, Andres Freund wrote: Presumably the smaller segsize is better because we don't completely stall the system by submitting up to 1GB of io at once. So, if we were to do it in 32MB chunks and then do a final fsync() afterwards we

Re: [HACKERS] Improvement of checkpoint IO scheduler for stable transaction responses

2013-07-08 Thread KONDO Mitsumasa
I created the fsync v2 patch. There's not much time, so I will try to focus on the fsync patch in this commitfest, as advised by Heikki. And I'm sorry for diverging from the main discussion in this commitfest... Of course, I will continue to try other improvements. * Changes - Add ckpt_flag

Re: [HACKERS] Improvement of checkpoint IO scheduler for stable transaction responses

2013-07-11 Thread KONDO Mitsumasa
Hi, I created fsync v3, v4, and v5 patches and tested them. * Changes - Consider the total checkpoint schedule in the fsync phase (v3, v4, v5) - Consider the total checkpoint schedule in the write phase (v4 only) - Modify some implementations from v3 (v5 only) I use linear combination

Re: [HACKERS] Improvement of checkpoint IO scheduler for stable transaction responses

2013-07-19 Thread KONDO Mitsumasa
(2013/07/19 0:41), Greg Smith wrote: On 7/18/13 11:04 AM, Robert Haas wrote: On a system where fsync is sometimes very very slow, that might result in the checkpoint overrunning its time budget - but SO WHAT? Checkpoints provide a boundary on recovery time. That is their only purpose. You

Re: [HACKERS] Improvement of checkpoint IO scheduler for stable transaction responses

2013-07-22 Thread KONDO Mitsumasa
(2013/07/19 22:48), Greg Smith wrote: On 7/19/13 3:53 AM, KONDO Mitsumasa wrote: Recently, users who consider system availability important use a synchronous replication cluster. If your argument for why it's OK to ignore bounding crash recovery on the master is that it's possible to failover

Re: [HACKERS] Improvement of checkpoint IO scheduler for stable transaction responses

2013-07-22 Thread KONDO Mitsumasa
(2013/07/21 4:37), Heikki Linnakangas wrote: Mitsumasa-san, since you have the test rig ready, could you try the attached patch please? It scans the buffer cache several times, writing out all the dirty buffers for segment A first, then fsyncs it, then all dirty buffers for segment B, and so on.
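
The ordering described in the quoted message (write all dirty buffers of one segment, fsync that segment, then move on to the next) can be sketched as follows. This is a simplified illustration, not Heikki's patch: it groups an in-memory list by segment instead of scanning a buffer cache several times, and the function name, tuple layout, and file names are hypothetical.

```python
import os
import tempfile
from collections import defaultdict


def checkpoint_by_segment(dirty_buffers, directory):
    """Write and fsync dirty buffers one segment file at a time.

    dirty_buffers: iterable of (segment_name, offset, data_bytes).
    Returns the segment names in the order they were synced.
    """
    by_segment = defaultdict(list)
    for segment, offset, data in dirty_buffers:
        by_segment[segment].append((offset, data))

    synced = []
    for segment in sorted(by_segment):
        path = os.path.join(directory, segment)
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        try:
            # Write every dirty buffer of this segment first ...
            for offset, data in sorted(by_segment[segment]):
                os.pwrite(fd, data, offset)
            # ... then fsync it before touching the next segment.
            os.fsync(fd)
        finally:
            os.close(fd)
        synced.append(segment)
    return synced


# Usage with throwaway files; the buffer contents are illustrative only.
with tempfile.TemporaryDirectory() as tmpdir:
    buffers = [("seg.1", 0, b"bbbb"), ("seg.0", 4, b"cc"), ("seg.0", 0, b"aaaa")]
    print(checkpoint_by_segment(buffers, tmpdir))
```

Syncing per segment keeps the amount of unsynced data bounded by one segment at a time, at the cost of a slower write phase.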

Re: [HACKERS] Improvement of checkpoint IO scheduler for stable transaction responses

2013-07-25 Thread KONDO Mitsumasa
Hi, By running Heikki's patch, I now understand why my patch is faster than the original. His patch executes write() and fsync() on each relation file in the write phase of the checkpoint. Therefore, I expected that the write phase would be slow and the fsync phase would be fast. Because disk-write had executed

Re: [HACKERS] Design proposal: fsync absorb linear slider

2013-07-29 Thread KONDO Mitsumasa
(2013/07/24 1:13), Greg Smith wrote: On 7/23/13 10:56 AM, Robert Haas wrote: On Mon, Jul 22, 2013 at 11:48 PM, Greg Smith g...@2ndquadrant.com wrote: We know that a 1GB relation segment can take a really long time to write out. That could include up to 128 changed 8K pages, and we allow all

Re: [HACKERS] inconsistent state after crash recovery

2013-07-29 Thread KONDO Mitsumasa
Hi Satoshi, I was wondering about this problem. Please tell us about your system environment: PostgreSQL version, OS, RAID card, and file system. Best regards, -- Mitsumasa KONDO NTT Open Source Software Center

Re: [HACKERS] [GENERAL] Bottlenecks with large number of relation segment files

2013-08-05 Thread KONDO Mitsumasa
Hi Amit, (2013/08/05 15:23), Amit Langote wrote: May the routines in fd.c become bottleneck with a large number of concurrent connections to above database, say something like pgbench -j 8 -c 128? Is there any other place I should be paying attention to? What kind of file system did you use?

Re: [HACKERS] [GENERAL] Bottlenecks with large number of relation segment files

2013-08-05 Thread KONDO Mitsumasa
(2013/08/05 17:14), Amit Langote wrote: So, within the limits of max_files_per_process, the routines of file.c should not become a bottleneck? It may not become a bottleneck. One FD consumes 160 bytes on a 64-bit system; see the Linux manual page for epoll. Regards, -- Mitsumasa KONDO NTT Open Source Software
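
To put the 160-byte figure in perspective, here is a back-of-the-envelope sketch. It takes the per-FD cost from the message at face value (the epoll manual gives it for descriptors registered with epoll, so treating it as a general per-FD cost is an approximation), and it assumes 128 connections as in the pgbench example from this thread and the default max_files_per_process of 1000. The function name is hypothetical.

```python
BYTES_PER_FD_64BIT = 160  # figure quoted in the message, attributed to epoll(7)


def fd_memory_bytes(connections: int, files_per_process: int,
                    bytes_per_fd: int = BYTES_PER_FD_64BIT) -> int:
    """Upper-bound estimate: every backend holds its full quota of open files."""
    return connections * files_per_process * bytes_per_fd


total = fd_memory_bytes(connections=128, files_per_process=1000)
print(f"{total} bytes (~{total / 2**20:.1f} MiB)")  # 20480000 bytes (~19.5 MiB)
```

Under these assumptions the memory cost is small; the global limit on open file handles raised later in the thread is a separate concern that this estimate does not cover.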

Re: [HACKERS] [GENERAL] Bottlenecks with large number of relation segment files

2013-08-06 Thread KONDO Mitsumasa
(2013/08/05 19:28), Andres Freund wrote: On 2013-08-05 18:40:10 +0900, KONDO Mitsumasa wrote: (2013/08/05 17:14), Amit Langote wrote: So, within the limits of max_files_per_process, the routines of file.c should not become a bottleneck? It may not become bottleneck. 1 FD consumes 160 byte

Re: [HACKERS] [GENERAL] Bottlenecks with large number of relation segment files

2013-08-06 Thread KONDO Mitsumasa
(2013/08/05 21:23), Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: ... Also, there are global limits to the amount of filehandles that can simultaneously opened on a system. Yeah. Raising max_files_per_process puts you at serious risk that everything else on the box will

Re: [HACKERS] [GENERAL] Bottlenecks with large number of relation segment files

2013-08-06 Thread KONDO Mitsumasa
(2013/08/06 19:33), Andres Freund wrote: On 2013-08-06 19:19:41 +0900, KONDO Mitsumasa wrote: (2013/08/05 21:23), Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: ... Also, there are global limits to the amount of filehandles that can simultaneously opened on a system. Yeah

Re: [HACKERS] Compression of full-page-writes

2013-08-29 Thread KONDO Mitsumasa
(2013/08/30 11:55), Fujii Masao wrote: * Benchmark pgbench -c 32 -j 4 -T 900 -M prepared scaling factor: 100 checkpoint_segments = 1024 checkpoint_timeout = 5min (every checkpoint during benchmark were triggered by checkpoint_timeout) Did you execute a manual checkpoint before

[HACKERS] Add pgbench option: CHECKPOINT before starting benchmark

2013-08-30 Thread KONDO Mitsumasa
Hi, I added a checkpoint option to pgbench. pgbench is a simple and useful benchmark for every user. However, benchmark results change greatly depending on conditions such as whether a checkpoint is executing, the number of dirty buffers in shared_buffers, and so on. For such a problem, it is custom to carry out a

Re: [HACKERS] 9.4 regression

2013-09-05 Thread KONDO Mitsumasa
(2013/09/05 0:04), Andres Freund wrote: I'd vote for adding zeroing *after* the fallocate() first. +1, with the FALLOC_FL_KEEP_SIZE flag. At least, fallocate with the FALLOC_FL_KEEP_SIZE flag was faster than nothing in the sorted checkpoint I am developing. I applied it to relation files, so I don't know

Re: [HACKERS] gaussian distribution pgbench

2013-09-29 Thread KONDO Mitsumasa
Sorry for my delayed reply. Since I was on vacation last week, I replied from gmail. However, the post to pgsql-hackers was stalled :-( (2013/09/21 6:05), Kevin Grittner wrote: You had accidentally added to the CF In Progress. Oh, I had completely mistaken this CF schedule :-) Maybe, Horiguchi-san

Re: [HACKERS] gaussian distribution pgbench

2013-09-29 Thread KONDO Mitsumasa
Sorry for my delayed reply. Since I was on vacation last week, I replied from gmail. However, the post to pgsql-hackers was stalled :-( (2013/09/21 7:54), Fabien COELHO wrote: However this pattern induces stronger cache effects which are maybe not too realistic, because neighboring keys in the

Re: [HACKERS] gaussian distribution pgbench

2013-09-29 Thread KONDO Mitsumasa
(2013/09/27 5:29), Peter Eisentraut wrote: This patch no longer applies. I will try to create this patch in the next commitfest. If you have a nice idea, please send it to me! Regards, -- Mitsumasa KONDO NTT Open Source Software Center

Re: [HACKERS] Compression of full-page-writes

2013-09-29 Thread KONDO Mitsumasa
Hi Fujii-san, (2013/09/30 12:49), Fujii Masao wrote: On second thought, the patch could compress WAL very much because I used pgbench. I will do the same measurement by using another benchmark. If you like, I can test this patch with the DBT-2 benchmark at the end of this week. I will use under

Re: [HACKERS] Compression of full-page-writes

2013-09-30 Thread KONDO Mitsumasa
(2013/09/30 13:55), Amit Kapila wrote: On Mon, Sep 30, 2013 at 10:04 AM, Fujii Masao masao.fu...@gmail.com wrote: Yep, please! It's really helpful! OK! I will test with a single instance and with a synchronous replication configuration. By the way, you posted patch which is sync_file_range() WAL writing

[HACKERS] Who is pgFoundery administrator?

2013-10-02 Thread KONDO Mitsumasa
Hi, I want to submit a new project to pgFoundery. I submitted a new project, a WAL archive copy tool using the directIO method, on the pgFoundery homepage 2 weeks ago, but it has not been approved or responded to at all :-( Who is pgFoundery administrator or board member now? I would like to send

Re: [HACKERS] Who is pgFoundery administrator?

2013-10-02 Thread KONDO Mitsumasa
(2013/10/02 17:37), KONDO Mitsumasa wrote: I want to submit new project in pgFoundery project. Our new project was approved yesterday! Thanks very much to the pgFoundery crew. Regards, -- Mitsumasa KONDO NTT Open Source Software Center

Re: [HACKERS] Compression of full-page-writes

2013-10-08 Thread KONDO Mitsumasa
(2013/10/08 17:33), Haribabu kommi wrote: The checkpoint_timeout and checkpoint_segments are increased to make sure no checkpoint happens during the test run. With your setting of checkpoint_segments = 256, checkpoints can still occur easily. I don't know number of disks in your test server, in my test

Re: [HACKERS] Compression of full-page-writes

2013-10-08 Thread KONDO Mitsumasa
Hi, I tested the dbt-2 benchmark with a single instance and with synchronous replication. Unfortunately, my benchmark results did not show many differences... * Test server Server: HP Proliant DL360 G7 CPU: Xeon E5640 2.66GHz (1P/4C) Memory: 18GB(PC3-10600R-9) Disk: 146GB(15k)*4 RAID1+0

Re: [HACKERS] Compression of full-page-writes

2013-10-08 Thread KONDO Mitsumasa
(2013/10/08 20:13), Haribabu kommi wrote: I chosen the sync_commit=off mode because it generates more tps, thus it increases the volume of WAL. I had not thought of that. Sorry... I will test with sync_commit=on mode and provide the test results. OK. Thanks! -- Mitsumasa KONDO NTT Open

Re: [HACKERS] Compression of full-page-writes

2013-10-14 Thread KONDO Mitsumasa
(2013/10/13 0:14), Amit Kapila wrote: On Fri, Oct 11, 2013 at 10:36 PM, Andres Freund and...@2ndquadrant.com wrote: But maybe pglz is just not a good fit for this, it really isn't a very good algorithm in this day and age. +1. This compression algorithm needs to be faster than pglz which is

Re: [HACKERS] Release note fix for timeline item

2013-10-14 Thread KONDO Mitsumasa
Sorry for my late reply... (2013/10/08 23:26), Bruce Momjian wrote: First, I want to apologize for not completing the release notes earlier so that others could review them. I started working on the release notes on Friday, but my unfamiliarity with the process and fear of making a mistake

Re: [HACKERS] Compression of full-page-writes

2013-10-15 Thread KONDO Mitsumasa
(2013/10/15 13:33), Amit Kapila wrote: Snappy is good mainly for un-compressible data, see the link below: http://www.postgresql.org/message-id/CAAZKuFZCOCHsswQM60ioDO_hk12tA7OG3YcJA8v=4yebmoa...@mail.gmail.com This result was obtained on the ARM architecture, which is not a typical CPU. Please see detail

Re: [HACKERS] Compression of full-page-writes

2013-10-15 Thread KONDO Mitsumasa
(2013/10/15 22:01), k...@rice.edu wrote: Google's lz4 is also a very nice algorithm with 33% better compression performance than snappy and 2X the decompression performance in some benchmarks also with a bsd license: https://code.google.com/p/lz4/ If we judge only performance, we will select

[HACKERS] Add min and max execute statement time in pg_stat_statement

2013-10-18 Thread KONDO Mitsumasa
I will submit a patch adding min and max execute statement time in pg_stat_statement in the next CF. pg_stat_statement has execution time, but it is only the average execution time and does not provide much detailed information. So I add min and max execute statement time in pg_stat_statement columns. Usage is

[HACKERS] Improvement of pg_stat_statement usage about buffer hit ratio

2013-10-18 Thread KONDO Mitsumasa
Hi, I am submitting a patch that improves pg_stat_statement usage in CF3. In pg_stat_statement, I think the buffer hit ratio is a very important value. However, it is difficult to calculate, and it needs complicated SQL. This patch simplifies both the usage and the documentation. -bench=# SELECT query,
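
For reference, the calculation the message calls difficult can be sketched from the raw counters. This is an assumption about the usual definition of a hit ratio, built on the existing pg_stat_statements columns shared_blks_hit and shared_blks_read; the snippet above does not show the patch's own formula or column name, and the function name here is hypothetical.

```python
from typing import Optional


def buffer_hit_ratio(shared_blks_hit: int, shared_blks_read: int) -> Optional[float]:
    """Percentage of shared-buffer accesses served without a read.

    Returns None when the statement touched no shared blocks, which is
    the division-by-zero case that complicates the equivalent SQL.
    """
    total = shared_blks_hit + shared_blks_read
    if total == 0:
        return None
    return 100.0 * shared_blks_hit / total


# Illustrative counters only: 950 hits and 50 reads.
print(buffer_hit_ratio(950, 50))  # 95.0
print(buffer_hit_ratio(0, 0))     # None
```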

Re: [HACKERS] Who is pgFoundery administrator?

2013-10-21 Thread KONDO Mitsumasa
(2013/10/02 18:57), Michael Paquier wrote: kondo.mitsum...@lab.ntt.co.jp wrote: Who is pgFoundery administrator or board member now? I would like to send e-mail them. At least, it does not have information and support page in pgFoundery homepage. Why don't you consider github as a potential

Re: [HACKERS] Compression of full-page-writes

2013-10-21 Thread KONDO Mitsumasa
(2013/10/19 14:58), Amit Kapila wrote: On Tue, Oct 15, 2013 at 11:41 AM, KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp wrote: I think in general also snappy is mostly preferred for it's low CPU usage not for compression, but overall my vote is also for snappy. I think low CPU usage

Re: [HACKERS] Add min and max execute statement time in pg_stat_statement

2013-10-21 Thread KONDO Mitsumasa
(2013/10/18 22:21), Andrew Dunstan wrote: If we're going to extend pg_stat_statements, even more than min and max I'd like to see the standard deviation in execution time. OK, I will do it! I am making some other patches, so please wait a little longer! Regards, -- Mitsumasa KONDO NTT Open Source Software Center

Re: [HACKERS] Compression of full-page-writes

2013-10-22 Thread KONDO Mitsumasa
(2013/10/22 12:52), Fujii Masao wrote: On Tue, Oct 22, 2013 at 12:47 PM, Amit Kapila amit.kapil...@gmail.com wrote: On Mon, Oct 21, 2013 at 4:40 PM, KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp wrote: (2013/10/19 14:58), Amit Kapila wrote: On Tue, Oct 15, 2013 at 11:41 AM, KONDO Mitsumasa

Re: [HACKERS] Add min and max execute statement time in pg_stat_statement

2013-10-22 Thread KONDO Mitsumasa
Hi All, (2013/10/22 22:26), Stephen Frost wrote: * Dimitri Fontaine (dimi...@2ndquadrant.fr) wrote: In our case, what I keep experiencing with tuning queries is that we have like 99% of them running under acceptable threshold and 1% of them taking more and more time. This is usually

Re: [HACKERS] Add min and max execute statement time in pg_stat_statement

2013-11-14 Thread KONDO Mitsumasa
(2013/10/21 20:17), KONDO Mitsumasa wrote: (2013/10/18 22:21), Andrew Dunstan wrote: If we're going to extend pg_stat_statements, even more than min and max I'd like to see the standard deviation in execution time. OK. I do! I am making some other patches, please wait more! I add stddev_time
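
One common way to keep min, max, and standard deviation per statement without storing every sample is Welford's online algorithm. The sketch below is an illustration of that technique only; the snippet does not show how the patch computes stddev_time, and the class and attribute names are hypothetical.

```python
import math


class ExecTimeStats:
    """Running min, max, mean, and population stddev of execution times."""

    def __init__(self) -> None:
        self.calls = 0
        self.min_time = math.inf
        self.max_time = -math.inf
        self.mean = 0.0
        self._m2 = 0.0  # sum of squared deviations from the running mean

    def add(self, elapsed: float) -> None:
        self.calls += 1
        self.min_time = min(self.min_time, elapsed)
        self.max_time = max(self.max_time, elapsed)
        delta = elapsed - self.mean
        self.mean += delta / self.calls
        self._m2 += delta * (elapsed - self.mean)

    @property
    def stddev(self) -> float:
        if self.calls == 0:
            return 0.0
        return math.sqrt(self._m2 / self.calls)


stats = ExecTimeStats()
for t in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:  # illustrative timings
    stats.add(t)
print(stats.min_time, stats.max_time, stats.mean, stats.stddev)
```

Each update touches a fixed number of fields, which matters for the locking overhead discussed elsewhere in this thread.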

Re: [HACKERS] Add min and max execute statement time in pg_stat_statement

2013-11-14 Thread KONDO Mitsumasa
Oh! Sorry... I forgot to attach my latest patch. Regards, -- Mitsumasa KONDO NTT Open Source Software Center diff --git a/contrib/pg_stat_statements/pg_stat_statements--1.1--1.2.sql b/contrib/pg_stat_statements/pg_stat_statements--1.1--1.2.sql new file mode 100644 index 000..929d623 ---

Re: [HACKERS] Optimize kernel readahead using buffer access strategy

2013-11-14 Thread KONDO Mitsumasa
Hi Claudio, (2013/11/14 22:53), Claudio Freire wrote: On Thu, Nov 14, 2013 at 9:09 AM, KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp wrote: I create a patch that is improvement of disk-read and OS file caches. It can optimize kernel readahead parameter using buffer access strategy

Re: [HACKERS] Optimize kernel readahead using buffer access strategy

2013-11-14 Thread KONDO Mitsumasa
(2013/11/15 2:03), Fujii Masao wrote: On Thu, Nov 14, 2013 at 9:09 PM, KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp wrote: Hi, I create a patch that is improvement of disk-read and OS file caches. It can optimize kernel readahead parameter using buffer access strategy and posix_fadvise

Re: [HACKERS] Add min and max execute statement time in pg_stat_statement

2013-11-14 Thread KONDO Mitsumasa
(2013/11/14 7:11), Peter Geoghegan wrote: On Wed, Oct 23, 2013 at 8:52 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Hmm, now if we had portable atomic addition, so that we could spare the spinlock ... And adding a histogram or min/max for something like execution time isn't an approach

Re: [HACKERS] Optimize kernel readahead using buffer access strategy

2013-11-14 Thread KONDO Mitsumasa
(2013/11/15 11:17), Peter Geoghegan wrote: On Thu, Nov 14, 2013 at 6:18 PM, KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp wrote: I will fix it. Could you tell me your Mac OS version and gcc version? I have only mac book air with Maverick OS(10.9). I have an idea that Mac OSX doesn't have

Re: [HACKERS] Add min and max execute statement time in pg_stat_statement

2013-11-14 Thread KONDO Mitsumasa
(2013/11/15 11:31), Peter Geoghegan wrote: On Thu, Nov 14, 2013 at 6:28 PM, KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp wrote: Just to make sure, does "this patch" mean my patch? I agree with you about not adding another lock implementation. It would become overhead. Yes, I

Re: [HACKERS] Add min and max execute statement time in pg_stat_statement

2013-11-14 Thread KONDO Mitsumasa
(2013/11/15 2:09), Fujii Masao wrote: Agreed. Could you tell me the reason you agree? I am sorry, but I suspect you don't understand this discussion well enough :-( Regards, -- Mitsumasa KONDO NTT Open Source Software Center

Re: [HACKERS] Optimize kernel readahead using buffer access strategy

2013-11-17 Thread KONDO Mitsumasa
(2013/11/15 13:48), Claudio Freire wrote: On Thu, Nov 14, 2013 at 11:13 PM, KONDO Mitsumasa I use CentOS 6.4 which kernel version is 2.6.32-358.23.2.el6.x86_64 in this test. That's close to the kernel version I was using, so you should see the same effect. OK. You proposed readahead maximum

Re: [HACKERS] Optimize kernel readahead using buffer access strategy

2013-11-17 Thread KONDO Mitsumasa
(2013/11/18 11:25), Claudio Freire wrote: On Sun, Nov 17, 2013 at 11:02 PM, KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp wrote: However, my patch is still in progress and needs more improvement. I am going to add a method of controlling readahead by GUC, so that users can freely select readahead

Re: [HACKERS] Improvement of pg_stat_statement usage about buffer hit ratio

2013-11-18 Thread KONDO Mitsumasa
(2013/11/18 20:16), Haribabu kommi wrote: On 18 October 2013 13:35 KONDO Mitsumasa wrote: This patch conflicts pg_stat_statement_min_max_exectime patch which I submitted, and pg_stat_statement_min_max_exectime patch also adds new columns which are min_time and max_time. So I'd like to change

Re: [HACKERS] Improvement of pg_stat_statement usage about buffer hit ratio

2013-11-18 Thread KONDO Mitsumasa
(2013/11/19 3:56), Peter Geoghegan wrote: On Mon, Nov 18, 2013 at 10:49 AM, Fujii Masao masao.fu...@gmail.com wrote: The same idea was proposed before but not committed because Itagaki thought that pg_stat_statements view should report only raw values. Please read the following thread. I have

Re: [HACKERS] Improvement of pg_stat_statement usage about buffer hit ratio

2013-11-18 Thread KONDO Mitsumasa
(2013/11/19 11:12), KONDO Mitsumasa wrote: (2013/11/19 3:56), Peter Geoghegan wrote: On Mon, Nov 18, 2013 at 10:49 AM, Fujii Masao masao.fu...@gmail.com wrote: The same idea was proposed before but not committed because Itagaki thought that pg_stat_statements view should report only raw values

Re: [HACKERS] Improvement of pg_stat_statement usage about buffer hit ratio

2013-11-18 Thread KONDO Mitsumasa
(2013/11/19 12:03), Peter Geoghegan wrote: On Mon, Nov 18, 2013 at 6:12 PM, KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp wrote: I checked the discussion between Itagaki-san and Mr Cerdic. He said that raw values should be kept simple. However, were his changes just simple? I cannot understand his

Re: [HACKERS] Logging WAL when updating hintbit

2013-11-18 Thread KONDO Mitsumasa
(2013/11/15 19:27), Sawada Masahiko wrote: On Thu, Nov 14, 2013 at 7:51 PM, Florian Weimer fwei...@redhat.com wrote: On 11/14/2013 07:02 AM, Sawada Masahiko wrote: I attached patch adds new wal_level 'all'. Shouldn't this be a separate setting? It's useful for storage which requires

Re: [HACKERS] Time-Delayed Standbys

2013-11-28 Thread KONDO Mitsumasa
Hi Royes, I'm sorry for my late review... I see potential in your patch for the PG replication function, and it might be useful for everyone. I checked your patch and have some comments for improvement. I haven't tested unexpected situations in detail yet. But I think that under

Re: [HACKERS] Time-Delayed Standbys

2013-12-03 Thread KONDO Mitsumasa
(2013/11/30 5:34), Fabrízio de Royes Mello wrote: On Fri, Nov 29, 2013 at 5:49 AM, KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp wrote: * Problem1 Your patch does not add recovery_time_delay to recovery.conf.sample. Please add it. Fixed. OK

Re: [HACKERS] Time-Delayed Standbys

2013-12-03 Thread KONDO Mitsumasa
(2013/12/04 4:00), Andres Freund wrote: On 2013-12-03 13:46:28 -0500, Robert Haas wrote: On Tue, Dec 3, 2013 at 12:36 PM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: On Tue, Dec 3, 2013 at 2:33 PM, Christian Kruse christ...@2ndquadrant.com wrote: Hi Fabrizio, looks good to me. I

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-03 Thread KONDO Mitsumasa
(2013/12/04 11:28), Tatsuo Ishii wrote: Magnus Hagander mag...@hagander.net writes: On Tue, Dec 3, 2013 at 11:44 PM, Josh Berkus j...@agliodbs.com wrote: Would certainly be nice. Realistically, getting good automated performance tests will require paying someone like Greg S., Mark or me for 6

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-05 Thread KONDO Mitsumasa
(2013/12/04 16:39), Claudio Freire wrote: On Wed, Dec 4, 2013 at 4:28 AM, Tatsuo Ishii is...@postgresql.org wrote: Can we avoid the Linux kernel problem by simply increasing our shared buffer size, say up to 80% of memory? It will swap more easily. Is that the case? If the system has not

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-08 Thread KONDO Mitsumasa
(2013/12/05 23:42), Greg Stark wrote: On Thu, Dec 5, 2013 at 8:35 AM, KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp wrote: Yes. And using DirectIO efficiently is more difficult than BufferedIO. If we change write() flag with direct IO in PostgreSQL, it will execute hardest ugly

Re: [HACKERS] Time-Delayed Standbys

2013-12-09 Thread KONDO Mitsumasa
Hi Fabrízio, I tested your v4 patch and am sending my review comments. * Fix typo 49 -# commited transactions from the master, specify a recovery time delay. 49 +# committed transactions from the master, specify a recovery time delay. * Fix white space 177 - if (secs = 0 microsecs

Re: [HACKERS] Time-Delayed Standbys

2013-12-09 Thread KONDO Mitsumasa
(2013/12/09 19:36), KONDO Mitsumasa wrote: * Problem 1 I read the documentation you wrote. It says that PITR is not affected. However, when I run PITR with min_standby_apply_delay=300, the server cannot start. The log follows. [mitsu-ko@localhost postgresql]$ bin/pg_ctl -D data2 start

Re: [HACKERS] Time-Delayed Standbys

2013-12-09 Thread KONDO Mitsumasa
(2013/12/09 19:35), Pavel Stehule wrote: 2013/12/9 KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp Hi Fabrízio, I tested your v4 patch and am sending my review comments. * Fix typo 49 -# commited transactions from the master, specify

Re: [HACKERS] Time-Delayed Standbys

2013-12-09 Thread KONDO Mitsumasa
(2013/12/09 20:29), Andres Freund wrote: On 2013-12-09 19:51:01 +0900, KONDO Mitsumasa wrote: To add to my comment: we have to consider three situations. 1. PITR 2. replication standby 3. replication standby with restore_command I think this patch cannot delay in situation 1. Why? I have three

Re: [HACKERS] Optimize kernel readahead using buffer access strategy

2013-12-09 Thread KONDO Mitsumasa
Hi, I revised this patch and re-ran the performance test; it works correctly on Linux with no compile warnings. I added an enable_kernel_readahead GUC in the new version. When this GUC is on (default), it uses POSIX_FADV_NORMAL, which is the OS's general readahead. And when it is off, it

Re: [HACKERS] Why we are going to have to go DirectIO

2013-12-10 Thread KONDO Mitsumasa
(2013/12/11 10:25), Tom Lane wrote: Jeff Janes jeff.ja...@gmail.com writes: On Tue, Dec 3, 2013 at 11:39 PM, Claudio Freire klaussfre...@gmail.com wrote: Problem is, Postgres relies on a working kernel cache for checkpoints. Checkpoint logic would have to be heavily reworked to account for an

Re: [HACKERS] Optimize kernel readahead using buffer access strategy

2013-12-10 Thread KONDO Mitsumasa
(2013/12/10 22:55), Claudio Freire wrote: On Tue, Dec 10, 2013 at 5:03 AM, KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp wrote: I revised this patch and re-ran the performance test; it works correctly on Linux with no compile warnings. I added an enable_kernel_readahead GUC in the new version

Re: [HACKERS] Time-Delayed Standbys

2013-12-10 Thread KONDO Mitsumasa
(2013/12/10 18:38), Andres Freund wrote: master PITR? What's that? All PITR is based on recovery.conf and thus not really a master? Master PITR is PITR with standby_mode = off. It's just recovery from a base backup. The difference between master PITR and a standby is that the former will be

Re: [HACKERS] Time-Delayed Standbys

2013-12-12 Thread KONDO Mitsumasa
(2013/12/12 7:23), Fabrízio de Royes Mello wrote: On Wed, Dec 11, 2013 at 7:47 PM, Andres Freund and...@2ndquadrant.com * hot_standby=off: Makes delay useable with wal_level=archive (and thus a lower WAL volume) * standby_mode=off: Configurations that use tools like pg_standby and

Re: [HACKERS] Time-Delayed Standbys

2013-12-12 Thread KONDO Mitsumasa
(2013/12/12 18:09), Simon Riggs wrote: On 9 December 2013 10:54, KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp wrote: (2013/12/09 19:35), Pavel Stehule wrote: 2013/12/9 KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp Hi Fabrízio, I tested your

Re: [HACKERS] Optimize kernel readahead using buffer access strategy

2013-12-12 Thread KONDO Mitsumasa
(2013/12/12 9:30), Claudio Freire wrote: On Wed, Dec 11, 2013 at 3:14 AM, KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp wrote: enable_readahead=os|fadvise with os = on, fadvise = off Hmm. fadvise is a method, not a purpose. So I am considering another idea for this GUC. Yeah, I was thinking

Re: [HACKERS] Optimize kernel readahead using buffer access strategy

2013-12-17 Thread KONDO Mitsumasa
Hi, I fixed the patch with the following improvements. - It compiles on MacOS. - Renamed the GUC enable_kernel_readahead to readahead_strategy. - Changed POSIX_FADV_SEQUENTIAL to POSIX_FADV_NORMAL when the sequential access strategy is selected; the reason is given later... I tested two simple access

Re: [HACKERS] pg_rewarm status

2013-12-17 Thread KONDO Mitsumasa
(2013/12/18 5:33), Robert Haas wrote: Sounds like it might be worth dusting the patch off again... I'd like to request that you add an all_index option and a usage_count option. When the all_index option is selected, all indexes are rewarmed even if the user doesn't give a relation name. And usage_count

Re: [HACKERS] Optimize kernel readahead using buffer access strategy

2013-12-17 Thread KONDO Mitsumasa
(2013/12/17 21:29), Simon Riggs wrote: These are interesting results. Good research. Thanks! They also show that the benefit of this is very specific to the exact task being performed. I can't see any future for a setting that applies to everything or nothing. We must be more selective. This

Re: [HACKERS] Optimize kernel readahead using buffer access strategy

2014-01-14 Thread KONDO Mitsumasa
Hi, I fixed this patch and submitted it to CF4. My previous patch had a significant bug: a mistaken calculation of the offset in posix_fadvise() :-( However, it worked without problems in pgbench, because pgbench transactions are always random access... I also tested my patch with the DBT-2 benchmark.

[HACKERS] drop duplicate buffers in OS

2014-01-14 Thread KONDO Mitsumasa
Hi, I created a patch that can drop duplicate buffers in the OS using a usage_count algorithm. I have been developing this patch since last summer. This feature seems to be a hot topic of discussion, so I am submitting it earlier than I had scheduled. When usage_count is high in shared_buffers, they are hard to drop

Re: [HACKERS] drop duplicate buffers in OS

2014-01-16 Thread KONDO Mitsumasa
(2014/01/16 21:38), Aidan Van Dyk wrote: Can we just get the backend that dirties the page to do the posix_fadvise DONTNEED? No, it removes clean pages from the OS file cache, because if a page is dirty, it causes a physical disk write. However, it is an experimental patch, so it might be changed by

Re: [HACKERS] drop duplicate buffers in OS

2014-01-16 Thread KONDO Mitsumasa
(2014/01/16 3:34), Robert Haas wrote: On Wed, Jan 15, 2014 at 1:53 AM, KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp wrote: I created a patch that can drop duplicate buffers in the OS using a usage_count algorithm. I have been developing this patch since last summer. This feature seems to be a hot
