Re: [PERFORM] How Can I check PostgreSQL backup is successfully or not ?
Hi Rick,

There is always some doubt about whether a backup is complete and usable. In my case, I have one client with a large database (on MySQL), and when we stop traffic at about 8:00 PM a lot of routines start; we sit under an exclusive lock for about two hours. To solve that kind of problem for this client, we set up replication to other servers and take the backup on a replica. In my opinion, PostgreSQL replication works better than MySQL's:

https://wiki.postgresql.org/wiki/Replication,_Clustering,_and_Connection_Pooling

Pros: you have a second, third, or nth server; you can compare data and keep an eye on the replication log; and you don't need to take the backup directly in the production environment.

Best Regards

Gustavo Neves Vargas
gust...@justblue.com.br
justblue.com.br
+55(41) 9157-7816
+55(41) 3058-4967

> On 2 Mar 2017, at 13:26, l...@laurent-hasson.com wrote:
> [snip]
Re: [PERFORM] How Can I check PostgreSQL backup is successfully or not ?
May you please share what types of checks there are to guarantee that the backup is trustworthy?

Regards,
Dinesh Chandra
|Database administrator (Oracle/PostgreSQL)| Cyient Ltd. Noida.

From: l...@laurent-hasson.com [mailto:l...@laurent-hasson.com]
Sent: 02 March, 2017 9:57 PM
To: Rick Otten <rottenwindf...@gmail.com>; Dinesh Chandra 12108 <dinesh.chan...@cyient.com>
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] How Can I check PostgreSQL backup is successfully or not ?

> [snip]
Re: [PERFORM] How Can I check PostgreSQL backup is successfully or not ?
It'd be so nice to have some checks to guarantee the backup is trustworthy. Restoring the db is imho not a very good option in general:

- Large databases are a problem. My db is about 3TB; the time plus the disk space is a big blocker.
- Also, what if the backup is incomplete? Just restoring the db successfully is not enough, right? You'd have to compare with prod to make sure nothing was missed... and in a fast-moving outfit, where the db today will have tons of new/changed/deleted stuff compared to yesterday, how would you even do that?

I am in a warehouse environment, so I have given up on guaranteeing backups; in a case of trouble, I'll spend 20h rebuilding my db. So I have a way out, but I'd much prefer working with trustworthy backups.

Sent from my BlackBerry 10 smartphone.

From: Rick Otten
Sent: Thursday, March 2, 2017 08:19
To: Dinesh Chandra 12108
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] How Can I check PostgreSQL backup is successfully or not ?

> [snip]
Re: [PERFORM] How Can I check PostgreSQL backup is successfully or not ?
Dear Rick,

Thanks for your valuable reply, but daily restoration of the backup to another database is really very time consuming, because our database size is greater than 2TB.

Regards,
Dinesh Chandra
|Database administrator (Oracle/PostgreSQL)|

From: Rick Otten [mailto:rottenwindf...@gmail.com]
Sent: 02 March, 2017 6:45 PM
To: Dinesh Chandra 12108 <dinesh.chan...@cyient.com>
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] How Can I check PostgreSQL backup is successfully or not ?

> [snip]
Re: [PERFORM] How Can I check PostgreSQL backup is successfully or not ?
This reminds me - I have had a case where the exit code for pg_dump was successful, but the backup was still corrupted on disk. By all means check the exit code, but I strongly encourage a second validation, such as the archive listing, to increase your confidence that the backup was successful.

The best way to ensure good backups is to establish a regular practice of restoring a backup to another database. The easiest such practice to justify and implement is to maintain a developer/development database, and to use your production database backups to rebuild it on a regular basis. Other approaches could include regularly scheduled Disaster Recovery exercises, or simply spinning up throwaway cloud instances for the purpose.

pg_dump uses the ordinary PostgreSQL COPY command to extract data from the tables. Beyond that, I'm not sure how it works. Sorry I can't help you there.

On Thu, Mar 2, 2017 at 7:05 AM, Dinesh Chandra 12108 <dinesh.chan...@cyient.com> wrote:
> [snip]
Re: [PERFORM] How Can I check PostgreSQL backup is successfully or not ?
Hi,

When I issue the below command:

./bin >pg_dump -U dummy_user dummy_database; echo $?

I checked with the Linux top command on the same server, and it was showing COPY running against the database. What exactly is it doing?

Regards,
Dinesh Chandra

-----Original Message-----
From: vinny [mailto:vi...@xs4all.nl]
Sent: 27 February, 2017 7:31 PM
To: John Gorman <jgor...@eldocomp.com>
Cc: Rick Otten <rottenwindf...@gmail.com>; Dinesh Chandra 12108 <dinesh.chan...@cyient.com>; pgsql-performance@postgresql.org; pgsql-performance-ow...@postgresql.org
Subject: Re: [PERFORM] How Can I check PostgreSQL backup is successfully or not ?

> [snip]
Re: [PERFORM] How Can I check PostgreSQL backup is successfully or not ?
Hi,

I ran the below command:

[postgres@CPPMoma-DB05 bin]$ pg_dump -U postgres moma_ts_oce; echo $

The output was like this (last few lines):

--
-- Name: public; Type: ACL; Schema: -; Owner: postgres
--

REVOKE ALL ON SCHEMA public FROM PUBLIC;
REVOKE ALL ON SCHEMA public FROM postgres;
GRANT ALL ON SCHEMA public TO postgres;
GRANT ALL ON SCHEMA public TO PUBLIC;

--
-- PostgreSQL database dump complete
--

$

Regards,
Dinesh Chandra
|Database administrator (Oracle/PostgreSQL)| Cyient Ltd. Noida.
Mobile: +91-9953975849 | Ext 1078 |dinesh.chan...@cyient.com
Plot No. 7, NSEZ, Phase-II, Noida-Dadri Road, Noida - 201 305, India.

-----Original Message-----
From: vinny [mailto:vi...@xs4all.nl]
Sent: 27 February, 2017 7:31 PM
To: John Gorman <jgor...@eldocomp.com>
Cc: Rick Otten <rottenwindf...@gmail.com>; Dinesh Chandra 12108 <dinesh.chan...@cyient.com>; pgsql-performance@postgresql.org; pgsql-performance-ow...@postgresql.org
Subject: Re: [PERFORM] How Can I check PostgreSQL backup is successfully or not ?

> [snip]
Re: [PERFORM] How Can I check PostgreSQL backup is successfully or not ?
On 2017-02-27 14:29, John Gorman wrote:
> Even though it's not listed in any of the documentation or "pg_dump --help" you can check the return code of the process. A return code greater than 0 (zero) usually indicates a failure.
>
> ./bin >pg_dump -U dummy_user dummy_database; echo $?
> 1
>
> [snip]

It's important to note the distinction between

  "the backup process did not fail"

and

  "we now have a trustworthy backup".

And you can go full-paranoia and say that you can successfully create a perfectly working backup of the wrong database.

So what is it that you want to make sure of:

1. Did the process give an error?
2. Did the process create a usable backup?

What are the chances of #1 reporting success but still producing a bad backup? And can #2 fail on a good database, and if so, can you detect that?
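vinny's second question - whether a "successful" restore actually matches the source - can be spot-checked mechanically. Below is a minimal sketch (not from the thread; the database names and the assumption that everything lives in the public schema are mine) that compares per-table row counts between production and a restored copy. On a busy production database the counts drift between dump and check, so this only catches gross mismatches:

-- bash code sketch --
#!/bin/bash
# Compare per-table row counts between production and a restored copy.
# prod_db and restored_db are hypothetical connection names.
for tbl in $(psql -At -d prod_db -c \
    "SELECT schemaname || '.' || tablename FROM pg_tables WHERE schemaname = 'public'")
do
    prod=$(psql -At -d prod_db     -c "SELECT count(*) FROM ${tbl}")
    copy=$(psql -At -d restored_db -c "SELECT count(*) FROM ${tbl}")
    if [ "${prod}" != "${copy}" ]; then
        echo "MISMATCH ${tbl}: prod=${prod} restored=${copy}"
    fi
done
---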
Re: [PERFORM] How Can I check PostgreSQL backup is successfully or not ?
Even though it's not listed in any of the documentation or "pg_dump --help", you can check the return code of the process. A return code greater than 0 (zero) usually indicates a failure.

./bin >pg_dump -U dummy_user dummy_database; echo $?
1

From: pgsql-performance-ow...@postgresql.org [mailto:pgsql-performance-ow...@postgresql.org] On Behalf Of Rick Otten
Sent: Monday, February 27, 2017 3:36 AM
To: Dinesh Chandra 12108
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] How Can I check PostgreSQL backup is successfully or not ?

> [snip]
Re: [PERFORM] How Can I check PostgreSQL backup is successfully or not ?
Although it doesn't really tell you if the pg_dump was successful (you'll need to do a full restore to be sure), I generate an archive list. If that fails, the backup clearly wasn't successful; if it succeeds, odds are pretty good that it worked:

-- bash code snippet --
archiveList=`pg_restore -l ${backupFolder}`
if [[ ! ${archiveList} =~ "Archive created at" ]]
then
    echo "PostgreSQL backup - Archive List Test Failed for ${hostName}:${dbName}"
    echo "Archive listing:"
    echo ${archiveList}
    exit 1
fi
---

On Mon, Feb 27, 2017 at 4:35 AM, Dinesh Chandra 12108 <dinesh.chan...@cyient.com> wrote:
> Hi,
>
> We are taking a daily full backup of our PostgreSQL database using pg_dump, which is automatically scheduled through cron jobs.
>
> How can I check whether yesterday's backup was successful or not?
> Is there any query or view by which I can check it?
>
> Regards,
> Dinesh Chandra
> |Database administrator (Oracle/PostgreSQL)| Cyient Ltd. Noida.
> Mobile: +91-9953975849 | Ext 1078 |dinesh.chan...@cyient.com
> Plot No. 7, NSEZ, Phase-II, Noida-Dadri Road, Noida - 201 305, India.
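Combining John's exit-code check with Rick's archive-list check, a cron-ready wrapper might look like the sketch below. This is an illustration rather than code posted to the list; the paths and names are placeholders, and it assumes a custom-format (-Fc) dump, since pg_restore -l can only list custom, tar, or directory format archives, not plain SQL dumps:

-- bash code sketch --
#!/bin/bash
# Hypothetical backup wrapper: dump in custom format, then validate twice.
dbName="dummy_database"
backupFile="/backups/${dbName}-$(date +%Y%m%d).dump"

pg_dump -U dummy_user -Fc -f "${backupFile}" "${dbName}"
if [ $? -ne 0 ]; then
    echo "pg_dump exited non-zero for ${dbName}" >&2
    exit 1
fi

# Second opinion: the archive must at least be listable.
archiveList=$(pg_restore -l "${backupFile}")
if [[ ! ${archiveList} =~ "Archive created at" ]]; then
    echo "Archive list test failed for ${dbName}" >&2
    exit 1
fi
---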
Re: [PERFORM] How can I find the source of postgresql per-connection memory leaks?
On 01/12/2017 09:08 AM, Eric Jensen wrote:
> I'm using postgresql 9.5.4 on amazon RDS with ~1300 persistent connections from rails 4.2 with "prepared_statements: false". Over the ...
> [graph image]

PostgreSQL on RDS is a closed product. My recommendation would be to contact Amazon support. They are likely to be able to provide you with better support.

Sincerely,

JD

--
Command Prompt, Inc.  http://the.postgres.company/  +1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.
Unless otherwise stated, opinions are my own.
Re: [PERFORM] How can I get the query planner to use a bitmap index scap instead of an index scan ?
On Mon, Mar 10, 2014 at 4:46 AM, Jeff Janes <jeff.ja...@gmail.com> wrote:
> On Fri, Mar 7, 2014 at 6:46 PM, Mohan Krishnan <moha...@gmail.com> wrote:
>> Hello folks,
>>
>> I have a table of about 700k rows in Postgres 9.3.3, which has the following structure:
>>
>> Columns:
>>   content_body - text
>>   publish_date - timestamp without time zone
>>   published    - boolean
>> Indexes:
>>   "articles_pkey" PRIMARY KEY, btree (id)
>>   "article_text_gin" gin (article_text)
>>   "articles_publish_date_id_index" btree (publish_date DESC NULLS LAST, id DESC)
>
> Your indexes are on columns that are not in the list of columns you gave. Can you show us the actual table and index definitions?

Sorry about that, here are the table and index definitions:

                              Table "public.articles"
    Column    |            Type             |                       Modifiers
--------------+-----------------------------+-------------------------------------------------------
 id           | integer                     | not null default nextval('articles_id_seq'::regclass)
 title        | text                        | not null
 content_body | text                        |
 publish_date | timestamp without time zone |
 created_at   | timestamp without time zone | not null
 published    | boolean                     |
 updated_at   | timestamp without time zone | not null
 category_id  | integer                     | not null
 article_text | tsvector                    |
Indexes:
    "articles_pkey" PRIMARY KEY, btree (id)
    "article_text_gin" gin (article_text)
    "articles_category_id_index" btree (category_id)
    "articles_created_at" btree (created_at)
    "articles_publish_date_id_index" btree (publish_date DESC NULLS LAST, id DESC)
    "articles_published_index" btree (published)

>> ->  Index Scan using articles_pkey on articles  (cost=0.42..462150.49 rows=3573 width=1298) (actual time=2.055..9.711 rows=10 loops=1)
>>       Filter: (article_text @@ '''in_index'''::tsquery)
>> ...
>> ->  Index Scan using articles_pkey on articles  (cost=0.42..462150.49 rows=3573 width=1298) (actual time=5633.672..5633.672 rows=0 loops=1)
>>       Filter: (article_text @@ '''not_in_index'''::tsquery)
>
> Those estimates are way off, and it is not clear why they would be. Have you analyzed your table recently?

Yes, I have analyzed them and rerun the queries - there is no difference. What more debugging information should I look at to determine why the estimates are way off?

> Cheers,
>
> Jeff

--
Mohan
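One hedged starting point for the estimate question (mine, not from the thread): look at what the planner actually knows about the column, and give it more detail before re-testing. For tsvector columns the planner keeps most-common-element statistics, so a larger statistics target may sharpen @@ selectivity estimates:

-- bash code sketch --
psql -d mydb <<'EOF'
-- What statistics does the planner currently have for the column?
SELECT null_frac, avg_width, n_distinct
FROM pg_stats
WHERE tablename = 'articles' AND attname = 'article_text';

-- Collect more detailed statistics, then re-run the EXPLAIN ANALYZE tests.
ALTER TABLE articles ALTER COLUMN article_text SET STATISTICS 1000;
ANALYZE articles;
EOF
---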
Re: [PERFORM] How can I get the query planner to use a bitmap index scap instead of an index scan ?
On Fri, Mar 7, 2014 at 6:46 PM, Mohan Krishnan <moha...@gmail.com> wrote:
> Hello folks,
>
> I have a table of about 700k rows in Postgres 9.3.3, which has the following structure:
>
> Columns:
>   content_body - text
>   publish_date - timestamp without time zone
>   published    - boolean
> Indexes:
>   "articles_pkey" PRIMARY KEY, btree (id)
>   "article_text_gin" gin (article_text)
>   "articles_publish_date_id_index" btree (publish_date DESC NULLS LAST, id DESC)

Your indexes are on columns that are not in the list of columns you gave. Can you show us the actual table and index definitions?

> ->  Index Scan using articles_pkey on articles  (cost=0.42..462150.49 rows=3573 width=1298) (actual time=2.055..9.711 rows=10 loops=1)
>       Filter: (article_text @@ '''in_index'''::tsquery)
> ...
> ->  Index Scan using articles_pkey on articles  (cost=0.42..462150.49 rows=3573 width=1298) (actual time=5633.672..5633.672 rows=0 loops=1)
>       Filter: (article_text @@ '''not_in_index'''::tsquery)

Those estimates are way off, and it is not clear why they would be. Have you analyzed your table recently?

Cheers,

Jeff
Re: [PERFORM] How can i find out top high load sql queries in PostgreSQL.
-----Original Message-----
From: Vibhor Kumar [mailto:vibhor.ku...@enterprisedb.com]
Sent: Monday, December 17, 2012 11:34 AM
To: suhas.basavaraj12
Cc: pgsql-performance@postgresql.org
Subject: Re: How can i find out top high load sql queries in PostgreSQL.

> On Dec 17, 2012, at 3:21 AM, suhas.basavaraj12 <suha...@verse.in> wrote:
>> There is a tool called pgFouine. I am sure this will help you:
>> http://pgfouine.projects.pgfoundry.org/tutorial.html
>
> +1
>
> You can also use pgBadger, which seemed more flexible than pgFouine:
> http://dalibo.github.com/pgbadger/

The pg_stat_statements extension tracks execution statistics of SQL statements.

Regards,
Igor Neyman
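For reference, a sketch of what using pg_stat_statements looks like. The column names below match the 9.x-era releases (later versions renamed total_time to total_exec_time), and loading the module requires shared_preload_libraries = 'pg_stat_statements' in postgresql.conf plus a server restart:

-- bash code sketch --
psql -d mydb <<'EOF'
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

-- Top ten statements by total time spent in them:
SELECT calls,
       round(total_time::numeric, 1)           AS total_ms,
       round((total_time / calls)::numeric, 1) AS avg_ms,
       query
FROM pg_stat_statements
ORDER BY total_time DESC
LIMIT 10;
EOF
---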
Re: [PERFORM] How can i find out top high load sql queries in PostgreSQL.
There is a tool called pgFouine. I am sure this will help you:

http://pgfouine.projects.pgfoundry.org/tutorial.html

Rgrds
Suhas
Re: [PERFORM] How can i find out top high load sql queries in PostgreSQL.
On Dec 17, 2012, at 3:21 AM, suhas.basavaraj12 <suha...@verse.in> wrote:
> There is a tool called pgFouine. I am sure this will help you:
> http://pgfouine.projects.pgfoundry.org/tutorial.html

+1

You can also use pgBadger, which seemed more flexible than pgFouine:

http://dalibo.github.com/pgbadger/

Thanks & Regards,
Vibhor Kumar
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
Blog: http://vibhork.blogspot.com
Re: [PERFORM] How can i get record by data block not by sql?
On Oct 3, 2011, at 6:52 AM, 姜头 wrote:
> How can I get records by data block, not by SQL?
>
> I want to read and write lots of data by data block, and write records to an appointed data block and read them back, so I can form a disk-resident tree by recording the block addresses. But I don't know how to implement this in PostgreSQL. Is there a system function that can do this? Can someone help me? Thank you very very much!

It sounds like you should look into the COPY command, or, if you're adventurous, the pg_bulkload project. They might get you the speed you're after, if not quite the implementation. But if what you're really after is to manipulate the table files directly - and I'm not sure why that would be a goal in itself - then perhaps SQL isn't for you.
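As a sketch of the bulk path the reply points at, COPY moves whole tables in and out far faster than row-at-a-time SQL. The table and file names here are placeholders:

-- bash code sketch --
# Server-side COPY: the file must be readable by the postgres server process.
psql -d mydb -c "COPY mytable FROM '/tmp/mytable.csv' WITH (FORMAT csv)"

# Client-side variant: psql reads the file on the client machine instead.
psql -d mydb -c "\copy mytable FROM 'mytable.csv' CSV"
---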
Re: [PERFORM] How can retrieve total query run-time with out using explain
On 3/07/2011 12:13 PM, Radhya sahal wrote:
> Dear all,
> How can I retrieve the total query runtime when I run any query, without using EXPLAIN?

EXPLAIN doesn't tell you the query runtime. EXPLAIN ANALYZE does, but it doesn't return the query results, returning plan and timing information instead.

> for example, when I run any query such as
>
>   select * from table1;
>
> I get: Total query runtime: 443 ms.

In psql, use:

  \timing on

> I want a function that can return "runtime: 443 ms" without using EXPLAIN.
> I want to call this command from Java source code.

Record the system time before you run the query, using System.currentTimeMillis() or System.nanoTime(). Record the system time after you run the query. Subtract, and convert the difference to a java.util.Date so you can format it prettily for display, or just print the difference in milliseconds. If you want more flexible date/time handling, see JodaTime.

This gives you the time the query took, including how long it took to retrieve the initial result set. If you want the time the query took to execute on the server, not counting fetch time, this may not be what you want. I don't know how to get query execution time *not* counting result set retrieval time from the client. Anyone know if it's even possible?

--
Craig Ringer
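A sketch of both measurements from the shell (database and table names are placeholders): \timing reports the client-observed round trip, while EXPLAIN ANALYZE reports the server-side execution time without shipping the result set to the client, which is one approximation of the "execution only" number asked about above:

-- bash code sketch --
# Client-observed runtime, including result retrieval:
psql -d mydb <<'EOF'
\timing on
SELECT * FROM table1;
EOF

# Server-side execution time only (no rows are returned to the client):
psql -d mydb -c "EXPLAIN ANALYZE SELECT * FROM table1;"
---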
Re: [PERFORM] how can a couple of expensive queries drag my system down?
On Mar 26, 2008, at 3:31 PM, Scott Marlowe wrote:
> On Wed, Mar 26, 2008 at 1:48 PM, p prince <[EMAIL PROTECTED]> wrote:
>> is this 'normal'? (loaded question I know)
>> Should I be looking to offload expensive reporting queries to read-only replicants of my database?
>
> Yes, definitely look into setting up something like a Slony slave that's used for reporting queries. The nice thing about this setup is you only need to replicate the tables you run reports against.

For simple two-node setups (i.e. no cascaded replication), I'd suggest looking into Londiste. It's loads easier to wrap your head around, and it's extremely easy to add/remove tables from replication, as it doesn't deal with table sets like Slony does.

Erik Jones

DBA | Emma®
[EMAIL PROTECTED]
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com
Re: [PERFORM] how can a couple of expensive queries drag my system down?
On Wednesday 26 March 2008, p prince <[EMAIL PROTECTED]> wrote:
> Is this a sign of disk contention?

Yes.

> How does CPU load come into play?

Processes waiting for disk I/O generally show up as load.

--
Alan
Re: [PERFORM] how can a couple of expensive queries drag my system down?
On Wed, Mar 26, 2008 at 1:48 PM, p prince <[EMAIL PROTECTED]> wrote:
> is this 'normal'? (loaded question I know)
> Should I be looking to offload expensive reporting queries to read-only replicants of my database?

Yes, definitely look into setting up something like a Slony slave that's used for reporting queries. The nice thing about this setup is you only need to replicate the tables you run reports against.
Re: [PERFORM] how can a couple of expensive queries drag my system down?
> is this 'normal'? (loaded question I know)

Depends. If you are on the edge, disk-wise, yes, a big fat query can push it over and make it fall.

> Should I be looking to offload expensive reporting queries to read-only replicants of my database?

You could do this, especially if the heavy queries involve reading gigabytes of data from disk (as reporting queries like to do). In that case, you can even use a cheap machine with cheap disks for the slave (even striped RAID), since data is duplicated anyway and all that matters is megabytes/second, not IOs/second.

> Is this a symptom of slow disk?

vmstat will tell you this. If iowait time goes through the roof, it's disk-bound. If CPU use goes to 100%, it's CPU-bound.

> improperly tuned postgres settings?

Also possible; you can try EXPLAIN on the problematic queries.

> bad choice of OS, hardware, storage?

Depends on how your SAN handles load. No idea about that.

> Is this a sign of disk contention?

Most probable.

> How does CPU load come into play?

With 8 CPUs, less likely. (Your problem query can swamp at most 1 CPU, so if the machine grinds with 7 other cores still available for the usual work, it probably isn't CPU-bound.)
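To make the vmstat suggestion concrete, a quick sampling run looks like this; the column hints are the usual rules of thumb rather than hard thresholds:

-- bash code sketch --
# Sample system activity once a second, ten times, while the query runs:
vmstat 1 10
# Columns to watch in the output:
#   us/sy - user/system CPU time; pegged near 100% => CPU-bound
#   wa    - CPU time spent waiting on I/O; large values => disk-bound
#   b     - processes blocked on I/O; these also inflate the load average
---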
Re: [PERFORM] how can a couple of expensive queries drag my system down?
Scott Marlowe wrote:
> On Wed, Mar 26, 2008 at 1:48 PM, p prince <[EMAIL PROTECTED]> wrote:
>> is this 'normal'? (loaded question I know)
>> Should I be looking to offload expensive reporting queries to read-only replicants of my database?
>
> Yes, definitely look into setting up something like a slony slave that's used for reporting queries. The nice thing about this setup is you only need to replicate the tables you run reports against.

I would look at fixing the slow queries so that they aren't a problem first. I'm sure that if you send in your queries and table defs you can get some useful feedback here. If there is no way of improving them, then look at a reporting slave.

--
Shane Ambler
pgSQL (at) Sheeky (dot) Biz

Get Sheeky @ http://Sheeky.Biz
Re: [PERFORM] how can a couple of expensive queries drag my system down?
On Wed, Mar 26, 2008 at 10:09 PM, Shane Ambler <[EMAIL PROTECTED]> wrote:
> I would look at fixing the slow queries so that they aren't a problem first. I'm sure if you send in your queries and table defs you can get some useful feedback here. If there is no way of improving them then look at a reporting slave.

I'm not sure you're reading the same thread as me. Or something. I've had reporting queries that took the better part of an hour to run, and this was completely normal. When you're running millions of rows against each other for reporting queries, it's not unusual to blow out the cache. Maybe the queries are inefficient, and maybe they're not. But one should not be running reporting queries on a live transactional database.
Re: [PERFORM] How can fixed and variable width columns perform similarly?
"Siddharth Anand" <[EMAIL PROTECTED]> writes:
> How can a field that doesn't have a limit, like text, perform similarly to char varying(128), for example? At some point, we need to write data to disk. The more data that needs to be written, the longer the disk write will take, especially when it requires finding free sectors to write to.

What's your point? If you're not going to put more than 128 characters in the field, there's no difference in the amount of data involved.

        regards, tom lane
Re: [PERFORM] How can fixed and variable width columns perform similarly?
I think the manual is implying that if you store a value like "Sid" in a field of type varchar(128) or of type text, there is no performance difference. The manual is not saying that you get the same performance storing a 500k text field as when you store the value "Sid".

Dave

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Siddharth Anand
Sent: Friday, April 27, 2007 10:32 AM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] How can fixed and variable width columns perform similarly?

Hi!

I read the link below and am puzzled by or curious about something.

http://www.postgresql.org/docs/8.1/interactive/datatype-character.html

The tip below is intriguing:

"Tip: There are no performance differences between these three types, apart from the increased storage size when using the blank-padded type. While character(n) has performance advantages in some other database systems, it has no such advantages in PostgreSQL. In most situations text or character varying should be used instead."

How can a field that doesn't have a limit, like text, perform similarly to character varying(128), for example? At some point, we need to write data to disk. The more data that needs to be written, the longer the disk write will take, especially when it requires finding free sectors to write to.

Another interesting quote from the same page is the following:

"Long values are also stored in background tables so they do not interfere with rapid access to the shorter column values."

If the long values are stored in a separate table, on a different part of the disk, doesn't this imply an extra disk seek? Won't it therefore take longer?

Sid
Re: [PERFORM] How can fixed and variable width columns perform similarly?
Hi Tom,

My question wasn't phrased clearly. Oracle exhibits a performance degradation for very large fields (CLOB types, which I equate to Postgres' text type) when compared with the performance of field types like varchar, which handle a maximum of a few thousand bytes in Oracle. It sounds like Postgres doesn't exhibit this same difference. I wanted to understand how this could be, and whether there was a trade-off.

Cheers!
Sid

> Siddharth Anand <[EMAIL PROTECTED]> writes:
>> How can a field that doesn't have a limit like text perform similarly to char varying(128), for example?
>
> What's your point? If you're not going to put more than 128 characters in the field, there's no difference in the amount of data involved.
>
>         regards, tom lane
Re: [PERFORM] How can fixed and variable width columns perform similarly?
Siddharth Anand <[EMAIL PROTECTED]> writes:
> My question wasn't phrased clearly. Oracle exhibits a performance degradation for very large-sized fields (CLOB types that I equate to Postgres' text type) when compared with the performance of field types like varchar that handle a max character limit of a few thousand bytes in Oracle. It sounds like Postgres doesn't exhibit this same difference. I wanted to understand how this could be and whether there was a trade-off.

Ah. Well, the answer is that we change behavior dynamically depending on the size of the particular field value, instead of hard-wiring it to the declared column type. It sounds like Oracle's CLOB might be doing about the same thing as an out-of-line toasted field value in Postgres.

In PG, text and varchar behave identically except that varchar(N) adds an insert-time check on the length of the field value --- but this is just a constraint check and doesn't have any direct influence on how the value is stored.

        regards, tom lane
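A small demonstration of that dynamic behavior (mine, not Tom's): pg_column_size on stored values shows that the footprint depends on the value, not the declared type - the long value below is compressed (and would be moved out of line if still too big) once it lands in a table:

-- bash code sketch --
psql -d postgres <<'EOF'
CREATE TEMP TABLE t (short_v text, long_v text);
INSERT INTO t VALUES ('Sid', repeat('x', 100000));

-- pg_column_size reports the stored datum size, after any compression:
SELECT pg_column_size(short_v) AS short_bytes,
       pg_column_size(long_v)  AS long_bytes,  -- far less than 100000
       length(long_v)          AS long_chars
FROM t;
EOF
---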
Re: [PERFORM] How can I make this query faster (resend)
On Mon, May 29, 2006 at 07:35:14AM -0700, Cstdenis wrote:
> > To be honest, you're pushing things expecting a machine with only 1G to serve 300 active connections. How large is the database itself?
>
> The database is 3.7G on disk. There is about 1G of actual data in it -- the rest is dead tuples and indices. (I vacuum regularly, but a vacuum full causes too much downtime to do unless I have to.)

It sounds like you're not vacuuming anywhere near regularly enough if you have that much dead space. You should at least reindex.

> > > I know hyperthreading is considered something that can slow down a server, but with my very high concurrency (averaging about 400-500 concurrent users during peak hours) I am hoping the extra virtual CPUs will help. Anyone have experience that says different at high concurrency?
> >
> > Best bet is to try it and see. Generally, people find HT hurts, but I recently saw it double the performance of pgbench on a Windows XP machine, so it's possible that Windows is just more clever about how to use it than Linux is.
>
> Anyone know whether those who have found it hurts were running low-concurrency, complex, CPU-intensive queries or high-concurrency simple queries, or both? I can understand it hurting in the former, but not the latter. I'll have to give it a try, I guess. It should at least help my very high load averages.

The issue is that HT doesn't give you anything close to having 2 CPUs, so for all but the most trivial and limited cases it's not going to be a win. Incidentally, the only good results I've seen with HT are on Windows.

--
Jim C. Nasby, Sr. Engineering Consultant      [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com      work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461
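For the dead-space advice above, a sketch of the non-exclusive route (database and table names are placeholders; note that REINDEX still locks the table it rebuilds, so schedule it in a quiet window):

-- bash code sketch --
# Reclaim dead-row space and refresh statistics without VACUUM FULL:
psql -d mydb -c "VACUUM VERBOSE ANALYZE pictures;"
# Rebuild bloated indexes; this locks the table while it runs:
psql -d mydb -c "REINDEX TABLE pictures;"
---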
Re: [PERFORM] How can I make this query faster (resend)
(Re-sending because my first one forgot the CC to the list. Sorry.)

I moved my database to a more powerful server: mirrored Ultra 320 SCSI HDs and 2 GB of RAM. It performs much faster. I also changed some conf settings accordingly:

work_mem = 10240
shared_buffers = 25600
max_connections = 450

(Also moved the webserver and needed more connections during the DNS propagation.)

I've been looking into pgpool. If I understand things correctly, I can have persistent connections from all 256 Apache processes to a pgpool, and it can have something like 30 persistent connections to the actual server, thus saving lots of server memory (due to the very high concurrency I would probably actually use at least 100). Is this correct?

However, memory doesn't seem to be my problem anymore. The query is still taking longer than I'd like for the larger categories (6 seconds for one with 1300 pictures), but it's more manageable. The problem now is that my server's load average during peak hours has gone as high as 30 (though the server still seems to respond fairly quickly, it is still worrisome).

Given my new server specs, can anyone suggest any other config file improvements? Perhaps some of the *_cost variables could be adjusted to better reflect my server's hardware?

- Original Message -
From: Markus Schaber [EMAIL PROTECTED]
To: Cstdenis [EMAIL PROTECTED]
Cc: pgsql-performance@postgresql.org
Sent: Monday, May 22, 2006 8:30 AM
Subject: Re: [PERFORM] How can I make this query faster (resend)

Hi, Cstdenis,

Cstdenis wrote:
> Server is a dual xeon with a gig of ram dedicated mostly to postgresql. Here is the changed lines in my postgresql.conf: http://pastebin.ca/57222

3M is really low for a production server. Try using pg_pool and limiting it to about 30 or so backend connections, and then give them at least 30 megs of RAM each. This should also cut down the connection creation overhead.

HTH,
Markus
--
Markus Schaber | Logical Tracking & Tracing International AG
Dipl. Inf. | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
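[As a sanity check on those settings, the worst-case arithmetic looks like this sketch; it assumes work_mem is in kB and shared_buffers in 8 kB pages, as in the 8.x postgresql.conf, and pessimistically assumes one active sort per backend:

  work_mem = 10240        # 10 MB per sort/hash step, per backend
  shared_buffers = 25600  # 25600 * 8 kB = 200 MB
  max_connections = 450   # worst case 450 * 10 MB = ~4.5 GB of sort memory

  # pooling 450 client connections down to 30-100 backends, as with pgpool,
  # caps that worst case at roughly 0.3-1 GB, which fits the 2 GB box
]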
Re: [PERFORM] How can I make this query faster (resend)
(Resending because my other send didn't get a CC to the list. Sorry.)

From: Jim C. Nasby [EMAIL PROTECTED]
To: Cstdenis [EMAIL PROTECTED]
Cc: pgsql-performance@postgresql.org
Sent: Monday, May 22, 2006 8:20 AM
Subject: Re: [PERFORM] How can I make this query faster (resend)

> On Fri, May 19, 2006 at 03:56:49PM -0700, Cstdenis wrote:
> > (It's been an hour and I don't see my message on the list so I'm sending it again. I've moved the queries and analyze out of the email in case it was rejected for being too long.)
> >
> > query: http://pastebin.ca/57218
> >
> > In the pictures table all the ratings have a shared index:
> >
> > CREATE INDEX idx_rating ON pictures USING btree (rating_nudity, rating_violence, rating_sex, rating_racism, rating_spoilers, rating_yaoi, rating_yuri, rating_profanity);
> >
> > and approved and date_submitted and user_id also have their own btree indexes. In the picture_categories table pid and cat_id have their own btree indices plus one together. Full table definition: http://pastebin.ca/57219
> >
> > The cat_id and rating values vary from query to query. The one listed above took 54 seconds in a test run just now. Here is explain analyze: http://pastebin.ca/57220
>
> pictures is the interesting table here. It looks like the planner would do better to choose something other than a nested loop on it. Try running EXPLAIN ANALYZE on the query with enable_nestloop=off and see what you get (you'll need to compare it to what you get with enable_nestloop on to see what the change is).

With enable_nestloop=off the same query as is explained further down in this email took much longer: 63 seconds instead of 6. It decided to do sequential scans on pictures and users with nested loop disabled.

Merge Join  (cost=146329.63..146963.96 rows=231 width=66) (actual time=61610.538..62749.176 rows=1305 loops=1)
  Merge Cond: ("outer".user_id = "inner".user_id)
  ->  Sort  (cost=123828.88..123829.46 rows=231 width=47) (actual time=60445.367..60451.176 rows=1305 loops=1)
        Sort Key: pictures.user_id
        ->  Hash Join  (cost=634.36..123819.81 rows=231 width=47) (actual time=128.088..60423.623 rows=1305 loops=1)
              Hash Cond: ("outer".pid = "inner".pid)
              ->  Seq Scan on pictures  (cost=0.00..121670.43 rows=302543 width=47) (actual time=0.210..58795.925 rows=291318 loops=1)
              ->  Hash  (cost=633.78..633.78 rows=231 width=4) (actual time=38.443..38.443 rows=1305 loops=1)
                    ->  Bitmap Heap Scan on picture_categories  (cost=2.81..633.78 rows=231 width=4) (actual time=4.753..32.259 rows=1305 loops=1)
                          Recheck Cond: (cat_id = 182)
                          ->  Bitmap Index Scan on idx_picture_categories_cat_id  (cost=0.00..2.81 rows=231 width=0) (actual time=4.398..4.398 rows=1305 loops=1)
                                Index Cond: (cat_id = 182)
  ->  Sort  (cost=22500.74..22816.79 rows=126418 width=23) (actual time=1163.788..1505.104 rows=52214 loops=1)
        Sort Key: users.user_id
        ->  Seq Scan on users  (cost=0.00..11788.18 rows=126418 width=23) (actual time=0.017..692.992 rows=54605 loops=1)
Total runtime: 62776.720 ms

> > Both pictures and picture_categories have about 287,000 rows. This query needs to run in under about a second or it kills my site by clogging apache slots (apache maxes out at 256 and I can have several hundred people on my site at a time). How can I make it run faster?
> >
> > Server is a dual xeon with a gig of ram dedicated mostly to postgresql. Here is the changed lines in my postgresql.conf: http://pastebin.ca/57222
>
> I suspect the low work_mem may be why it's using a nested loop. In addition to the test above, it would be interesting to see what happens to the plan if you set work_mem to 1.

I moved to a more powerful server (2 GB RAM and mirrored SCSI HDs) and upped the work_mem to 10 MB. It's much faster now, however it's still doing a nested loop. (See also my reply to Markus Schaber.)

Nested Loop  (cost=2.81..3398.76 rows=231 width=66) (actual time=14.946..5797.701 rows=1305 loops=1)
  ->  Nested Loop  (cost=2.81..2022.71 rows=231 width=47) (actual time=14.551..5181.042 rows=1305 loops=1)
        ->  Bitmap Heap Scan on picture_categories  (cost=2.81..633.78 rows=231 width=4) (actual time=9.966..140.606 rows=1305 loops=1)
              Recheck Cond: (cat_id = 182)
              ->  Bitmap Index Scan on idx_picture_categories_cat_id  (cost=0.00..2.81 rows=231 width=0) (actual time=9.720..9.720 rows=1305 loops=1)
                    Index Cond: (cat_id = 182)
        ->  Index Scan using pictures_pkey on pictures  (cost=0.00..6.00 rows=1 width=47) (actual time=3.802..3.820 rows=1 loops=1305)
              Index Cond: (pictures.pid = "outer".pid)
  ->  Index Scan using users_pkey on users  (cost=0.00..5.94 rows=1 width=23) (actual time=0.095..0.100 rows=1 loops=1305)
        Index Cond: ("outer".user_id = users.user_id)
Total runtime: 5812.238 ms
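[For reference, Jim's two experiments can be run from a single psql session; a sketch, using a simplified stand-in for the pastebin query (the real one adds the rating filters):

  BEGIN;
  SET LOCAL enable_nestloop = off;   -- affects this transaction only
  EXPLAIN ANALYZE
  SELECT p.pid, u.user_id
    FROM picture_categories pc
    JOIN pictures p ON p.pid = pc.pid
    JOIN users u ON u.user_id = p.user_id
   WHERE pc.cat_id = 182;
  ROLLBACK;

  SET work_mem = 51200;              -- 50 MB, this session only; try a few sizes
  -- then rerun the same EXPLAIN ANALYZE and compare the plans
]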
Re: [PERFORM] How can I make this query faster (resend)
On Fri, May 19, 2006 at 03:56:49PM -0700, Cstdenis wrote:
> (It's been an hour and I don't see my message on the list so I'm sending it again. I've moved the queries and analyze out of the email in case it was rejected for being too long.)
>
> query: http://pastebin.ca/57218
>
> In the pictures table all the ratings have a shared index:
>
> CREATE INDEX idx_rating ON pictures USING btree (rating_nudity, rating_violence, rating_sex, rating_racism, rating_spoilers, rating_yaoi, rating_yuri, rating_profanity);
>
> and approved and date_submitted and user_id also have their own btree indexes. In the picture_categories table pid and cat_id have their own btree indices plus one together. Full table definition: http://pastebin.ca/57219
>
> The cat_id and rating values vary from query to query. The one listed above took 54 seconds in a test run just now. Here is explain analyze: http://pastebin.ca/57220

pictures is the interesting table here. It looks like the planner would do better to choose something other than a nested loop on it. Try running EXPLAIN ANALYZE on the query with enable_nestloop=off and see what you get (you'll need to compare it to what you get with enable_nestloop on to see what the change is).

> Both pictures and picture_categories have about 287,000 rows. This query needs to run in under about a second or it kills my site by clogging apache slots (apache maxes out at 256 and I can have several hundred people on my site at a time). How can I make it run faster?
>
> Server is a dual xeon with a gig of ram dedicated mostly to postgresql. Here is the changed lines in my postgresql.conf: http://pastebin.ca/57222

I suspect the low work_mem may be why it's using a nested loop. In addition to the test above, it would be interesting to see what happens to the plan if you set work_mem to 1. To be honest, you're pushing things expecting a machine with only 1G to serve 300 active connections. How large is the database itself?

> I know hyperthreading is considered something that can slow down a server, but with my very high concurrency (averages about 400-500 concurrent users during peak hours) I am hoping the extra virtual CPUs will help. Anyone have experience that says different at high concurrency?

Best bet is to try it and see. Generally, people find HT hurts, but I recently saw it double the performance of pgbench on a Windows XP machine, so it's possible that Windows is just more clever about how to use it than Linux is.
--
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.com  work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf  cell: 512-569-9461

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] How can I make this query faster (resend)
Hi, Cstdenis,

Cstdenis wrote:
> Server is a dual xeon with a gig of ram dedicated mostly to postgresql. Here is the changed lines in my postgresql.conf: http://pastebin.ca/57222

3M is really low for a production server. Try using pg_pool and limiting it to about 30 or so backend connections, and then give them at least 30 megs of RAM each. This should also cut down the connection creation overhead.

HTH,
Markus
--
Markus Schaber | Logical Tracking & Tracing International AG
Dipl. Inf. | Software Development GIS

Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] How can this be?
On Fri, Sep 16, 2005 at 08:34:14PM -0500, Martin Nickel wrote:
> Hello all,
> Mostly Postgres makes sense to me. But now and then it does something that boggles my brain. Take the statements below. I have a table (agent) with 5300 rows. The primary key is agent_id. I can do SELECT agent_id FROM agent and it returns all PK values in less than half a second (dual Opteron box, 4G ram, SATA Raid 10 drive system).
>
> But when I do a DELETE on two rows with an IN statement, using the primary key index (as stated by EXPLAIN), it takes almost 4 minutes. pg_stat_activity shows nine other connections, all idle.
>
> If someone can explain this to me it will help restore my general faith in order and consistency in the universe.

When you delete a row from agent, PG needs to find any matching rows in office. Is office large? Is office(office_id) indexed?

> -- Executing query:
> DELETE FROM agent WHERE agent_id IN (15395, 15394);
> Query returned successfully: 2 rows affected, 224092 ms execution time.
>
> -- Executing query:
> EXPLAIN DELETE FROM agent WHERE agent_id IN (15395, 15394);
> Index Scan using agent2_pkey, agent2_pkey on agent  (cost=0.00..7.27 rows=2 width=6)
>   Index Cond: ((agent_id = 15395) OR (agent_id = 15394))
>
> Here's my table:
>
> CREATE TABLE agent (
>   agent_id int4 NOT NULL DEFAULT nextval('agent_id_seq'::text),
>   office_id int4 NOT NULL,
>   lastname varchar(25),
>   firstname varchar(25),
>   ...other columns...
>   CONSTRAINT agent2_pkey PRIMARY KEY (agent_id),
>   CONSTRAINT agent_office_fk FOREIGN KEY (office_id) REFERENCES office (office_id) ON UPDATE RESTRICT ON DELETE RESTRICT
> ) WITHOUT OIDS;

Cheers,
Steve

---(end of broadcast)---
TIP 6: explain analyze is your friend
Re: [PERFORM] How can this be?
On Fri, 16 Sep 2005, Martin Nickel wrote: Hello all, Mostly Postgres makes sense to me. But now and then it does something that boggles my brain. Take the statements below. I have a table (agent) with 5300 rows. The primary key is agent_id. I can do SELECT agent_id FROM agent and it returns all PK values in less than half a second (dual Opteron box, 4G ram, SATA Raid 10 drive system). But when I do a DELETE on two rows with an IN statement, using the primary key index (as stated by EXPLAIN) it take almost 4 minutes. pg_stat_activity shows nine other connections, all idle. Are there any tables that reference agent or other triggers? My first guess would be that there's a foreign key check for something else that's referencing agent.agent_id for which an index scan isn't being used. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
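[One way to test that guess is to list every foreign key that points at agent and check each referencing column for an index; a sketch against the system catalogs (written for 8.x, and the index name in the follow-up is hypothetical):

  SELECT c.conrelid::regclass AS referencing_table,
         a.attname            AS referencing_column
    FROM pg_constraint c
    JOIN pg_attribute a ON a.attrelid = c.conrelid
                       AND a.attnum = ANY (c.conkey)
   WHERE c.contype = 'f'
     AND c.confrelid = 'agent'::regclass;

  -- then, for any referencing column that has no index:
  -- CREATE INDEX some_table_agent_id_idx ON some_table (agent_id);
]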
Re: [PERFORM] How can I speed up this function?
The function I have exits the loop when the count hits 100, yes, but the inner loop can push the count up as high as necessary to select all the statements for a transaction, so by the time it exits the count could be much higher. I do want to limit the statements, but I want to get enough for complete transactions.

David

Gnanavel Shanmugam wrote:

But in the function you are exiting the loop when the count hits 100. If you do not want to limit the statements then remove the limit clause from the query I've written.

with regards,
S.Gnanavel

-Original Message-
From: [EMAIL PROTECTED]
Sent: Tue, 28 Jun 2005 16:29:32 +1200
To: [EMAIL PROTECTED]
Subject: Re: [PERFORM] How can I speed up this function?

Hi Gnanavel,

Thanks, but that will only return at most 100 statements. If there is a transaction with 110 statements then this will not return all the statements for that transaction. We need to make sure that the function returns all the statements for a transaction.

Cheers
David

Gnanavel Shanmugam wrote:

Merge the two select statements like this and try:

SELECT t.trans_id as ID, s.id, s.transaction_id, s.table_name, s.op, s.data
  FROM pending_trans AS t
  JOIN dbmirror.pending_statement AS s ON (s.transaction_id = t.trans_id)
 WHERE t.fetched = false
 ORDER BY t.trans_id, s.id
 LIMIT 100;

If the above query works in the way you want, then you can also do the update using the same.

with regards,
S.Gnanavel

-Original Message-
From: [EMAIL PROTECTED]
Sent: Tue, 28 Jun 2005 14:37:34 +1200
To: pgsql-performance@postgresql.org
Subject: [PERFORM] How can I speed up this function?

We have the following function in our home grown mirroring package, but it isn't running as fast as we would like. We need to select statements from the pending_statement table, and we want to select all the statements for a single transaction (pending_trans) in one go (that is, we either select all the statements for a transaction, or none of them). We select as many blocks of statements as it takes to top the 100 statement limit (so if the last transaction we pull has enough statements to put our count at 110, we'll still take it, but then we're done).

Here is our function:

CREATE OR REPLACE FUNCTION dbmirror.get_pending()
  RETURNS SETOF dbmirror.pending_statement AS $BODY$
DECLARE
    count INT4;
    transaction RECORD;
    statement dbmirror.pending_statement;
BEGIN
    count := 0;
    FOR transaction IN SELECT t.trans_id AS id
            FROM pending_trans AS t WHERE fetched = false
            ORDER BY trans_id LIMIT 50
    LOOP
        UPDATE pending_trans SET fetched = true
            WHERE trans_id = transaction.id;
        FOR statement IN SELECT s.id, s.transaction_id, s.table_name, s.op, s.data
                FROM dbmirror.pending_statement AS s
                WHERE s.transaction_id = transaction.id
                ORDER BY s.id ASC
        LOOP
            count := count + 1;
            RETURN NEXT statement;
        END LOOP;
        IF count > 100 THEN
            EXIT;
        END IF;
    END LOOP;
    RETURN;
END;
$BODY$ LANGUAGE 'plpgsql' VOLATILE;

Table Schemas:

CREATE TABLE dbmirror.pending_trans (
    trans_id oid NOT NULL,
    fetched bool DEFAULT false,
    CONSTRAINT pending_trans_pkey PRIMARY KEY (trans_id)
) WITHOUT OIDS;

CREATE TABLE dbmirror.pending_statement (
    id oid NOT NULL DEFAULT nextval('dbmirror.statement_id_seq'::text),
    transaction_id oid NOT NULL,
    table_name text NOT NULL,
    op char NOT NULL,
    data text NOT NULL,
    CONSTRAINT pending_statement_pkey PRIMARY KEY (id)
) WITHOUT OIDS;

CREATE UNIQUE INDEX idx_stmt_tran_id_id ON dbmirror.pending_statement
    USING btree (transaction_id, id);

Postgres 8.0.1 on Linux.

Any Help would be greatly appreciated.

Regards
--
David Mitchell
Software Engineer
Telogis

---(end of broadcast)---
TIP 8: explain analyze is your friend
Re: [PERFORM] How can I speed up this function?
But in the function you are exiting the loop when the count hits 100. If you do not want to limit the statements then remove the limit clause from the query I've written.

with regards,
S.Gnanavel

-Original Message-
From: [EMAIL PROTECTED]
Sent: Tue, 28 Jun 2005 16:29:32 +1200
To: [EMAIL PROTECTED]
Subject: Re: [PERFORM] How can I speed up this function?

Hi Gnanavel,

Thanks, but that will only return at most 100 statements. If there is a transaction with 110 statements then this will not return all the statements for that transaction. We need to make sure that the function returns all the statements for a transaction.

Cheers
David

Gnanavel Shanmugam wrote:

Merge the two select statements like this and try:

SELECT t.trans_id as ID, s.id, s.transaction_id, s.table_name, s.op, s.data
  FROM pending_trans AS t
  JOIN dbmirror.pending_statement AS s ON (s.transaction_id = t.trans_id)
 WHERE t.fetched = false
 ORDER BY t.trans_id, s.id
 LIMIT 100;

If the above query works in the way you want, then you can also do the update using the same.

with regards,
S.Gnanavel

-Original Message-
From: [EMAIL PROTECTED]
Sent: Tue, 28 Jun 2005 14:37:34 +1200
To: pgsql-performance@postgresql.org
Subject: [PERFORM] How can I speed up this function?

We have the following function in our home grown mirroring package, but it isn't running as fast as we would like. We need to select statements from the pending_statement table, and we want to select all the statements for a single transaction (pending_trans) in one go (that is, we either select all the statements for a transaction, or none of them). We select as many blocks of statements as it takes to top the 100 statement limit (so if the last transaction we pull has enough statements to put our count at 110, we'll still take it, but then we're done).

Here is our function:

CREATE OR REPLACE FUNCTION dbmirror.get_pending()
  RETURNS SETOF dbmirror.pending_statement AS $BODY$
DECLARE
    count INT4;
    transaction RECORD;
    statement dbmirror.pending_statement;
BEGIN
    count := 0;
    FOR transaction IN SELECT t.trans_id AS id
            FROM pending_trans AS t WHERE fetched = false
            ORDER BY trans_id LIMIT 50
    LOOP
        UPDATE pending_trans SET fetched = true
            WHERE trans_id = transaction.id;
        FOR statement IN SELECT s.id, s.transaction_id, s.table_name, s.op, s.data
                FROM dbmirror.pending_statement AS s
                WHERE s.transaction_id = transaction.id
                ORDER BY s.id ASC
        LOOP
            count := count + 1;
            RETURN NEXT statement;
        END LOOP;
        IF count > 100 THEN
            EXIT;
        END IF;
    END LOOP;
    RETURN;
END;
$BODY$ LANGUAGE 'plpgsql' VOLATILE;

Table Schemas:

CREATE TABLE dbmirror.pending_trans (
    trans_id oid NOT NULL,
    fetched bool DEFAULT false,
    CONSTRAINT pending_trans_pkey PRIMARY KEY (trans_id)
) WITHOUT OIDS;

CREATE TABLE dbmirror.pending_statement (
    id oid NOT NULL DEFAULT nextval('dbmirror.statement_id_seq'::text),
    transaction_id oid NOT NULL,
    table_name text NOT NULL,
    op char NOT NULL,
    data text NOT NULL,
    CONSTRAINT pending_statement_pkey PRIMARY KEY (id)
) WITHOUT OIDS;

CREATE UNIQUE INDEX idx_stmt_tran_id_id ON dbmirror.pending_statement
    USING btree (transaction_id, id);

Postgres 8.0.1 on Linux.

Any Help would be greatly appreciated.

Regards
--
David Mitchell
Software Engineer
Telogis

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings
Re: [PERFORM] How can I speed up this function?
What's wrong with Slony?

David Mitchell wrote:

We have the following function in our home grown mirroring package, but it isn't running as fast as we would like. We need to select statements from the pending_statement table, and we want to select all the statements for a single transaction (pending_trans) in one go (that is, we either select all the statements for a transaction, or none of them). We select as many blocks of statements as it takes to top the 100 statement limit (so if the last transaction we pull has enough statements to put our count at 110, we'll still take it, but then we're done).

Here is our function:

CREATE OR REPLACE FUNCTION dbmirror.get_pending()
  RETURNS SETOF dbmirror.pending_statement AS $BODY$
DECLARE
    count INT4;
    transaction RECORD;
    statement dbmirror.pending_statement;
BEGIN
    count := 0;
    FOR transaction IN SELECT t.trans_id AS id
            FROM pending_trans AS t WHERE fetched = false
            ORDER BY trans_id LIMIT 50
    LOOP
        UPDATE pending_trans SET fetched = true
            WHERE trans_id = transaction.id;
        FOR statement IN SELECT s.id, s.transaction_id, s.table_name, s.op, s.data
                FROM dbmirror.pending_statement AS s
                WHERE s.transaction_id = transaction.id
                ORDER BY s.id ASC
        LOOP
            count := count + 1;
            RETURN NEXT statement;
        END LOOP;
        IF count > 100 THEN
            EXIT;
        END IF;
    END LOOP;
    RETURN;
END;
$BODY$ LANGUAGE 'plpgsql' VOLATILE;

Table Schemas:

CREATE TABLE dbmirror.pending_trans (
    trans_id oid NOT NULL,
    fetched bool DEFAULT false,
    CONSTRAINT pending_trans_pkey PRIMARY KEY (trans_id)
) WITHOUT OIDS;

CREATE TABLE dbmirror.pending_statement (
    id oid NOT NULL DEFAULT nextval('dbmirror.statement_id_seq'::text),
    transaction_id oid NOT NULL,
    table_name text NOT NULL,
    op char NOT NULL,
    data text NOT NULL,
    CONSTRAINT pending_statement_pkey PRIMARY KEY (id)
) WITHOUT OIDS;

CREATE UNIQUE INDEX idx_stmt_tran_id_id ON dbmirror.pending_statement
    USING btree (transaction_id, id);

Postgres 8.0.1 on Linux.

Any Help would be greatly appreciated.

Regards

---(end of broadcast)---
TIP 8: explain analyze is your friend
Re: [PERFORM] How can I speed up this function?
Christopher Kings-Lynne wrote: What's wrong with Slony? Because it's not multi-master. Our mirroring package is. -- David Mitchell Software Engineer Telogis ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [PERFORM] How can I speed up this function?
What's wrong with Slony? Because it's not multi-master. Our mirroring package is. I'm curious - how did you write a multi-master replication package in pgsql, when pgsql doesn't have 2 phase commits or any kind of distributed syncing or conflict resolution in a release version? Chris ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [PERFORM] How can I speed up this function?
David Mitchell wrote:

We have the following function in our home grown mirroring package, but it isn't running as fast as we would like. We need to select statements from the pending_statement table, and we want to select all the statements for a single transaction (pending_trans) in one go (that is, we either select all the statements for a transaction, or none of them). We select as many blocks of statements as it takes to top the 100 statement limit (so if the last transaction we pull has enough statements to put our count at 110, we'll still take it, but then we're done).

Here is our function:

CREATE OR REPLACE FUNCTION dbmirror.get_pending()
  RETURNS SETOF dbmirror.pending_statement AS $BODY$
DECLARE
    count INT4;
    transaction RECORD;
    statement dbmirror.pending_statement;
BEGIN
    count := 0;
    FOR transaction IN SELECT t.trans_id AS id
            FROM pending_trans AS t WHERE fetched = false
            ORDER BY trans_id LIMIT 50
    LOOP
        UPDATE pending_trans SET fetched = true
            WHERE trans_id = transaction.id;
        FOR statement IN SELECT s.id, s.transaction_id, s.table_name, s.op, s.data
                FROM dbmirror.pending_statement AS s
                WHERE s.transaction_id = transaction.id
                ORDER BY s.id ASC
        LOOP
            count := count + 1;
            RETURN NEXT statement;
        END LOOP;
        IF count > 100 THEN
            EXIT;
        END IF;
    END LOOP;
    RETURN;
END;
$BODY$ LANGUAGE 'plpgsql' VOLATILE;

David,

I'm still a newbie and it may not affect performance, but why are you aliasing the tables? Can you not simply use

FOR transaction IN SELECT trans_id
        FROM pending_trans
        WHERE fetched = false
        ORDER BY trans_id
        LIMIT 50

and

FOR statement IN SELECT id, transaction_id, table_name, op, data
        FROM dbmirror.pending_statement
        WHERE pending_statement.transaction_id = transaction.trans_id
        ORDER BY pending_statement.id

I am pretty sure that the ORDER BY is slowing down both of these queries. Since you are going to go through the whole table eventually, do you really need to sort the data at this point?

--
Kind Regards,
Keith

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [PERFORM] How can I speed up this function?
Christopher Kings-Lynne wrote: I'm curious - how did you write a multi-master replication package in pgsql, when pgsql doesn't have 2 phase commits or any kind of distributed syncing or conflict resolution in a release version? We didn't write it entirely in pgsql, there is a worker process that takes care of actually committing to the database. Cheers -- David Mitchell Software Engineer Telogis ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [PERFORM] How can I speed up this function?
Hi Keith, Unfortunately, we must have those sorts. The statements within a transaction must be executed on the slave in the same order as they were on the master, and similarly, transactions must also go in the same order. As for aliasing the tables, that is just a remnant from previous versions of the code. Thanks David Keith Worthington wrote: I'm still a newbie and it may not affect performance but why are you aliasing the tables? Can you not simply use FOR transaction IN SELECT trans_id FROM pending_trans WHERE fetched = false ORDER BY trans_id LIMIT 50 and FOR statement IN SELECT id, transaction_id, table_name, op, data FROM dbmirror.pending_statement WHERE pending_statement.transaction_id = transaction.trans_id ORDER BY pending_statement.id I am pretty sure that the ORDER BY is slowing down both of these queries. Since you are going to go through the whole table eventually do you really need to sort the data at this point? -- David Mitchell Software Engineer Telogis ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
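[Given those ordering constraints, a variant of Gnanavel's merged query that keeps whole transactions (rather than cutting off mid-transaction at 100 statements) might look like this sketch; it bounds the batch by transactions, as the LIMIT 50 in the original function does, and assumes pending_trans lives in the dbmirror schema:

  SELECT s.id, s.transaction_id, s.table_name, s.op, s.data
    FROM dbmirror.pending_statement AS s
    JOIN (SELECT trans_id
            FROM dbmirror.pending_trans
           WHERE fetched = false
           ORDER BY trans_id
           LIMIT 50) AS t ON s.transaction_id = t.trans_id
   ORDER BY s.transaction_id, s.id;

The outer ORDER BY matches the idx_stmt_tran_id_id index on (transaction_id, id), so both the transaction order and the per-transaction statement order are preserved in one statement.]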
Re: [PERFORM] How can I speed up this function?
Merge the two select statements like this and try:

SELECT t.trans_id as ID, s.id, s.transaction_id, s.table_name, s.op, s.data
  FROM pending_trans AS t
  JOIN dbmirror.pending_statement AS s ON (s.transaction_id = t.trans_id)
 WHERE t.fetched = false
 ORDER BY t.trans_id, s.id
 LIMIT 100;

If the above query works in the way you want, then you can also do the update using the same.

with regards,
S.Gnanavel

-Original Message-
From: [EMAIL PROTECTED]
Sent: Tue, 28 Jun 2005 14:37:34 +1200
To: pgsql-performance@postgresql.org
Subject: [PERFORM] How can I speed up this function?

We have the following function in our home grown mirroring package, but it isn't running as fast as we would like. We need to select statements from the pending_statement table, and we want to select all the statements for a single transaction (pending_trans) in one go (that is, we either select all the statements for a transaction, or none of them). We select as many blocks of statements as it takes to top the 100 statement limit (so if the last transaction we pull has enough statements to put our count at 110, we'll still take it, but then we're done).

Here is our function:

CREATE OR REPLACE FUNCTION dbmirror.get_pending()
  RETURNS SETOF dbmirror.pending_statement AS $BODY$
DECLARE
    count INT4;
    transaction RECORD;
    statement dbmirror.pending_statement;
BEGIN
    count := 0;
    FOR transaction IN SELECT t.trans_id AS id
            FROM pending_trans AS t WHERE fetched = false
            ORDER BY trans_id LIMIT 50
    LOOP
        UPDATE pending_trans SET fetched = true
            WHERE trans_id = transaction.id;
        FOR statement IN SELECT s.id, s.transaction_id, s.table_name, s.op, s.data
                FROM dbmirror.pending_statement AS s
                WHERE s.transaction_id = transaction.id
                ORDER BY s.id ASC
        LOOP
            count := count + 1;
            RETURN NEXT statement;
        END LOOP;
        IF count > 100 THEN
            EXIT;
        END IF;
    END LOOP;
    RETURN;
END;
$BODY$ LANGUAGE 'plpgsql' VOLATILE;

Table Schemas:

CREATE TABLE dbmirror.pending_trans (
    trans_id oid NOT NULL,
    fetched bool DEFAULT false,
    CONSTRAINT pending_trans_pkey PRIMARY KEY (trans_id)
) WITHOUT OIDS;

CREATE TABLE dbmirror.pending_statement (
    id oid NOT NULL DEFAULT nextval('dbmirror.statement_id_seq'::text),
    transaction_id oid NOT NULL,
    table_name text NOT NULL,
    op char NOT NULL,
    data text NOT NULL,
    CONSTRAINT pending_statement_pkey PRIMARY KEY (id)
) WITHOUT OIDS;

CREATE UNIQUE INDEX idx_stmt_tran_id_id ON dbmirror.pending_statement
    USING btree (transaction_id, id);

Postgres 8.0.1 on Linux.

Any Help would be greatly appreciated.

Regards
--
David Mitchell
Software Engineer
Telogis

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] How can I speed up this function?
Hi Gnanavel,

Thanks, but that will only return at most 100 statements. If there is a transaction with 110 statements then this will not return all the statements for that transaction. We need to make sure that the function returns all the statements for a transaction.

Cheers
David

Gnanavel Shanmugam wrote:

Merge the two select statements like this and try:

SELECT t.trans_id as ID, s.id, s.transaction_id, s.table_name, s.op, s.data
  FROM pending_trans AS t
  JOIN dbmirror.pending_statement AS s ON (s.transaction_id = t.trans_id)
 WHERE t.fetched = false
 ORDER BY t.trans_id, s.id
 LIMIT 100;

If the above query works in the way you want, then you can also do the update using the same.

with regards,
S.Gnanavel

-Original Message-
From: [EMAIL PROTECTED]
Sent: Tue, 28 Jun 2005 14:37:34 +1200
To: pgsql-performance@postgresql.org
Subject: [PERFORM] How can I speed up this function?

We have the following function in our home grown mirroring package, but it isn't running as fast as we would like. We need to select statements from the pending_statement table, and we want to select all the statements for a single transaction (pending_trans) in one go (that is, we either select all the statements for a transaction, or none of them). We select as many blocks of statements as it takes to top the 100 statement limit (so if the last transaction we pull has enough statements to put our count at 110, we'll still take it, but then we're done).

Here is our function:

CREATE OR REPLACE FUNCTION dbmirror.get_pending()
  RETURNS SETOF dbmirror.pending_statement AS $BODY$
DECLARE
    count INT4;
    transaction RECORD;
    statement dbmirror.pending_statement;
BEGIN
    count := 0;
    FOR transaction IN SELECT t.trans_id AS id
            FROM pending_trans AS t WHERE fetched = false
            ORDER BY trans_id LIMIT 50
    LOOP
        UPDATE pending_trans SET fetched = true
            WHERE trans_id = transaction.id;
        FOR statement IN SELECT s.id, s.transaction_id, s.table_name, s.op, s.data
                FROM dbmirror.pending_statement AS s
                WHERE s.transaction_id = transaction.id
                ORDER BY s.id ASC
        LOOP
            count := count + 1;
            RETURN NEXT statement;
        END LOOP;
        IF count > 100 THEN
            EXIT;
        END IF;
    END LOOP;
    RETURN;
END;
$BODY$ LANGUAGE 'plpgsql' VOLATILE;

Table Schemas:

CREATE TABLE dbmirror.pending_trans (
    trans_id oid NOT NULL,
    fetched bool DEFAULT false,
    CONSTRAINT pending_trans_pkey PRIMARY KEY (trans_id)
) WITHOUT OIDS;

CREATE TABLE dbmirror.pending_statement (
    id oid NOT NULL DEFAULT nextval('dbmirror.statement_id_seq'::text),
    transaction_id oid NOT NULL,
    table_name text NOT NULL,
    op char NOT NULL,
    data text NOT NULL,
    CONSTRAINT pending_statement_pkey PRIMARY KEY (id)
) WITHOUT OIDS;

CREATE UNIQUE INDEX idx_stmt_tran_id_id ON dbmirror.pending_statement
    USING btree (transaction_id, id);

Postgres 8.0.1 on Linux.

Any Help would be greatly appreciated.

Regards
--
David Mitchell
Software Engineer
Telogis

---(end of broadcast)---
TIP 8: explain analyze is your friend
Re: [PERFORM] How can an index be larger than a table
David,

> What also seems weird to me is that the control table has some unique indexes created on it, but the data_update_events table just has a unique constraint. Will postgres use an index in the background to enforce this constraint?

If you somehow have a unique constraint without a unique index, something is seriously broken. I suspect hacking of system tables. Otherwise, it sounds like you have index bloat due to mass deletions. Run REINDEX, or, preferably, VACUUM FULL and then REINDEX.

--
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
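[To see whether bloat is the problem, compare the index's page count with the table's before and after the rebuild; a sketch, using the table name from this thread (relpages is in 8 kB pages):

  SELECT c2.relname AS index_name,
         c2.relpages AS index_pages,
         c1.relpages AS table_pages
    FROM pg_class c1
    JOIN pg_index i ON i.indrelid = c1.oid
    JOIN pg_class c2 ON c2.oid = i.indexrelid
   WHERE c1.relname = 'data_update_events';

  VACUUM FULL data_update_events;    -- exclusive lock; reclaims dead space
  REINDEX TABLE data_update_events;  -- rebuilds all of the table's indexes
]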
Re: [PERFORM] How can I make use of both CPUs in a dual processor
Alex wrote:
> Hi, we just got a new dual processor machine and I wonder if there is a way to utilize both processors. Our DB server is basically fully dedicated to postgres. (It's a dual AMD with 4 GB mem.)
>
> I have a batch job that periodically loads about 8 million records into a table. For this I drop the indices, truncate the table, use COPY to insert the data, recreate the indices (4 indices), and vacuum the table. That is all done through a perl batch job.
>
> While I am doing this, I noticed that only one CPU is really used.
>
> So here are my questions:
>
> Is there a way to utilize both CPUs?

For postgres, you get a max of 1 CPU per connection, so to use both you need 2 connections.

> Is it possible to split up the import file and run 2 copy processes?
> Is it possible to create 2 indices at the same time?

You'd want to be a little careful. Postgres uses work_mem for vacuum and index creation, so if you have 2 processes doing it, just make sure you aren't running out of RAM and going to swap.

> Would I actually gain anything from that, or is the bottleneck somewhere else?

More likely, the bottleneck would be disk I/O. Simply because it is almost always disk I/O. However, without knowing your configuration, how much CPU is used during the operation, etc., it's hard to say.

> (perl is a given here for the batch job)
>
> If anyone has some experience or ideas... any hints or help on this would be appreciated.
>
> Thanks
> Alex

Sorry I wasn't a lot of help. You should probably post your postgres version, and more information about how much CPU load there is while your load is running.

John
=:-
Re: [PERFORM] How can I make use of both CPUs in a dual processor
Thanks John. Well, as I mentioned, I have a dual AMD Opteron 64 2.4 GHz, 15k rpm SCSI disks, and 4 GB of memory. Disks are pretty fast and memory should be more than enough. Currently we don't have many concurrent connections.

I run PG 8.0.1 on Fedora Core 3.

When I now run the batch job, one CPU runs in the 80-90% range, the other at 5-10% max.

John A Meinel wrote:
> Alex wrote:
> > Hi, we just got a new dual processor machine and I wonder if there is a way to utilize both processors. Our DB server is basically fully dedicated to postgres. (It's a dual AMD with 4 GB mem.) I have a batch job that periodically loads about 8 million records into a table. For this I drop the indices, truncate the table, use COPY to insert the data, recreate the indices (4 indices), and vacuum the table. That is all done through a perl batch job. While I am doing this, I noticed that only one CPU is really used. So here are my questions: Is there a way to utilize both CPUs?
>
> For postgres, you get a max of 1 CPU per connection, so to use both you need 2 connections.
>
> > Is it possible to split up the import file and run 2 copy processes? Is it possible to create 2 indices at the same time?
>
> You'd want to be a little careful. Postgres uses work_mem for vacuum and index creation, so if you have 2 processes doing it, just make sure you aren't running out of RAM and going to swap.
>
> > Would I actually gain anything from that, or is the bottleneck somewhere else?
>
> More likely, the bottleneck would be disk I/O. Simply because it is almost always disk I/O. However, without knowing your configuration, how much CPU is used during the operation, etc., it's hard to say.
>
> > (perl is a given here for the batch job) If anyone has some experience or ideas... any hints or help on this would be appreciated. Thanks Alex
>
> Sorry I wasn't a lot of help. You should probably post your postgres version, and more information about how much CPU load there is while your load is running.
>
> John
> =:-

---(end of broadcast)---
TIP 8: explain analyze is your friend
Re: [PERFORM] How can I make use of both CPUs in a dual processor
> Thanks John. Well, as I mentioned, I have a dual AMD Opteron 64 2.4 GHz, 15k rpm SCSI disks, and 4 GB of memory. Disks are pretty fast and memory should be more than enough. Currently we don't have many concurrent connections.
>
> I run PG 8.0.1 on Fedora Core 3.
>
> When I now run the batch job, one CPU runs in the 80-90% range, the other at 5-10% max.

If possible, split up your job into two workloads that can be run concurrently. Open up two connections on the client, one for each workload. To be 100% sure they get delegated to separate processors, let the first connection start working before opening the second one (should be easy enough to test from the terminal)... this should pretty much guarantee your batch processes get delegated to different processors. The beauty of pg is that it lets the o/s handle things that should be the o/s's job... don't forget the bgwriter/stats collector processes are also in the mix. Your o/s is probably already doing a pretty good job of delegating work.

Even with fast disks, batch data management jobs are rarely CPU bound, so you might not see much difference in the total run time (splitting your batch might actually increase the run time, or reduce responsiveness to other connections). Never hurts to test that, though.

Merlin

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings
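[For the COPY load specifically, the split can be done from the shell; a sketch, where the file, database and table names are made up:

  # split the input roughly in half, then load both parts over two connections
  split -l 4000000 records.csv part_
  psql -d mydb -c "\copy big_table from 'part_aa'" &
  psql -d mydb -c "\copy big_table from 'part_ab'" &
  wait   # each COPY runs in its own backend, so each can use its own CPU
]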
Re: [PERFORM] How can I make use of both CPUs in a dual processor
You can wait for processes to finish as follows:

# launch 3 processes in the background
sh -c './build_indexes1.sh' &
PID1=$!
sh -c './build_indexes2.sh' &
PID2=$!
sh -c './build_indexes3.sh' &
PID3=$!
# then wait for them
wait $PID1
wait $PID2
wait $PID3
# continue

($! is the PID of the most recent background process, so each script must be launched with a trailing &.)

My feeling is that doing so should generally reduce the overall processing time, but if there are contention problems then it could conceivably get much worse.

regards
Iain

- Original Message -
From: Alex [EMAIL PROTECTED]
To: John A Meinel [EMAIL PROTECTED]
Cc: pgsql-performance@postgresql.org
Sent: Thursday, February 10, 2005 12:00 AM
Subject: Re: [PERFORM] How can I make use of both CPUs in a dual processor

> Thanks John. Well, as I mentioned, I have a dual AMD Opteron 64 2.4 GHz, 15k rpm SCSI disks, and 4 GB of memory. Disks are pretty fast and memory should be more than enough. Currently we don't have many concurrent connections. I run PG 8.0.1 on Fedora Core 3. When I now run the batch job, one CPU runs in the 80-90% range, the other at 5-10% max.
>
> John A Meinel wrote:
> > Alex wrote:
> > > Hi, we just got a new dual processor machine and I wonder if there is a way to utilize both processors. Our DB server is basically fully dedicated to postgres. (It's a dual AMD with 4 GB mem.) I have a batch job that periodically loads about 8 million records into a table. For this I drop the indices, truncate the table, use COPY to insert the data, recreate the indices (4 indices), and vacuum the table. That is all done through a perl batch job. While I am doing this, I noticed that only one CPU is really used. So here are my questions: Is there a way to utilize both CPUs?
> >
> > For postgres, you get a max of 1 CPU per connection, so to use both you need 2 connections.
> >
> > > Is it possible to split up the import file and run 2 copy processes? Is it possible to create 2 indices at the same time?
> >
> > You'd want to be a little careful. Postgres uses work_mem for vacuum and index creation, so if you have 2 processes doing it, just make sure you aren't running out of RAM and going to swap.
> >
> > > Would I actually gain anything from that, or is the bottleneck somewhere else?
> >
> > More likely, the bottleneck would be disk I/O. Simply because it is almost always disk I/O. However, without knowing your configuration, how much CPU is used during the operation, etc., it's hard to say.
> >
> > > (perl is a given here for the batch job) If anyone has some experience or ideas... any hints or help on this would be appreciated. Thanks Alex
> >
> > Sorry I wasn't a lot of help. You should probably post your postgres version, and more information about how much CPU load there is while your load is running.
> >
> > John
> > =:-

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] How can one see what queries are running withing a
On Sat, Jun 26, 2004 at 04:58:16PM +0800, Christopher Kings-Lynne wrote:
> > Have a look in contrib/ at the pg_who application... you will see the processes, the queries, and the CPU. ;o)
>
> Even easier:
>
> SELECT * FROM pg_stat_activity;
>
> As a superuser.

Thanks! That works as needed!

Best regards,
Pascal

---(end of broadcast)---
TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] How can one see what queries are running withing a postgres instance?
Sorry, it's not in the contrib folder of PostgreSQL after all... but you will find it on gborg.postgresql.org!

Regards,

On Friday 25 June 2004 at 22:51, Hervé Piedvache wrote:
> Have a look in contrib/ at the pg_who application... you will see the processes, the queries, and the CPU. ;o)
>
> Regards,
>
> On Friday 25 June 2004 at 21:37, P.A.M. van Dam wrote:
> > Hi!
> >
> > I'd like to know if there is a way to see what queries are running within a certain postgres instance and how much resources (cpu/memory) etc. they are using. Right now it's impossible to see what is happening within postgres when its binaries are using 100% CPU.
> >
> > In Sybase there is a command which lets you view what 'processes' are running within the server and how much cpu (according to Sybase) they are using. It also provides you with a stored procedure to kill off some badly behaving queries.
> >
> > How can one do this within postgres?
> >
> > Thanks in advance!
> >
> > Best regards,
> > Pascal
> >
> > ---(end of broadcast)---
> > TIP 4: Don't 'kill -9' the postmaster

--
Bill Footcow

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [PERFORM] How can one see what queries are running withing a
> Have a look in contrib/ at the pg_who application... you will see the processes, the queries, and the CPU. ;o)

Even easier:

SELECT * FROM pg_stat_activity;

As a superuser.

Chris

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [PERFORM] How can one see what queries are running withing a
Christopher Kings-Lynne [EMAIL PROTECTED] writes: Even easier: SELECT * FROM pg_stat_activity; But note you must enable stats_command_string to make this very useful. regards, tom lane ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
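[i.e. in postgresql.conf, followed by a server reload; a sketch, with the parameter and column names of the 7.4/8.0 era:

  # postgresql.conf
  stats_start_collector = true   # on by default
  stats_command_string = true    # off by default; needed to see query text

  -- then, connected as a superuser:
  SELECT procpid, usename, current_query FROM pg_stat_activity;

Without stats_command_string, current_query stays empty (or reads "command string not enabled").]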
Re: [PERFORM] How can one see what queries are running withing a postgres instance?
Have a look in contrib/ at the pg_who application... you will see the processes, the queries, and the CPU. ;o)

Regards,

On Friday 25 June 2004 at 21:37, P.A.M. van Dam wrote:
> Hi!
>
> I'd like to know if there is a way to see what queries are running within a certain postgres instance and how much resources (cpu/memory) etc. they are using. Right now it's impossible to see what is happening within postgres when its binaries are using 100% CPU.
>
> In Sybase there is a command which lets you view what 'processes' are running within the server and how much cpu (according to Sybase) they are using. It also provides you with a stored procedure to kill off some badly behaving queries.
>
> How can one do this within postgres?
>
> Thanks in advance!
>
> Best regards,
> Pascal
>
> ---(end of broadcast)---
> TIP 4: Don't 'kill -9' the postmaster

--
Bill Footcow

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [PERFORM] How can I Improve performance in Solaris?
On Wed, 13 Aug 2003, ingrid martinez wrote:

> Flows table looks like this:
>
>                 Table "flows"
>         Column        |           Type           | Modifiers
>  ---------------------+--------------------------+-----------
>   flidload            | bigint                   | not null
>   firsttime           | bigint                   |
>   fldestpeeraddress   | character varying(30)    |
>   fldesttransaddress  | bigint                   |
>   fldesttranstype     | smallint                 |
>   fldfromoctets       | bigint                   |
>   fldscodepoint       | smallint                 |
>   fldtooctets         | bigint                   |
>   flfrompdus          | bigint                   |
>   flid                | text                     |
>   flidrule            | bigint                   |
>   flsourcepeeraddress | character varying(30)    |
>   flsourcetransaddress| bigint                   |
>   flsourcetranstype   | smallint                 |
>   fltime              | timestamp with time zone |
>   fltopdus            | bigint                   |
>   lasttime            | bigint                   |
>   sourceinterface     | smallint                 |
>   destinterface       | smallint                 |
>   sourceasn           | smallint                 |
>   destasn             | smallint                 |
>
> Primary key: flows_pkey

Which columns are in the pkey?

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [PERFORM] How can I Improve performance in Solaris?
The query executed is only inserts. I use a batch of 300 and then commit.

insert into FLOWS values(?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)

and postgresql.conf looks like this:

#
# PostgreSQL configuration file
# -----------------------------
#
# This file consists of lines of the form
#
#   name = value
#
# (The '=' is optional.) White space is collapsed, comments are
# introduced by '#' anywhere on a line. The complete list of option
# names and allowed values can be found in the PostgreSQL
# documentation. The commented-out settings shown in this file
# represent the default values.
#
# Any option can also be given as a command line switch to the
# postmaster, e.g., 'postmaster -c log_connections=on'. Some options
# can be changed at run-time with the 'SET' SQL command.

#
# Connection Parameters
#
#tcpip_socket = false
#ssl = false
#max_connections = 32
#port = 5432
#hostname_lookup = false
#show_source_port = false
#unix_socket_directory = ''
#unix_socket_group = ''
#unix_socket_permissions = 0777
#virtual_host = ''
#krb_server_keyfile = ''

#
# Shared Memory Size
#
#shared_buffers = 64            # 2*max_connections, min 16
#max_fsm_relations = 100        # min 10, fsm is free space map
#max_fsm_pages = 10000          # min 1000, fsm is free space map
#max_locks_per_transaction = 64 # min 10
#wal_buffers = 8                # min 4

#
# Non-shared Memory Sizes
#
#sort_mem = 512                 # min 32
#vacuum_mem = 8192              # min 1024

#
# Write-ahead log (WAL)
#
#wal_files = 0                  # range 0-64
#wal_sync_method = fsync        # the default varies across platforms:
                                # fsync, fdatasync, open_sync, or open_datasync
#wal_debug = 0                  # range 0-16
#commit_delay = 0               # range 0-100000
#commit_siblings = 5            # range 1-1000
#checkpoint_segments = 3        # in logfile segments (16MB each), min 1
#checkpoint_timeout = 300       # in seconds, range 30-3600
#fsync = true

#
# Optimizer Parameters
#
#enable_seqscan = true
#enable_indexscan = true
#enable_tidscan = true
#enable_sort = true
#enable_nestloop = true
#enable_mergejoin = true
#enable_hashjoin = true
#ksqo = false
#effective_cache_size = 1000    # default in 8k pages
#random_page_cost = 4
#cpu_tuple_cost = 0.01
#cpu_index_tuple_cost = 0.001
#cpu_operator_cost = 0.0025

#
# GEQO Optimizer Parameters
#
#geqo = true
#geqo_selection_bias = 2.0      # range 1.5-2.0
#geqo_threshold = 11
#geqo_pool_size = 0             # default based on #tables in query, range 128-1024
#geqo_effort = 1
#geqo_generations = 0
#geqo_random_seed = -1          # auto-compute seed

#
# Debug display
#
#silent_mode = false
#log_connections = false
#log_timestamp = false
#log_pid = false
#debug_level = 0                # range 0-16
#debug_print_query = false
#debug_print_parse = false
#debug_print_rewritten = false
#debug_print_plan = false
#debug_pretty_print = false
# requires USE_ASSERT_CHECKING
#debug_assertions = true

#
# Syslog
#
# requires ENABLE_SYSLOG
#syslog = 0                     # range 0-2
#syslog_facility = 'LOCAL0'
#syslog_ident = 'postgres'

#
# Statistics
#
#show_parser_stats = false
#show_planner_stats = false
#show_executor_stats = false
#show_query_stats = false
# requires BTREE_BUILD_STATS
#show_btree_build_stats = false

#
# Access statistics collection
#
#stats_start_collector = true
#stats_reset_on_server_start = true
#stats_command_string = false
#stats_row_level = false
#stats_block_level = false

#
# Lock Tracing
#
#trace_notify = false
# requires LOCK_DEBUG
#trace_locks = false
#trace_userlocks = false
#trace_lwlocks = false
#debug_deadlocks = false
#trace_lock_oidmin = 16384
#trace_lock_table = 0

#
# Misc
#
#dynamic_library_path = '$libdir'
#australian_timezones = false
#authentication_timeout = 60    # min 1, max 600
#deadlock_timeout = 1000
#default_transaction_isolation = 'read committed'
#max_expr_depth = 10000         # min 10
#max_files_per_process = 1000   # min 25
#password_encryption = false
#sql_inheritance = true
#transform_null_equals = false

- Original Message -
From: Andrew Sullivan [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Wednesday, August 13, 2003 9:32 AM
Subject: Re: [PERFORM] How can I Improve performance in Solaris?

> On Wed, Aug 13, 2003 at 09:03:31AM -0500, ingrid martinez wrote:
> > I want to know how I can improve the performance of postgres. I have a java class that inserts registers every 30 min, but it is very slow.
>
> What does the query do? How is postgres configured?
>
> A
> --
> Andrew Sullivan | 204-4141 Yonge Street
> Liberty RMS | Toronto, Ontario Canada
> [EMAIL PROTECTED] | M2P 2A8
> +1 416 646 3304 x110

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [PERFORM] How can I Improve performance in Solaris?
More than likely you are suffering from an affliction known as type mismatch. This is listed as tip 9 here on the performance list (funny, it was sent at the bottom of your reply :-)

What happens is that when you do:

select * from some_table where id=123;

where id is a bigint, the query planner assumes you must want 123 cast to int4, which doesn't match int8 (aka bigint), and uses a sequential scan to access that row. I.e. it reads the whole table in.

You can force the planner to do the right thing here in a couple of ways:

select * from some_table where id=123::bigint;

-- OR --

select * from some_table where id='123';

On Wed, 13 Aug 2003, ingrid martinez wrote:
> the primary key is flidload
>
> ---(end of broadcast)---
> TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
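[The effect is easy to confirm with EXPLAIN; a sketch against the flows table from this thread, with approximate plan output for a pre-8.0 server (the int8 literal issue was fixed in 8.0):

  EXPLAIN SELECT * FROM flows WHERE flidload = 123;
  -- Seq Scan on flows  Filter: (flidload = 123)

  EXPLAIN SELECT * FROM flows WHERE flidload = 123::bigint;
  -- Index Scan using flows_pkey on flows  Index Cond: (flidload = 123::bigint)
]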
Re: [PERFORM] How can I Improve performance in Solaris?
On Wed, Aug 13, 2003 at 10:17:45AM -0500, ingrid martinez wrote:
> The query executed is only inserts. I use a batch of 300 and then commit.
>
> insert into FLOWS values(?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)

Are there any foreign keys, etc.?

> and postgresql.conf looks like this
> [ . . . ]

The configuration is the default. You'll certainly want to increase the shared memory and fiddle with some of the other usual pieces. There is some discussion of the config file at http://www.varlena.com/varlena/GeneralBits/Tidbits/annotated_conf_e.html.

Unless the INSERTs are causing SELECTs, though, I can't see what exactly might be causing you so much difficulty.

A
--
Andrew Sullivan | 204-4141 Yonge Street
Liberty RMS | Toronto, Ontario Canada
[EMAIL PROTECTED] | M2P 2A8
+1 416 646 3304 x110

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster
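[As a concrete starting point, the handful of settings usually worth raising from those defaults might look like this sketch; the parameter names are the 7.2-era ones matching the posted file, and the values assume a few hundred MB of RAM can go to Postgres:

  shared_buffers = 4096         # 8 kB pages => 32 MB, up from the default 64
  sort_mem = 4096               # kB per sort
  wal_buffers = 16
  checkpoint_segments = 10      # fewer checkpoint pauses during INSERT batches
  effective_cache_size = 32768  # 8 kB pages => 256 MB, if the OS cache is about that size
]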
Re: [PERFORM] How can I Improve performance in Solaris?
the primary key is flidload

- Original Message -
From: scott.marlowe [EMAIL PROTECTED]
To: ingrid martinez [EMAIL PROTECTED]
Cc: Andrew Sullivan [EMAIL PROTECTED]; [EMAIL PROTECTED]
Sent: Wednesday, August 13, 2003 11:47 AM
Subject: Re: [PERFORM] How can I Improve performance in Solaris?

> On Wed, 13 Aug 2003, ingrid martinez wrote:
> > Flows table looks like this:
> >
> >                 Table "flows"
> >         Column        |           Type           | Modifiers
> >  ---------------------+--------------------------+-----------
> >   flidload            | bigint                   | not null
> >   firsttime           | bigint                   |
> >   fldestpeeraddress   | character varying(30)    |
> >   fldesttransaddress  | bigint                   |
> >   fldesttranstype     | smallint                 |
> >   fldfromoctets       | bigint                   |
> >   fldscodepoint       | smallint                 |
> >   fldtooctets         | bigint                   |
> >   flfrompdus          | bigint                   |
> >   flid                | text                     |
> >   flidrule            | bigint                   |
> >   flsourcepeeraddress | character varying(30)    |
> >   flsourcetransaddress| bigint                   |
> >   flsourcetranstype   | smallint                 |
> >   fltime              | timestamp with time zone |
> >   fltopdus            | bigint                   |
> >   lasttime            | bigint                   |
> >   sourceinterface     | smallint                 |
> >   destinterface       | smallint                 |
> >   sourceasn           | smallint                 |
> >   destasn             | smallint                 |
> >
> > Primary key: flows_pkey
>
> Which columns are in the pkey?

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match