Re: [GENERAL] Join Bad Performance on different data types
2014-03-05 6:53 GMT+02:00 Adarsh Sharma eddy.ada...@gmail.com: Anyhow working on finding the root cause.

I would do the following:

1. Check your `default_statistics_target`, like:

       SHOW default_statistics_target;

2. Your previously posted `EXPLAIN ANALYZE` with `set enable_seqscan=off` shows a significant skew in the estimated number of rows. I would bump `default_statistics_target` up to 500 (modify your `postgresql.conf` and reload), re-analyze all tables and check the performance again. If you don't want to do it for the whole cluster, then increase it for individual columns. I think increasing it to 500 (or higher) for the columns containing arrays will be beneficial in any case:

       ALTER TABLE graph5 ALTER detail_id SET STATISTICS 1000;
       ALTER TABLE graph3 ALTER id_list SET STATISTICS 1000;
       ALTER TABLE graph2 ALTER targeting_ids SET STATISTICS 1000;

   Of course, you need to `ANALYZE` (or `VACUUM ANALYZE`) the tables after this change.

3. Next, I've reformatted your query the following way (easier for me):

       SELECT DISTINCT glt.id || ':' || gtt.name AS id_type,
              glt.name, latitude, longitude, radius,
              latitude || ',' || longitude AS latlon
         FROM graph5 td
         JOIN graph6 gtt ON gtt.id = td.entity_type_id
         JOIN graph2 gcr ON gcr.id = ANY(td.detail_id)
         JOIN graph3 gtd ON gtd.id = gcr.t_ids
         JOIN graph1 glt ON glt.id = ANY(gtd.id_list);

   I've ordered the joins (well, tables) in the way they depend on each other. I would try moving the first join into a CTE in order to isolate its results and help the optimizer better estimate the rows. Something like this:

       WITH first AS (
           SELECT td.*, gtt.*
             FROM graph5 td
             JOIN graph6 gtt ON gtt.id = td.entity_type_id
       )
       SELECT DISTINCT glt.id || ':' || first.name AS id_type,
              glt.name, latitude, longitude, radius,
              latitude || ',' || longitude AS latlon
         FROM first
         JOIN graph2 gcr ON gcr.id = ANY(first.detail_id)
         JOIN graph3 gtd ON gtd.id = gcr.t_ids
         JOIN graph1 glt ON glt.id = ANY(gtd.id_list);

4. Try disabling materialization, like `set enable_material=off`.

-- 
Victor Y. Yegorov
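To verify the per-column statistics change took effect before re-analyzing, one can inspect `pg_attribute` directly. This is a hedged sketch using the table names from the message above; `attstattarget` of -1 means the column still follows `default_statistics_target`:

```sql
-- Check the per-column statistics targets set above
-- (-1 = column still uses default_statistics_target).
SELECT attname, attstattarget
  FROM pg_attribute
 WHERE attrelid = 'graph5'::regclass
   AND attnum > 0
   AND NOT attisdropped;

-- After ANALYZE, re-check whether estimated rows now track actual rows:
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM graph5 td
  JOIN graph2 gcr ON gcr.id = ANY(td.detail_id);
```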
[GENERAL] too-many-open-files log file entries when vacuuming under solaris
Dear all,

recently we have seen a lot of occurrences of "out of file descriptors: Too many open files; release and retry" in our postgres log files, every night when a vacuum full analyze is run.

After some digging into the code we found that postgres potentially tries to open as many as a pre-determined maximum number of file descriptors when vacuuming. That number is the lesser of the one from the configuration file (max_files_per_process) and the one determined at start-up by src/backend/storage/file/fd.c::count_usable_fds(). Under Solaris now, it would seem, finding out that number via dup(0) is not sufficient, as the actual number of interest might be/is the number of usable stream file descriptors (up until Solaris 10, at least). Also, closing the least recently used file descriptor might therefore not solve a temporary problem (as something below 256 is needed).

Now, this can be fixed by setting/leaving the descriptor limit at 256 or changing the postgresql.conf setting accordingly. Still, the function for determining the max number is not working as intended under Solaris, it would appear. One might try using fopen() instead of dup(), or have different handling for stream and normal file descriptors (including moving standard file descriptors to above 255 to leave room for stream ones). Maybe, though, all this is not worth the effort; then it might perhaps be a good idea to mention the limitations/specialties in the platform-specific notes (e.g. have ulimit at 256 maximum).

cheers
hardy

Hartmut Raschick
Network Management Solutions, KEYMILE GmbH
[GENERAL] Mysterious DB reset
I have a PostgreSQL 9.2.4 database containing real-time tracking data for our aircraft for the past week (theoretically). It is populated by two different processes: one that runs every few minutes, retrieving data from a number of sources and storing it in the DB, and one that has an "always on" connection to the DB, streaming data into the database in realtime (often several records per second). To keep the database size manageable I have a cron job that runs every evening to delete all records that are more than a week old, after archiving a subset of them in permanent storage.

This morning my boss e-mailed me, complaining that only a couple of aircraft were showing up in the list (SELECT distinct(tail) FROM data being the command that populates the list). Upon looking at the data I saw that it only went back to 4am this morning, rather than the week I was expecting. My first thought was "Oh, I must have a typo in my cleanup routine, such that it is deleting all records rather than only those a week old, and it's just that no one has noticed until now". So I looked at that, but changing the delete to a select appeared to produce the proper results, in that no records were selected:

    DELETE FROM data WHERE pointtime < now() AT TIME ZONE 'UTC' - interval '7 days';

Then I noticed something even more odd. My database has an id column, which is defined as a SERIAL. As we all know, a serial is a monotonically increasing number that is not affected by deletes. However, the oldest record in my database, from 4am this morning, had an id of 1. Even though I KNOW there was data in the system yesterday. Even if my DELETE command was wrong and deleted ALL records, that shouldn't have reset the SERIAL column to 1! I also know that I have not been in the database mucking around with the sequence value - to be completely honest, I don't even know the exact command to reset it - I'd have to google it if I wanted to.

Also odd is that my cleanup script runs at 1am. I have records of there being new data in the database up to 3:51am, but the oldest record currently in the DB is from 4:45am (as specified by the default of now() on the column). So I know records were added after my delete command ran, but before this reset occurred.

So my question is, aside from someone going in and mucking about in the wee hours of the morning, what could possibly cause this behavior? What sort of event could cause all data to be deleted from the table, and the sequence to be reset? Especially while there is an active connection? Thanks for any ideas, however wild or off the wall :-)

---
Israel Brewster
Computer Support Technician II
Era Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7250 x7293
---
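One defensive pattern for a retention job like the one described above is to run the delete in an explicit transaction and look at the affected row count before committing. This is only a sketch; the table and column names (`data`, `pointtime`, `id`) are taken from the message, and the sanity check itself is an assumption about what "looks wrong" for this workload:

```sql
BEGIN;

-- Count what the retention rule would delete before committing it.
WITH doomed AS (
    DELETE FROM data
     WHERE pointtime < now() AT TIME ZONE 'UTC' - interval '7 days'
     RETURNING id
)
SELECT count(*) AS deleted_rows FROM doomed;

-- If deleted_rows looks wrong (e.g. nearly the whole table), ROLLBACK;
-- otherwise:
COMMIT;
```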
Re: [GENERAL] Mysterious DB reset
Israel Brewster wrote:
> So my question is, aside from someone going in and mucking about in the wee hours of the morning, what could possibly cause this behavior? What sort of event could cause all data to be deleted from the table, and the sequence to be reset? Especially while there is an active connection? Thanks for any ideas, however wild or off the wall :-)

Is this running off a NFS mount or something? I'm wondering about the filesystem getting unmounted and an empty copy of the database being in the mountpoint.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training Services

-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Mysterious DB reset
On 5 March 2014 18:22, Israel Brewster isr...@eraalaska.net wrote:
> I have a Postgresql 9.2.4 database containing real-time tracking data for our aircraft for the past week (theoretically). [...] Even if my DELETE command was wrong and deleted ALL records, that shouldn't have reset the SERIAL column to 1!
> [...]
> So my question is, aside from someone going in and mucking about in the wee hours of the morning, what could possibly cause this behavior? What sort of event could cause all data to be deleted from the table, and the sequence to be reset? Especially while there is an active connection?

That is odd. Even if it were an unlogged table, and there was a crash, the sequence wouldn't reset; and even if it were running in a very long-running transaction held open by a buggy connection pooler, the sequence would still progress, as it's immune to the effects of transactions.

So if all the data went missing, and the sequence reset, the only thing I can think of is that someone ran:

    TRUNCATE data RESTART IDENTITY;

or someone restored the table structure from a backup that deleted the original table. Do you log DDL? Was the table partitioned?

You should also really be on 9.2.7, although I can't think of any bug that's been fixed which could be responsible for this issue.

-- 
Thom
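Thom's "Do you log DDL?" question refers to the `log_statement` setting. A minimal postgresql.conf fragment might look like the following; note that plain `'ddl'` covers CREATE/ALTER/DROP only, while `'mod'` additionally captures TRUNCATE and other data-modifying statements, which matters for the scenario above (the `log_line_prefix` value is just one illustrative choice):

```
# postgresql.conf -- audit schema and data changes.
# 'ddl' logs CREATE/ALTER/DROP; 'mod' also logs INSERT/UPDATE/DELETE/TRUNCATE.
log_statement = 'mod'                 # none | ddl | mod | all
log_line_prefix = '%t [%p] %u@%d '    # timestamp, pid, user, database
```

A reload (`pg_ctl reload` or SIGHUP) is enough to apply these; no restart is needed.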
Re: [GENERAL] too-many-open-files log file entries when vacuuming under solaris
Raschick, Hartmut hartmut.rasch...@keymile.com writes:
> recently we have seen a lot of occurrences of "out of file descriptors: Too many open files; release and retry" in our postgres log files, every night when a vacuum full analyze is run. [...] Under Solaris now, it would seem, finding out that number via dup(0) is not sufficient, as the actual number of interest might be/is the number of usable stream file descriptors (up until Solaris 10, at least). Also, closing the least recently used file descriptor might therefore not solve a temporary problem (as something below 256 is needed).

TBH, this sounds like unfounded speculation. AFAIK a Postgres backend will not open anything but regular files after its initial startup. I'm not sure what a "stream" is on Solaris, but guessing that it refers to pipes or sockets, I don't think we have a problem with an OS restriction that those be below FD 256. In any case, if we did, it would presumably show up as errors, not release-and-retry events.

Our usual experience is that you get release-and-retry log messages when the OS is up against the system-wide open-file limit rather than the per-process limit (ie, the underlying error code is ENFILE not EMFILE). I don't know exactly how Solaris strerror() spells those codes, so it's difficult to tell from your reported log message which case is happening. If it is the system-wide limit that's at issue, then of course the dup(0) loop isn't likely to find it, and adjusting max_files_per_process (or maybe better, reducing max_connections) is the expected solution.

regards, tom lane
Re: [GENERAL] Mysterious DB reset
On 03/05/2014 10:22 AM, Israel Brewster wrote:
> I have a Postgresql 9.2.4 database containing real-time tracking data for our aircraft for the past week (theoretically). [...] So I looked at that, but changing the delete to a select appeared to produce the proper results, in that no records were selected:

Well it would, if the records only go back to 4 AM this morning. In other words, if no records exist before 4 AM today, no records exist before 7 days ago either, or am I missing something?

> DELETE FROM data WHERE pointtime < now() AT TIME ZONE 'UTC' - interval '7 days';
>
> Then I noticed something even more odd. My database has an id column, which is defined as a SERIAL. [...] I also know that I have not been in the database mucking around with the sequence value - to be completely honest, I don't even know the exact command to reset it - I'd have to google it if I wanted to.

A sequence is just a special table. So what does SELECT * from the sequence show?

> Also odd is that my cleanup script runs at 1am. I have records of there being new data in the database up to 3:51am, but the oldest record currently in the DB is from 4:45am (as specified by the default of now() on the column). So I know records were added after my delete command ran, but before this reset occurred.

I am not sure what you are calling the 'reset'? Did something happen between 3:51 AM and 4:45 AM? Also not sure why you call the 4:45 AM record the oldest, when you say you can identify records from 3:51 AM?

> So my question is, aside from someone going in and mucking about in the wee hours of the morning, what could possibly cause this behavior? What sort of event could cause all data to be deleted from the table, and the sequence to be reset? Especially while there is an active connection? Thanks for any ideas, however wild or off the wall :-)

What is in the Postgres/system logs for the time period(s) you mention?

-- 
Adrian Klaver
adrian.kla...@aklaver.com
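Adrian's "SELECT * from the sequence" suggestion, made concrete. The sequence name `data_id_seq` is an assumption (the default name for a SERIAL column `id` on a table `data`); `pg_get_serial_sequence()` resolves it without guessing:

```sql
-- Resolve the sequence backing the SERIAL column without guessing its name.
SELECT pg_get_serial_sequence('data', 'id');

-- Inspect the sequence state directly (on 9.2 a sequence can be
-- selected from like a one-row table):
SELECT last_value, is_called FROM data_id_seq;
```

A freshly created sequence shows `last_value = 1` with `is_called = false`, which is also what `TRUNCATE ... RESTART IDENTITY` leaves behind.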
Re: [GENERAL] execute table query in backend
On Tue, Mar 4, 2014 at 12:38 AM, Anh Pham atpham@gmail.com wrote:
> Hi, I am trying to extend the server backend by writing a new module. Basically, it tries to retrieve tuples from a specific table using some predefined qualifications (equivalent to a SELECT ... FROM ... WHERE ... client SQL statement). Is there any quick or efficient way to do this?

Assuming you're trying to write a server extension (and NOT a client application), this is where you can start:

http://www.postgresql.org/docs/9.3/static/server-programming.html

Specifically, an interface called the Server Programming Interface (chapter 44 on the above page) can be used to issue SQL from a server extension module. Read more about it here (the documentation also includes a few examples to get started with):

http://www.postgresql.org/docs/9.3/static/spi.html

-- 
Amit
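A minimal sketch of what such an SPI call looks like inside a C extension function. The table/column names (`mytable`, `status`) and the function name are placeholders, error handling is trimmed, and this only compiles against the PostgreSQL server headers (e.g. via PGXS), not standalone:

```c
#include "postgres.h"
#include "executor/spi.h"
#include "fmgr.h"

PG_MODULE_MAGIC;

PG_FUNCTION_INFO_V1(count_matching_rows);

/* Run a read-only query through SPI and return how many rows matched.
 * "mytable"/"status" are illustrative placeholder names. */
Datum
count_matching_rows(PG_FUNCTION_ARGS)
{
    uint64 nrows;

    if (SPI_connect() != SPI_OK_CONNECT)
        elog(ERROR, "SPI_connect failed");

    /* second argument true = read-only; third argument 0 = no row limit */
    if (SPI_execute("SELECT * FROM mytable WHERE status = 'active'",
                    true, 0) != SPI_OK_SELECT)
        elog(ERROR, "SPI_execute failed");

    nrows = SPI_processed;   /* rows in SPI_tuptable after the SELECT */

    SPI_finish();

    PG_RETURN_INT64((int64) nrows);
}
```

With the predefined qualifications mentioned in the question, `SPI_execute_with_args()` (or a prepared plan via `SPI_prepare()`/`SPI_execute_plan()`) avoids building the WHERE clause by string concatenation.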