Re: [GENERAL] Join Bad Performance on different data types

2014-03-05 Thread Victor Yegorov
2014-03-05 6:53 GMT+02:00 Adarsh Sharma eddy.ada...@gmail.com:

 Anyhow working on finding the root cause.


I would do the following:

1. Check your `default_statistics_target`, like:

SHOW default_statistics_target;

2. Your previously posted `EXPLAIN ANALYZE` with `set enable_seqscan=off`
shows a significant skew in the estimated number of rows. I would bump
`default_statistics_target` up to 500 (modify your `postgresql.conf` and
reload), re-analyze all tables and check the performance again.

If you don't want to do it for the whole cluster, then increase it for
individual columns. I think raising it to 500 (or higher) for the columns
containing arrays will be beneficial in any case:

ALTER TABLE graph5 ALTER detail_id SET STATISTICS 1000;
ALTER TABLE graph3 ALTER id_list  SET STATISTICS 1000;
ALTER TABLE graph2 ALTER targeting_ids SET STATISTICS 1000;

Of course, you need to `ANALYZE` (or `VACUUM ANALYZE`) the tables after
this change.
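
For example, something like this (a sketch only; the reload is needed only
if you changed `default_statistics_target` in `postgresql.conf`):

SELECT pg_reload_conf();
VACUUM ANALYZE graph5;
VACUUM ANALYZE graph3;
VACUUM ANALYZE graph2;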

3. Next, I've reformatted your query the following way (easier for me):

SELECT DISTINCT
   glt.id || ':' || gtt.name as id_type,
   glt.name,latitude,longitude,radius,
   latitude || ',' || longitude as latlon
  FROM graph5 td
  JOIN graph6 gtt ON gtt.id=td.entity_type_id
  JOIN graph2 gcr ON gcr.id=ANY(td.detail_id)
  JOIN graph3 gtd ON gtd.id=gcr.t_ids
  JOIN graph1 glt ON glt.id=ANY(gtd.id_list);

I've ordered the joins (well, tables) in the way they depend on each other.
I would try moving the first join into a CTE (which in 9.x is always
materialized, acting as an optimization fence) in order to isolate its
results and help the optimizer better estimate the rows. Something like this:

WITH first AS (
SELECT td.*,gtt.*
  FROM graph5 td
  JOIN graph6 gtt ON gtt.id=td.entity_type_id
)
SELECT DISTINCT
   glt.id || ':' || first.name as id_type,
   glt.name,latitude,longitude,radius,
   latitude || ',' || longitude as latlon
  FROM first
  JOIN graph2 gcr ON gcr.id=ANY(first.detail_id)
  JOIN graph3 gtd ON gtd.id=gcr.t_ids
  JOIN graph1 glt ON glt.id=ANY(gtd.id_list)

4. Try disabling materialization, like `set enable_material=off`.
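
For instance, per-session only (a sketch; it just re-runs the query from
point 3 with materialization disabled):

SET enable_material = off;
EXPLAIN (ANALYZE, BUFFERS)
SELECT DISTINCT
   glt.id || ':' || gtt.name as id_type,
   glt.name,latitude,longitude,radius,
   latitude || ',' || longitude as latlon
  FROM graph5 td
  JOIN graph6 gtt ON gtt.id=td.entity_type_id
  JOIN graph2 gcr ON gcr.id=ANY(td.detail_id)
  JOIN graph3 gtd ON gtd.id=gcr.t_ids
  JOIN graph1 glt ON glt.id=ANY(gtd.id_list);
RESET enable_material;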


-- 
Victor Y. Yegorov


[GENERAL] too-many-open-files log file entries when vacuuming under solaris

2014-03-05 Thread Raschick, Hartmut
Dear all,
recently we have seen a lot of occurrences of "out of file descriptors: Too
many open files; release and retry" in our postgres log files, every night
when a vacuum full analyze is run. After some digging into the code we found
that postgres potentially tries to open as many file descriptors as a
pre-determined maximum when vacuuming. That number is the lesser of the one
from the configuration file (max_files_per_process) and the one determined at
start-up by src/backend/storage/file/fd.c::count_usable_fds().

Under Solaris now, it would seem, finding out that number via dup(0) is not
sufficient, as the actual number of interest might be/is the number of usable
stream file descriptors (up until Solaris 10, at least). Also, closing the
least recently used file descriptor might therefore not solve a temporary
problem (as something below 256 is needed). Now, this can be fixed by
setting/leaving the descriptor limit at 256 or changing the postgresql.conf
setting accordingly.

Still, the function for determining the max number is not working as intended
under Solaris, it would appear. One might try using fopen() instead of dup(),
or have a different handling for stream and normal file descriptors (including
moving standard file descriptors to above 255 to leave room for stream ones).
Maybe though, all this is not worth the effort; then it might perhaps be a
good idea to mention the limitations/specialties in the platform-specific
notes (e.g. have ulimit at 256 maximum).

cheers
hardy


Hartmut Raschick
Network Management Solutions
---
KEYMILE GmbH
Wohlenbergstr. 3
D-30175 Hannover, Germany

Phone: +49 (0)511 6747-564
Fax:   +49 (0)511 6747-777
mailto:hartmut.rasch...@keymile.com
http://www.keymile.com




[GENERAL] Mysterious DB reset

2014-03-05 Thread Israel Brewster
I have a Postgresql 9.2.4 database containing real-time tracking data for our
aircraft for the past week (theoretically). It is populated by two different
processes: one that runs every few minutes, retrieving data from a number of
sources and storing it in the DB, and one that has an "always on" connection
to the DB streaming data into the database in realtime (often several records
per second). To keep the database size manageable I have a cron job that runs
every evening to delete all records that are more than a week old, after
archiving a subset of them in permanent storage.

This morning my boss e-mailed me, complaining that only a couple of aircraft
were showing up in the list (SELECT distinct(tail) FROM data being the command
that populates the list). Upon looking at the data I saw that it only went
back to 4am this morning, rather than the week I was expecting. My first
thought was "Oh, I must have a typo in my cleanup routine, such that it is
deleting all records rather than only those a week old, and it's just that no
one has noticed until now". So I looked at that, but changing the delete to a
select appeared to produce the proper results, in that no records were
selected:

DELETE FROM data WHERE pointtime < now() AT TIME ZONE 'UTC' - interval '7 days';

Then I noticed something even more odd. My database has an id column, which is
defined as a SERIAL. As we all know, a serial is a monotonically increasing
number that is not affected by deletes. However, the oldest record in my
database, from 4am this morning, had an id of 1. Even though I KNOW there was
data in the system yesterday. Even if my DELETE command was wrong and deleted
ALL records, that shouldn't have reset the SERIAL column to 1! I also know
that I have not been in the database mucking around with the sequence value -
to be completely honest, I don't even know the exact command to reset it - I'd
have to google it if I wanted to.

Also odd is that my cleanup script runs at 1am. I have records of there being
new data in the database up to 3:51am, but the oldest record currently in the
DB is from 4:45am (as specified by the default of now() on the column). So I
know records were added after my delete command ran, but before this reset
occurred.

So my question is, aside from someone going in and mucking about in the wee
hours of the morning, what could possibly cause this behavior? What sort of
event could cause all data to be deleted from the table, and the sequence to
be reset? Especially while there is an active connection? Thanks for any
ideas, however wild or off the wall :-)
---
Israel Brewster
Computer Support Technician II
Era Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7250 x7293
---




Re: [GENERAL] Mysterious DB reset

2014-03-05 Thread Alvaro Herrera
Israel Brewster wrote:

 So my question is, aside from someone going in and mucking about in the wee 
 hours of the morning, what could possibly cause this behavior? What sort of 
 event could cause all data to be deleted from the table, and the sequence to 
 be reset? Especially while there is an active connection? Thanks for any 
 ideas, however wild or off the wall :-)

Is this running off an NFS mount or something?  I'm wondering about the
filesystem getting unmounted and an empty copy of the database being in
the mountpoint.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [GENERAL] Mysterious DB reset

2014-03-05 Thread Thom Brown
On 5 March 2014 18:22, Israel Brewster isr...@eraalaska.net wrote:

 I have a Postgresql 9.2.4 database containing real-time tracking data for
 our aircraft for the past week (theoretically). It is populated by two
 different processes: one that runs every few minutes, retrieving data from
 a number of sources and storing it in the DB, and one that has an always
 on connection to the DB streaming data into the database in realtime
 (often several records per second). To keep the database size manageable I
 have a cron job that runs every evening to delete all records that are more
 than a week old, after archiving a subset of them in permanent storage.

 This morning my boss e-mailed me, complaining that only a couple of
 aircraft were showing up in the list (SELECT distinct(tail) FROM data being
 the command that populates the list). Upon looking at the data I saw that
 it only went back to 4am this morning, rather than the week I was
 expecting. My first thought was Oh, I must have a typo in my cleanup
 routine, such that it is deleting all records rather than only those a week
 old, and it's just that no one has noticed until now. So I looked at that,
 but changing the delete to a select appeared to produce the proper results,
 in that no records were selected:

 DELETE FROM data WHERE pointtime < now() AT TIME ZONE 'UTC' - interval '7
 days';

 Then I noticed something even more odd. My database has an id column,
 which is defined as a SERIAL. As we all know, a serial is a monotonically
 increasing number that is not affected by deletes. However, the oldest
 record in my database, from 4am this morning, had an id of 1. Even though I
 KNOW there was data in the system yesterday. Even if my DELETE command was
 wrong and deleted ALL records, that shouldn't have reset the SERIAL column
 to 1! I also know that I have not been in the database mucking around with
 the sequence value - to be completely honest, I don't even know the exact
 command to reset it - I'd have to google it if I wanted to.

 Also odd is that my cleanup script runs at 1am. I have records of there
 being new data in the database up to 3:51am, but the oldest record
 currently in the DB is from 4:45am (as specified by the default of now() on
 the column). So I know records were added after my delete command ran, but
 before this reset occurred.

 So my question is, aside from someone going in and mucking about in the
 wee hours of the morning, what could possibly cause this behavior? What
 sort of event could cause all data to be deleted from the table, and the
 sequence to be reset? Especially while there is an active connection?
 Thanks for any ideas, however wild or off the wall :-)


That is odd.  Even if it were an unlogged table, and there was a crash, the
sequence wouldn't reset, and even if it was running in a very long-running
transaction held open by a buggy connection pooler, the sequence would
still progress as it's immune to the effects of transactions.

So if all the data went missing, and the sequence reset, the only thing I
can think of is:

Someone ran:

TRUNCATE data RESTART IDENTITY;

or someone restored the table structure from a backup that deleted the
original table.

Do you log DDL?
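
If not, it may be worth switching on for next time; a minimal sketch
(log_statement = 'ddl' covers CREATE/ALTER/DROP, while 'mod' additionally
logs TRUNCATE and other data-modifying statements, which will be noisy on a
table taking several records per second):

-- in postgresql.conf (then reload):
--   log_statement = 'ddl'      # or 'mod' to also capture TRUNCATE/DML
SELECT pg_reload_conf();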

Was the table partitioned?

You should also really be on 9.2.7, although I can't think of any bug
that's been fixed which could be responsible for this issue.

-- 
Thom


Re: [GENERAL] too-many-open-files log file entries when vacuuming under solaris

2014-03-05 Thread Tom Lane
Raschick, Hartmut hartmut.rasch...@keymile.com writes:
 recently we have seen a lot of occurrences of out of file descriptors:
 Too many open files; release and retry in our postgres log files, every
 night when a vacuum full analyze is run.  After some digging into the
 code we found that postgres potentially tries to open as many as a
 pre-determined maximum number of file descriptors when vacuuming. That
 number is the lesser of the one from the configuration file
 (max_files_per_process) and the one determined at start-up by
 src/backend/storage/file/fd.c::count_usable_fds(). Under Solaris now,
 it would seem, finding out that number via dup(0) is not sufficient, as
 the actual number of interest might be/is the number of usable stream
 file descriptors (up until Solaris 10, at least). Also, closing the least
 recently used file descriptor might therefore not solve a temporary
 problem (as something below 256 is needed). Now, this can be fixed by
 setting/leaving the descriptor limit at 256 or changing the
 postgresql.conf setting accordingly. Still, the function for determining
 the max number is not working as intended under Solaris, it would
 appear. One might try using fopen() instead of dup() or have a different
 handling for stream and normal file descriptors (including moving
 standard file descriptors to above 255 to leave room for stream
 ones). Maybe though, all this is not worth the effort; then it might
 perhaps be a good idea to mention the limitations/specialties in the
 platform specific notes (e.g. have ulimit at 256 maximum).

TBH this sounds like unfounded speculation.  AFAIK a Postgres backend will
not open anything but regular files after its initial startup.  I'm not
sure what a stream is on Solaris, but guessing that it refers to pipes
or sockets, I don't think we have a problem with an OS restriction that
those be below FD 256.  In any case, if we did, it would presumably show
up as errors not release-and-retry events.

Our usual experience is that you get release-and-retry log messages when
the OS is up against the system-wide open-file limit rather than the
per-process limit (ie, the underlying error code is ENFILE not EMFILE).
I don't know exactly how Solaris strerror() spells those codes so it's
difficult to tell from your reported log message which case is happening.
If it is the system-wide limit that's at issue, then of course the dup(0)
loop isn't likely to find it, and adjusting max_files_per_process (or
maybe better, reducing max_connections) is the expected solution.
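
For instance, something along these lines (a sketch only; the right numbers
depend on your kernel's file-table limits, and both settings take effect only
at a server restart):

SHOW max_files_per_process;     -- default is 1000
SHOW max_connections;
-- postgresql.conf, e.g.:
--   max_files_per_process = 200
--   max_connections = 50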

regards, tom lane




Re: [GENERAL] Mysterious DB reset

2014-03-05 Thread Adrian Klaver

On 03/05/2014 10:22 AM, Israel Brewster wrote:

I have a Postgresql 9.2.4 database containing real-time tracking data
for our aircraft for the past week (theoretically). It is populated by
two different processes: one that runs every few minutes, retrieving
data from a number of sources and storing it in the DB, and one that has
an always on connection to the DB streaming data into the database in
realtime (often several records per second). To keep the database size
manageable I have a cron job that runs every evening to delete all
records that are more than a week old, after archiving a subset of them
in permanent storage.

This morning my boss e-mailed me, complaining that only a couple of
aircraft were showing up in the list (SELECT distinct(tail) FROM data
being the command that populates the list). Upon looking at the data I
saw that it only went back to 4am this morning, rather than the week I
was expecting. My first thought was Oh, I must have a typo in my
cleanup routine, such that it is deleting all records rather than only
those a week old, and it's just that no one has noticed until now. So I
looked at that, but changing the delete to a select appeared to produce
the proper results, in that no records were selected:


Well it would, if the records only go back to 4 AM this morning. In 
other words, if no records exist before 4 AM today, then no records exist 
before 7 days ago either, or am I missing something?




DELETE FROM data WHERE pointtime < now() AT TIME ZONE 'UTC' - interval '7 days';

Then I noticed something even more odd. My database has an id column,
which is defined as a SERIAL. As we all know, a serial is a
monotonically increasing number that is not affected by deletes.
However, the oldest record in my database, from 4am this morning, had an
id of 1. Even though I KNOW there was data in the system yesterday. Even
if my DELETE command was wrong and deleted ALL records, that shouldn't
have reset the SERIAL column to 1! I also know that I have not been in
the database mucking around with the sequence value - to be completely
honest, I don't even know the exact command to reset it - I'd have to
google it if I wanted to.


A sequence is just a special table.

So what does SELECT * from the sequence show?
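
For instance (assuming the serial column's backing sequence has the default
name data_id_seq; pg_get_serial_sequence() reports the actual one):

SELECT pg_get_serial_sequence('data', 'id');
SELECT last_value, is_called FROM data_id_seq;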



Also odd is that my cleanup script runs at 1am. I have records of there
being new data in the database up to 3:51am, but the oldest record
currently in the DB is from 4:45am (as specified by the default of now()
on the column). So I know records were added after my delete command
ran, but before this reset occurred.


I am not sure what you are calling the 'reset'?
Did something happen between 3:51 AM and 4:45 AM?
Also not sure why you call the 4:45 AM record the oldest, when you say 
you can identify records from 3:51 AM?




So my question is, aside from someone going in and mucking about in the
wee hours of the morning, what could possibly cause this behavior? What
sort of event could cause all data to be deleted from the table, and the
sequence to be reset? Especially while there is an active connection?
Thanks for any ideas, however wild or off the wall :-)


What is in the Postgres/system logs for the time period(s) you mention?



---
Israel Brewster
Computer Support Technician II
Era Alaska
5245 Airport Industrial Rd
Fairbanks, AK 99709
(907) 450-7250 x7293
---







--
Adrian Klaver
adrian.kla...@aklaver.com




Re: [GENERAL] execute table query in backend

2014-03-05 Thread Amit Langote
On Tue, Mar 4, 2014 at 12:38 AM, Anh Pham atpham@gmail.com wrote:
 Hi,
 I am trying to extend the server backend by writing a new module.
 Basically, it tries to retrieve tuples from a specific table using some
 predefined qualifications (equivalent to a SELECT ... FROM ... WHERE client
 SQL statement).
 Is there any quick or efficient way to do this?


Assuming you're trying to write a server extension (and NOT a client
application), this is where you can start:

http://www.postgresql.org/docs/9.3/static/server-programming.html

Specifically, an interface called the Server Programming Interface (SPI)
(chapter 44 on the above page) can be used to issue SQL queries from a server
extension module. Read more about it here (the documentation also includes a
few examples to get started with):

http://www.postgresql.org/docs/9.3/static/spi.html

--
Amit

