Re: *SOLVED* [GENERAL] Connections closing due to "terminating connection due to administrator command"
> On 27 July 2015, at 18:20, Tom Lane wrote:
> Herouth Maoz writes:
>> So I'm left with the question of what caused the shutdown on July 21st.
>
> Well, you had
>
> 2015-07-21 15:37:59 IDT LOG: received fast shutdown request
>
> There is exactly one place in the Postgres code that prints that message,
> and it is the postmaster's SIGINT handler.
>
> 2015-07-21 15:37:59 IDT LOG: aborting any active transactions
>
> This is just the postmaster noting that it's about to send SIGTERM signals
> to all its child processes ...
>
> 2015-07-21 15:37:59 IDT FATAL: terminating connection due to administrator command
>
> ... and here is a child acknowledging receipt of SIGTERM. This is all
> as-expected once the postmaster's received a shutdown signal.
>
> So something sent the postmaster a SIGINT, and it's hard to believe that
> the source wasn't external to the database.

OK, that was straight to the point, so I started looking for anything that could have done that in the system logs. As it turns out, it was human error. That is, the sysadmin ran "apt-get upgrade", not being aware that if one of the packages upgraded was PostgreSQL, it would cause a database restart.

Thanks everybody for your time.

Herouth
Re: [GENERAL] Connections closing due to "terminating connection due to administrator command"
> On 27 July 2015, at 18:01, Adrian Klaver wrote:
> Not sure what you have set up for logging, but you might want to crank it up.
> 13 days between entries for a system that is in use all the time seems sort
> of light to me.

Most of the log settings are just the Debian default (except the log prefix). This system is one of several we have, but it's intended for quick processing of messages by some of our customers that send low volumes but need quick response. As such, it processes around 50,000 messages a day, which boils down to around 200,000 or so database requests a day. It's a very light load, so we don't get many timeouts or other load-related log entries. I don't want to log each query, though, as that seems wasteful to me.

>> NEW INFORMATION: I believe that today's mishap is related to the July 21
>> shutdown problem, based on the logs of our Java program and an inquiry into
>> the way it handles messages that go through a particular path. It seems that
>> one of the modules has a bug and retains stale connections, and since
>> messages that arrive through that path are rare, the first time one came
>> after the July 21st shutdown was today.
>
> So what is the module doing?
>
> Or more to the point, what is the application doing to deal with the stale
> connections?

The module keeps a connection pool available to the threads that call it, one connection per thread. It runs a JDBC prepared statement. If that prepared statement hits an SQLException, it logs it in two places and attempts to recreate the connection, which is where the bug is. But that bug is just the reason why I didn't get a fresh connection afterwards; it's not the reason why the SQLException happened to begin with (which is that the connection kept for this particular thread was from before the shutdown).

I hope this information helps,
Herouth
Re: [GENERAL] Connections closing due to "terminating connection due to administrator command"
> On 27 July 2015, at 16:55, Melvin Davidson wrote:
>
> If you are running Linux (please ALWAYS give the O/S), then this could have
> been caused by the sys admin doing a system shutdown.

Yes, sorry about that; as I previously answered Adrian Klaver, the OS is Debian GNU/Linux 7. But I did mention that the machine has not been restarted (it currently has a 45 day uptime). And the sysadmin says he didn't do anything on that machine on that day.

> Otherwise, anyone that can sudo to postgres could also have potentially
> issued a pg_ctl stop.

The only people with access to that machine are myself (through ssh to the postgres account) and the sysadmin. That machine is basically a dedicated database server.

> I strongly suggest you review the _system_ logs for a possible hint as to
> who could have done that. Also, consider adding the user name into the log
> prefix ( # %u = user name )

I've asked the sysadmin to look at the system logs. I'll consider the user name suggestion.

Thank you.
Herouth
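[For readers wanting to act on Melvin's %u suggestion, a minimal sketch of a postgresql.conf line. The exact prefix is a matter of taste; %t, %d and %p are extra escapes added here for illustration, not something the thread prescribes.]

    # postgresql.conf -- example only: timestamp, user name, database and backend PID
    log_line_prefix = '%t %u@%d [%p] '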
Re: [GENERAL] Connections closing due to "terminating connection due to administrator command"
> On 27 July 2015, at 16:39, Adrian Klaver wrote:
>>
>> * Given that I did not terminate any backend connection interactively,
>>   why did I get a "terminating connection due to administrator
>>   command" message? Is there any situation where this message is
>>   issued without the administrator being involved?
>
> This error message comes from intervention by a program external to Postgres:
>
> http://www.postgresql.org/message-id/4564.1284559...@sss.pgh.pa.us
>
> So what OS and version are you running?

OK, that's a miss on my part, sorry: the OS is Debian GNU/Linux 7.

> What does the Java app do besides accessing the database?

Well, it's a message-passing application that is basically up all the time, processing information sent in by our clients and distributing it. It is not running on the same machine as the database, it is not using a privileged database account (the user is the database's owner, but not a super-user), and it's not even running very complex queries. If more detailed information is needed, I'll provide it, but a full description of the system is going to be very lengthy, and I doubt it will be helpful, given that it's not really capable of sending signals over to the database process, being on a separate machine.

> Do you have other programs that monitor/control the database?

We have a backup script that runs at 4:02 AM every day. Other than that, we just monitor the Java program, so if the database fails, it shows up in that program's log.

>> * What could have caused the shutdown on the 21st of July? Again, I
>>   did not issue a shutdown request for that server, and the machine
>>   didn't restart.
>
> Is there anything in the logs before the above that might give a clue?

No, that's the thing. It's the first entry in the log for days - the previous log entry is from July 8th.

NEW INFORMATION: I believe that today's mishap is related to the July 21 shutdown problem, based on the logs of our Java program and an inquiry into the way it handles messages that go through a particular path. It seems that one of the modules has a bug and retains stale connections, and since messages that arrive through that path are rare, the first time one came after the July 21st shutdown was today. So I'm left with the question of what caused the shutdown on July 21st.

Thank you,
Herouth
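[Background to Adrian's point: the same FATAL line is emitted whenever a backend receives SIGTERM, which is what both a fast shutdown and pg_terminate_backend() deliver. A hedged illustration only; the usename filter is made up, and on 9.1 the view columns are procpid/current_query (renamed to pid/query in later releases).]

    -- Any backend terminated this way logs
    -- "FATAL: terminating connection due to administrator command"
    SELECT pg_terminate_backend(procpid)
    FROM pg_stat_activity
    WHERE usename = 'some_app_user'        -- hypothetical filter
      AND current_query = '<IDLE>';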
[GENERAL] Connections closing due to "terminating connection due to administrator command"
Hello everybody.

In the past week, it has happened to us twice already that we got an exception from our Java application, due to PostgreSQL "terminating connection due to administrator command". The problem is that I'm the administrator, and I issued no such command.

On the first occasion that it happened, the PostgreSQL server actually seemed to have spontaneously shut itself down and restarted:

2015-07-21 15:37:59 IDT LOG: received fast shutdown request
2015-07-21 15:37:59 IDT LOG: aborting any active transactions
2015-07-21 15:37:59 IDT FATAL: terminating connection due to administrator command
…
2015-07-21 15:38:01 IDT LOG: shutting down
2015-07-21 15:38:02 IDT LOG: database system is shut down
2015-07-21 15:40:16 IDT LOG: database system was shut down at 2015-07-21 15:38:02 IDT

On today's occasion, the server seems to be continuing as usual, and there is nothing whatsoever in the log.

So my questions are:

* Given that I did not terminate any backend connection interactively, why did I get a "terminating connection due to administrator command" message? Is there any situation where this message is issued without the administrator being involved?
* What could have caused the shutdown on the 21st of July? Again, I did not issue a shutdown request for that server, and the machine didn't restart.

The server is running PostgreSQL 9.1.16.

Thank you,
Herouth
Re: [GENERAL] Partitioning of a dependent table not based on date
On 01/12/2014, at 19:26, Andy Colson wrote:
> On 12/1/2014 11:14 AM, Herouth Maoz wrote:
>> I am currently in the process of creating a huge archive database that
>> contains data from all of our systems, going back for almost a decade.
>>
>> Most of the tables fall into one of two categories:
>>
>> 1. Static tables, which are rarely updated, such as lookup tables or
>>    user lists. I don't intend to partition these, I'll just refresh them
>>    periodically from production.
>> 2. Transaction tables, that have a timestamp field, for which I have the
>>    data archived in COPY format by month. Of course a monolithic table over
>>    a decade is not feasible, so I am partitioning these by month.
>>
>> (I don't mean "transaction" in the database sense, but in the sense that
>> the data represents historical activity, e.g. message sent, file
>> downloaded etc.)
>>
>> I have one table, though, that doesn't fall into this pattern. It's a
>> many-to-one table relating to one of the transaction tables. So on one
>> hand, it doesn't have a timestamp field, and on the other hand, it has
>> accumulated lots of data over the last decade, so I can't keep it
>> unpartitioned.
>
> Let's stop here. One big table with lots of rows (and a good index) isn't a
> problem. As long as you are not table scanning everything, there isn't a
> reason to partition the table.
>
> Lots and lots of rows isn't a problem except for a few usage patterns:
> 1) delete from bigtable where (some huge percent of the rows)
> 2) select * from bigtable where (lots and lots of table scanning and can't
>    really index)
>
> If your index is selective enough, you'll be fine.

Hmm. I suppose you're right. I planned the whole partition thing in the first place because most of my "transaction" tables are still alive, so I'll need to continue bulk-inserting data every month, and inserting into a fresh partition is better than into a huge table. But in this case, since we stopped working on this application in January, there will be no fresh inserts, so it's not as important. We just need the archive for legal purposes.

One thing, though: I noticed on my other system (a reports system that holds a year's worth of data) that after I partitioned the largest tables, backup time dropped. I suppose pg_dump of a single huge table is not as fast as pg_dump of multiple smaller ones.

Herouth
Re: [GENERAL] (Solved) Decreasing performance in table partitioning
Thank you. My solution is based on your suggestion, but instead of creating a new partition table and inserting into it, I create partitions as I originally planned, under the existing table, and insert into them - but without deleting. I use INSERT INTO ... SELECT ... FROM ONLY, without deleting, and so the data becomes duplicated (each row is both in the main table and in the partition). At the end of the loop, when all partitions are populated and indexed, I use TRUNCATE ONLY on the main table, and drop its indexes. This way, the views are not affected.

So thank you for your suggestion and your help. Now, for a comparable-size table, run time was less than 2 hours, which is quite acceptable.

On 11/09/2014, at 07:26, Huang, Suya wrote:

> The views will go with the table. If you rename the table, the view definition
> will be changed automatically. In your situation, you may need to recreate
> views or other objects that have a dependency on the old table.
>
> But functions will remain the same, so as long as your new table has been
> renamed to the same name as the old table, it should be ok.
>
> Note, it's tested in my 9.3 environment, not sure how it behaves in older
> versions…
>
> From: Herouth Maoz [mailto:hero...@unicell.co.il]
> Sent: Wednesday, September 10, 2014 6:26 PM
> To: Huang, Suya
> Cc: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] Decreasing performance in table partitioning
>
> Thank you. Sorry I have been away for a few days and couldn't thank you
> before.
>
> Wouldn't this have an impact if there are things like views or functions
> based on the old table?
>
> On 08/09/2014, at 04:57, Huang, Suya wrote:
>
> Instead of deleting from the original non-partition table, which is not
> efficient, you can try the approach below.
>
> Put the logic below in a function as you like:
>
> Create a new partition table.
> Insert data from the original non-partition table into the correct partition
> of the new partition table.
> Build indexes and analyze as needed.
> Rename the old non-partition table to something else.
> Rename the new partition table to the name you wanted.
>
> Drop the old non-partition table if you're satisfied with the current table
> structure.
>
> Thanks,
> Suya
>
> [...]
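[For readers who want the shape of the solution Herouth describes, here is a minimal sketch under assumed names (parent archive.events, date column event_time, one child per month). The real function builds these statements dynamically with EXECUTE; this is only an illustration.]

    -- Child partition created under the existing parent
    CREATE TABLE archive.events_2013_08 (
        CHECK ( event_time >= DATE '2013-08-01' AND event_time < DATE '2013-09-01' )
    ) INHERITS (archive.events);

    -- Copy (rather than move) one month; ONLY restricts the scan to the parent itself
    INSERT INTO archive.events_2013_08
    SELECT * FROM ONLY archive.events
    WHERE event_time >= DATE '2013-08-01' AND event_time < DATE '2013-09-01';

    -- After every month is populated and indexed, empty just the parent
    TRUNCATE ONLY archive.events;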
Re: [GENERAL] Decreasing performance in table partitioning
Thank you. I was away for a few days.

This is PG version 9.1. Now, this is in a function. As far as I understand, every function is a single transaction. I have not created exception blocks because I don't have any special handling for exceptions; I'm fine with the default. The data in each table spans about 10 months, so it loops about 10 times each cycle.

What has occurred to me, though, is that maybe I should write the DELETE statement as DELETE FROM ONLY, as the previously created partitions would be scanned, despite having no applicable data, the way I wrote it. Does that make sense?

On 07/09/2014, at 19:50, Tom Lane wrote:

> Herouth Maoz writes:
>> My problem is the main loop, in which data for one month is moved from the
>> old table to the partition table.
>>
>> EXECUTE FORMAT (
>>     'WITH del AS (
>>         DELETE FROM %1$I.%2$I
>>         WHERE %3$I >= %4$L AND %3$I < %5$L
>>         RETURNING *
>>     )
>>     INSERT INTO %6$I.%7$I
>>     SELECT * FROM del',
>>     p_main_schema,
>>     p_table_name,
>>     p_date_field_name,
>>     v_curr_month_str,
>>     v_curr_month_to_str,
>>     p_partition_schema,
>>     v_partition_name
>> );
>
>> In the first few iterations, this runs in very good times. But as
>> iterations progress, performance drops, despite the size of the data for
>> each month being more or less the same.
>
> How many of these are you doing in a single transaction? Are you doing
> them in separate exception blocks? What PG version is this exactly?
>
> My guess is that the cycles are going into finding out that tuples deleted
> by a prior command are in fact dead to the current command (though still
> live to outside observers, so they can't be hinted as dead). That ought
> to be relatively cheap if it's all one subtransaction, but if there were a
> large number of separate subtransactions involved, maybe not so much.
>
> regards, tom lane

--
Herouth Maoz
Unicell Advanced Cellular Solutions
☎ 03-5181717 ext. 742
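[A sketch of the change Herouth is describing - restricting the DELETE to the parent table so already-populated children are not rescanned. The names are placeholders expanded by hand; this is an illustration of what the generated statement could look like, not the actual generated SQL.]

    WITH del AS (
        DELETE FROM ONLY main_schema.big_table      -- ONLY: skip the child partitions
        WHERE date_field >= '2013-08-01' AND date_field < '2013-09-01'
        RETURNING *
    )
    INSERT INTO part_schema.big_table_2013_08
    SELECT * FROM del;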
Re: [GENERAL] Decreasing performance in table partitioning
Thank you. Sorry I have been away for a few days and couldn't thank you before. Wouldn't this have an impact if there are things like views or functions based on the old table? On 08/09/2014, at 04:57, Huang, Suya wrote: > Instead of deleting from the original non-partition table which is not > efficient, you can try below approach. > > Put below logic in a function as you like: > > Create a new partition table. > Insert data from original non-partition table to the correct partition of new > partition table. > Build index and analyze as needed. > Rename old non-partition table to something else. > Rename new partition table to the correct name as you wanted. > > Drop old non-partition table if you’re satisfied with current table structure. > > Thanks, > Suya > From: pgsql-general-ow...@postgresql.org > [mailto:pgsql-general-ow...@postgresql.org] On Behalf Of Herouth Maoz > Sent: Monday, September 08, 2014 12:00 AM > To: pgsql-general@postgresql.org > Subject: [GENERAL] Decreasing performance in table partitioning > > > Hello all. > > I have created a function that partitions a large table into monthly > partitions. Since the name of the table, target schema for partitions, name > of the date field etc. are all passed as strings, the function is heavily > based on EXECUTE statements. > > My problem is the main loop, in which data for one month is moved from the > old table to the partition table. > > (1) > EXECUTE FORMAT ( > 'WITH del AS ( > DELETE FROM %1$I.%2$I > WHERE %3$I >= %4$L AND %3$I < %5$L > RETURNING * > ) > INSERT INTO %6$I.%7$I > SELECT * FROM del', > p_main_schema, > p_table_name, > p_date_field_name, > v_curr_month_str, > v_curr_month_to_str, > p_partition_schema, > v_partition_name > ); > > In the first few iterations, this runs in very good times. But as iterations > progress, performance drops, despite the size of the date for each month > being more or less the same. Eventually I end up with iterations that run for > hours, when I started with only a few minutes. The odd thing is that the last > iteration, which is actually for a month not yet inserted into that table (0 > records to move) it took 6 hours for the above statement to run! > > I tried to improve this, by first testing whether there are any records for > the current month in the table, adding: > > (2) > > EXECUTE FORMAT ( > 'SELECT true > FROM %1$I.%2$I > WHERE %3$I >= %4$L AND %3$I < %5$L > LIMIT 1', > p_main_schema, > p_table_name, > p_date_field_name, > v_curr_month_str, > v_curr_month_to_str > ) INTO v_exists; > > Before the above statement, and putting it in an IF statement on v_exists. > Also, after each move, I added: > > EXECUTE FORMAT ( 'ANALYZE %I.%I', p_main_schema, p_table_name ); > > But to no avail. In fact, in each iteration, the execution of statement 2 > above takes more and more time. 
> > Here is the number of rows in each month for the table I was trying to > partition: > > count | the_month > --+- > 10907117 | 2013-08-01 00:00:00 > 12715234 | 2013-09-01 00:00:00 > 14902928 | 2013-10-01 00:00:00 > 10933566 | 2013-11-01 00:00:00 > 11394906 | 2013-12-01 00:00:00 > 9181051 | 2014-01-01 00:00:00 > 8487028 | 2014-02-01 00:00:00 > 9892981 | 2014-03-01 00:00:00 > 8830191 | 2014-04-01 00:00:00 > 8368638 | 2014-05-01 00:00:00 > 8014685 | 2014-06-01 00:00:00 > 6780589 | 2014-07-01 00:00:00 > > > And the times for each iteration: > > MonthStatement 2 Statement 1 > 2013-08 3 sec3 min > 2013-09 2 min17 min > 2013-10 4 min21 min > 2013-11 8 min20 min > 2013-12 9 min32 min > 2014-01 16 min 21 min > 2014-02 19 min 20 min > 2014-03 14 min 23 min > > For April I had to cancel it in the middle. My problem is that I can't let > this run into the evening, when we have backup followed by large data > collection. These times are just for the given statements, and additional > time is spent creating indexes on the partitions and so on. So this thing ran > from 11:24 until I had to cancel it at around 6PM. > > Can anybody explain the performance deterioration and/or offer a suggestion > for a different design? > > TIA, > Herouth > > -- חרות מעוז יוניסל פתרונות סלולריים מתקדמים ☎ 03-5181717 שלוחה 742
[GENERAL] Decreasing performance in table partitioning
Hello all.

I have created a function that partitions a large table into monthly partitions. Since the name of the table, target schema for partitions, name of the date field etc. are all passed as strings, the function is heavily based on EXECUTE statements.

My problem is the main loop, in which data for one month is moved from the old table to the partition table.

(1)
EXECUTE FORMAT (
    'WITH del AS (
        DELETE FROM %1$I.%2$I
        WHERE %3$I >= %4$L AND %3$I < %5$L
        RETURNING *
    )
    INSERT INTO %6$I.%7$I
    SELECT * FROM del',
    p_main_schema,
    p_table_name,
    p_date_field_name,
    v_curr_month_str,
    v_curr_month_to_str,
    p_partition_schema,
    v_partition_name
);

In the first few iterations, this runs in very good times. But as iterations progress, performance drops, despite the size of the data for each month being more or less the same. Eventually I end up with iterations that run for hours, when I started with only a few minutes. The odd thing is that the last iteration, which is actually for a month not yet inserted into that table (0 records to move), took 6 hours for the above statement to run!

I tried to improve this by first testing whether there are any records for the current month in the table, adding:

(2)
EXECUTE FORMAT (
    'SELECT true
     FROM %1$I.%2$I
     WHERE %3$I >= %4$L AND %3$I < %5$L
     LIMIT 1',
    p_main_schema,
    p_table_name,
    p_date_field_name,
    v_curr_month_str,
    v_curr_month_to_str
) INTO v_exists;

before the above statement, and putting it in an IF statement on v_exists. Also, after each move, I added:

EXECUTE FORMAT ( 'ANALYZE %I.%I', p_main_schema, p_table_name );

But to no avail. In fact, in each iteration, the execution of statement 2 above takes more and more time.

Here is the number of rows in each month for the table I was trying to partition:

  count   |      the_month
----------+---------------------
 10907117 | 2013-08-01 00:00:00
 12715234 | 2013-09-01 00:00:00
 14902928 | 2013-10-01 00:00:00
 10933566 | 2013-11-01 00:00:00
 11394906 | 2013-12-01 00:00:00
  9181051 | 2014-01-01 00:00:00
  8487028 | 2014-02-01 00:00:00
  9892981 | 2014-03-01 00:00:00
  8830191 | 2014-04-01 00:00:00
  8368638 | 2014-05-01 00:00:00
  8014685 | 2014-06-01 00:00:00
  6780589 | 2014-07-01 00:00:00

And the times for each iteration:

Month     Statement 2   Statement 1
2013-08   3 sec         3 min
2013-09   2 min         17 min
2013-10   4 min         21 min
2013-11   8 min         20 min
2013-12   9 min         32 min
2014-01   16 min        21 min
2014-02   19 min        20 min
2014-03   14 min        23 min

For April I had to cancel it in the middle. My problem is that I can't let this run into the evening, when we have a backup followed by large data collection. These times are just for the given statements, and additional time is spent creating indexes on the partitions and so on. So this thing ran from 11:24 until I had to cancel it at around 6PM.

Can anybody explain the performance deterioration and/or offer a suggestion for a different design?

TIA,
Herouth
Re: [GENERAL] How do I track down a possible locking problem?
On 18/02/2014, at 19:02, Jeff Janes wrote: > On Mon, Feb 17, 2014 at 8:45 AM, Herouth Maoz wrote: > I have a production system using Postgresql 9.1.2. > > The system basically receives messages, puts them in a queue, and then > several parallel modules, each in its own thread, read from that queue, and > perform two inserts, then release the message to the next queue for > non-database-related processing. > > Today, after we received complaints from a customer about delays, I noticed > odd behavior in the system. About 2 percent of the messages were inserted > into the tables more than an hour after they got into the system. > > How do you know that? The message gets a time stamp (from Java) as it goes into the system. This time stamp is written into the first table, in a field named "time_arrived". The second table's insert (that follows immediately after the first) has a date_inserted field, which also gets a Java time stamp. So basically date_inserted - time_arrived is the interval that passed between the time the message came into the system and the time of the second insertion. That is - the time that it spent in the queue, plus the time that it spent in the first insert. > > > The queue never has more than 27,000 messages at the same time, and all > together, the parallel modules process about 5000 or 6000 messages per > minute. So basically, the delay for a single message should never be more > than a few minutes. Even if one module gets stuck, another will grab the next > message from the queue. I believe the only way for a message to be stuck for > so long would be for it to be grabbed by a module, and then for the database > write to be somehow delayed for a hour, although it's merely a simple insert > performed with a prepared statement. > > The database in production is very busy with millions of writes per hour. > Could there be a situation in which a particular connection gets "starved" > while other connections are able to run queries without noticeable delay? > > If there is a delay like that, it would almost certainly be due to database > locks that show up in pg_locks. > > http://www.postgresql.org/docs/current/static/view-pg-locks.html > http://wiki.postgresql.org/wiki/Lock_Monitoring > http://wiki.postgresql.org/wiki/Lock_dependency_information > > But, I doubt that that is your problem. > > > How can I truck such locks down? Does anybody have any ideas other than > starvation? The system lets me view statistics of how many messages were > processed in each modules and the average latency. None of the four modules > running has long average latency or low number of messages processes, so I > don't think the issue is related to any particular thread in my (Java) system > being slow or locked away by the others. > > If the insert into PostgreSQL was freezing, wouldn't that time get reflected > in your latency monitoring? Not sure what you mean. I think the insert is freezing, and indeed it gets reflected in the time monitored. > > It sounds to me like your application has a bug in its queue, where it > forgets about items on the queue for a while. > Not likely. This application has been running with the same queue implementation since 2001, even before we started using PostgreSQL... All bugs in that particular code would have been eliminated long ago. The system consists of dozens of queues, and we have monitoring that alerts us to any queue where messages are left unattended. If there was such a bug, we'd see queues that never become empty. 
But anyway, I'll put up some cron job that monitors the locks or the query lengths, and see if I can come up with anything. Theoretically, I have a way of logging when each message is enqueued or dequeued from each queue in the system, but this would slow down the production system, so I'll only do that as a last resort. Thank you for pointing me to the lock monitoring documentation.
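[As a starting point for the cron-based lock monitoring Herouth mentions, a rough sketch along the lines of the PostgreSQL wiki's lock-monitoring queries. It assumes the 9.1 column names (procpid, current_query) and is only illustrative, not a tuned monitoring query.]

    -- Backends waiting on a lock, paired with a backend holding a conflicting one
    SELECT w.procpid        AS waiting_pid,
           w.current_query  AS waiting_query,
           b.procpid        AS blocking_pid,
           b.current_query  AS blocking_query
    FROM pg_locks wl
    JOIN pg_stat_activity w ON w.procpid = wl.pid
    JOIN pg_locks bl
      ON  bl.granted
      AND bl.pid <> wl.pid
      AND bl.locktype = wl.locktype
      AND bl.database      IS NOT DISTINCT FROM wl.database
      AND bl.relation      IS NOT DISTINCT FROM wl.relation
      AND bl.transactionid IS NOT DISTINCT FROM wl.transactionid
    JOIN pg_stat_activity b ON b.procpid = bl.pid
    WHERE NOT wl.granted;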
Re: [GENERAL] How do I track down a possible locking problem?
Is there a more appropriate place to ask this question? Or was my question unclear? I dug some data, and it seems that whenever messages come at a rate of 75,000 per hour, they start picking delays of up to 10 minutes. If I go up to 100,000, delays pick up to about 20 minutes. And for 300,000 in one hour, I get delays of up to 3 hours or so. Typically, during an hour in which 250,000 messages were processed, around 10% of them are delayed more than 20 minutes, and some for more than an hour. Can anybody offer any insight? Do any of you get delays in inserts performed at this rate (250,000 per hour), meaning that the insert transaction takes more than 10 minutes? Is it a matter of fine-tuning the server? Note that at the same time there are other processes who also perform updates on the same tables, at about the same rate. So each of the tables gets a write about 500,000 times per hour. The table normally contains around 2-3 million records, and has 3 indexes. Thank you, Herouth On 17/02/2014, at 18:45, Herouth Maoz wrote: > I have a production system using Postgresql 9.1.2. > > The system basically receives messages, puts them in a queue, and then > several parallel modules, each in its own thread, read from that queue, and > perform two inserts, then release the message to the next queue for > non-database-related processing. > > Today, after we received complaints from a customer about delays, I noticed > odd behavior in the system. About 2 percent of the messages were inserted > into the tables more than an hour after they got into the system. > > The queue never has more than 27,000 messages at the same time, and all > together, the parallel modules process about 5000 or 6000 messages per > minute. So basically, the delay for a single message should never be more > than a few minutes. Even if one module gets stuck, another will grab the next > message from the queue. I believe the only way for a message to be stuck for > so long would be for it to be grabbed by a module, and then for the database > write to be somehow delayed for a hour, although it's merely a simple insert > performed with a prepared statement. > > The database in production is very busy with millions of writes per hour. > Could there be a situation in which a particular connection gets "starved" > while other connections are able to run queries without noticeable delay? > > How can I truck such locks down? Does anybody have any ideas other than > starvation? The system lets me view statistics of how many messages were > processed in each modules and the average latency. None of the four modules > running has long average latency or low number of messages processes, so I > don't think the issue is related to any particular thread in my (Java) system > being slow or locked away by the others. > > TIA, > Herouth >
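[One low-overhead way to test the starvation theory during a 250,000-messages-per-hour window is to sample pg_stat_activity for long-running or waiting statements. The query below assumes 9.1 column names, and the one-minute threshold is arbitrary.]

    -- Anything not idle that has run for over a minute, or is waiting on a lock
    SELECT procpid, waiting, now() - query_start AS runtime, current_query
    FROM pg_stat_activity
    WHERE current_query <> '<IDLE>'
      AND (waiting OR now() - query_start > interval '1 minute');
    -- Alternatively, log_min_duration_statement = 60000 in postgresql.conf captures
    -- the same information without polling.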
[GENERAL] How do I track down a possible locking problem?
I have a production system using Postgresql 9.1.2.

The system basically receives messages, puts them in a queue, and then several parallel modules, each in its own thread, read from that queue, and perform two inserts, then release the message to the next queue for non-database-related processing.

Today, after we received complaints from a customer about delays, I noticed odd behavior in the system. About 2 percent of the messages were inserted into the tables more than an hour after they got into the system.

The queue never has more than 27,000 messages at the same time, and all together, the parallel modules process about 5000 or 6000 messages per minute. So basically, the delay for a single message should never be more than a few minutes. Even if one module gets stuck, another will grab the next message from the queue. I believe the only way for a message to be stuck for so long would be for it to be grabbed by a module, and then for the database write to be somehow delayed for an hour, although it's merely a simple insert performed with a prepared statement.

The database in production is very busy with millions of writes per hour. Could there be a situation in which a particular connection gets "starved" while other connections are able to run queries without noticeable delay?

How can I track such locks down? Does anybody have any ideas other than starvation? The system lets me view statistics of how many messages were processed in each module and the average latency. None of the four modules running has long average latency or a low number of messages processed, so I don't think the issue is related to any particular thread in my (Java) system being slow or locked away by the others.

TIA,
Herouth
Re: [GENERAL] Question about optimizing access to a table.
On 10/12/2013, at 20:55, Kevin Grittner wrote: > Herouth Maoz wrote: > >> The problem starts when our partner has some glitch, under high >> load, and fails to send back a few hundred thousand reports. In >> that case, the table grows to a few hundred records, and they are >> not deleted until they hit their expiry date, at which point the >> "garbage collector" takes care of them and everything goes back >> to normal. When it contains hundreds of thousands of records, >> performance deteriorates considerably- > > First, make sure that you are on the latest minor release of > whatever major release you are running. There were some serious > problems with autovacuum's table truncation when a table was used > as a queue and size fluctuated. These are fixed in the latest set > of minor releases. Thank you. Indeed, I failed to mention which version of PostgreSQL I was on. 9.1.2 in this case. Do you mean that I have to go to 9.3.x or simply to 9.1.11? > If that doesn't clear up the problem, please post an actual slow > query to the pgsql-performance list, with its EXPLAIN ANALYZE > output and other details, as suggested here: > > http://wiki.postgresql.org/wiki/SlowQueryQuestions > > People will be able to provide more useful and specific advice if > they have the additional detail. Thank you. I think it's more a matter of design than an issue with the query. The queries themselves are the simplest form of SELECT and DELETE possible. Herouth -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Question about optimizing access to a table.
On 10/12/2013, at 20:55, Jeff Janes wrote: > > On Tue, Dec 10, 2013 at 8:23 AM, Herouth Maoz wrote: > > Hello. > > I have one particular table with very specialized use. I am sending messages > to some partner. The partner processes them asynchronously, and then returns > the status report to me. The table is used to store a serialized version of > the message object, together with a few identifiers, expiry date, and a > reference from the partner. When the status report comes back from the > partner, we: > > Select the record using the partner's reference number > reconstruct and queue the object. > Delete the record from database using the serial primary key ID. > > Where do you "queue" the object? Back into a different table within the same > database? Why not use "DELETE ...RETURNING"? The object is used in a Java application. Basically, it goes into a queue in memory. Some of its data is updated in two different tables in the same database at a later time, but only after some logic has been performed which may or may not change them relative to what they were when they came from the partner. I'm not using DELETE...RETURNING because basically the application that's running is old code, originally written around 2002 or so, before we even had PostgreSQL in the organization. I will probably not get permission to change this code without very good reason as it is a sensitive bit of code. > > > Every once in a while we run a garbage collection process which selects > expired messages, reconstructs and queues them for processing, and deletes > the record using the primary key. > > This works wonderfully as long as the table remains small - a few thousand > records waiting for their status report, and that's it. The table is set to > have frequent auto-anylize runs. > > You might want to turn ...yes? > > > The problem starts when our partner has some glitch, under high load, and > fails to send back a few hundred thousand reports. In that case, the table > grows to a few hundred records, and they are not deleted until they hit their > expiry date, at which point the "garbage collector" takes care of them and > everything goes back to normal. When it contains hundreds of thousands of > records, performance deteriorates considerably. > > There is no inherent reason the performance needs to degrade. Can you give > the specific queries that perform worse? They are the simplest queries possible. The performance deterioration is only fractions of a second, you see. But when we have thousands of status reports per minute, it builds up to a large delay after a few minutes. The queries are: select id, sm from transient where smsc_reference = ? and msisdn = ? and then: delete from transient where id = ? > > > I am trying to figure out a solution that will keep the system working well > even when there is a large number of records in the table. At first I thought > of partitioning the table on the partner's reference field. But the DELETE > would actually slow down if I do this, right? > > Not necessarily, but partitioning should be your last resort not your first > resort, and I don't see any reason it would be needed here. > > Indexes: > "transient_pkey" PRIMARY KEY, btree (id) > "transient_msisdn_ref" btree (msisdn, smsc_reference) > "transient_sm_vp" btree (validity) > > What is the partner reference? If it is smsc_reference, then you probably > need a index in which that is the lead (or only) column. 
Well, I simplified a bit, but as you can see from the query above, it queries on both the smsc_reference and the msisdn. Thanks, Herouth
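[Jeff's point about the lead column, made concrete. With equality on both columns, the existing transient_msisdn_ref index on (msisdn, smsc_reference) can generally serve the lookup shown above; an index leading with the partner's reference would only matter if some lookup ever arrives with smsc_reference alone. The index name below is made up.]

    -- Only worthwhile if some lookups filter on smsc_reference without msisdn
    CREATE INDEX transient_smsc_ref_msisdn ON transient (smsc_reference, msisdn);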
[GENERAL] Question about optimizing access to a table.
Hello.

I have one particular table with very specialized use. I am sending messages to some partner. The partner processes them asynchronously, and then returns the status report to me. The table is used to store a serialized version of the message object, together with a few identifiers, expiry date, and a reference from the partner. When the status report comes back from the partner, we:

1. Select the record using the partner's reference number.
2. Reconstruct and queue the object.
3. Delete the record from the database using the serial primary key ID.

Every once in a while we run a garbage collection process which selects expired messages, reconstructs and queues them for processing, and deletes the record using the primary key.

This works wonderfully as long as the table remains small - a few thousand records waiting for their status report, and that's it. The table is set to have frequent auto-analyze runs.

The problem starts when our partner has some glitch, under high load, and fails to send back a few hundred thousand reports. In that case, the table grows to a few hundred thousand records, and they are not deleted until they hit their expiry date, at which point the "garbage collector" takes care of them and everything goes back to normal. When it contains hundreds of thousands of records, performance deteriorates considerably.

I am trying to figure out a solution that will keep the system working well even when there is a large number of records in the table. At first I thought of partitioning the table on the partner's reference field. But the DELETE would actually slow down if I do this, right?

Any ideas? Here is the table structure (some irrelevant fields cut):

     Column     |            Type             |                        Modifiers
----------------+-----------------------------+---------------------------------------------------------
 id             | bigint                      | not null default nextval('transient_id_seq'::regclass)
 time_submitted | timestamp without time zone | not null
 validity       | timestamp without time zone |
 msisdn         | character varying(16)       | not null
 sme_reference  | integer                     | not null
 smsc_reference | numeric(21,0)               | not null
 sm             | text                        |
Indexes:
    "transient_pkey" PRIMARY KEY, btree (id)
    "transient_msisdn_ref" btree (msisdn, smsc_reference)
    "transient_sm_vp" btree (validity)
Has OIDs: no
Options: autovacuum_enabled=true, autovacuum_analyze_threshold=200, autovacuum_analyze_scale_factor=0.001

TIA,
Herouth
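[Jeff Janes's DELETE ... RETURNING suggestion from the replies above, sketched against this table. It collapses the select-then-delete round trips into one statement, at the cost of committing to the delete before the application logic runs, which is why the legacy code in this thread keeps the two steps separate.]

    -- Fetch the payload and remove the row in a single statement
    DELETE FROM transient
    WHERE smsc_reference = $1 AND msisdn = $2
    RETURNING id, sm;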
Re: [GENERAL] Table partitioning
Everything is slow about it - selects, deletes and inserts, that is. I don't do updates on that table. The inserts and deletes are less of an issue because they are done once a week. Of course it would be nicer if they were faster, but that's less of an issue. The real issue is with self-joins, which are a common query. But I have indexes on the relevant fields: the connecting field (the one used for the self-join) as well as the date field. The queries are mostly of the format

SELECT ...
FROM tablename t1
LEFT JOIN tablename t2
       ON t1.indexed_field = t2.indexed_field AND t2.boolean_field
WHERE t1.date_field >= 'date1' AND t1.date_field < 'date2'
  AND NOT t1.boolean_field;

This can take about 15 minutes to run, depending on the date range. And it doesn't matter much if I put a date range within the left join, either.

As I mentioned in another message, I don't like the idea that my script may fail, and then inserts into the table might start to fail as well. I'm not always available to run it manually within a set time. And writing an automation that is different from all the other tables I maintain in that database makes for maintenance spaghetti. I also don't like running automated DDL commands. They don't play well with backups.

-----Original Message-----
From: Steve Crawford [mailto:scrawf...@pinpointresearch.com]
Sent: Mon 28/10/2013 22:31
To: Herouth Maoz; pgsql-general@postgresql.org
Subject: Re: [GENERAL] Table partitioning

On 10/28/2013 09:27 AM, Herouth Maoz wrote:
> I have a rather large and slow table in Postgresql 9.1. I'm thinking of
> partitioning it by months, but I don't like the idea of creating and dropping
> tables all the time.

What is slow about it? Inserting? Selecting? Deleting? Partitioning can assist with some issues but does no good if what you really need is an index or a better query. Partitioning shines as an option to manage archiving/purging of time-series data, but only if you work with it, not against it.

What don't you like about creating and dropping tables? You can easily automate it: https://github.com/keithf4/pg_partman

> I'm thinking of simply creating 12 child tables, in which the check condition
> will be, for example, date_part('month', time_arrived) = 1 (or 2 for
> February, 3 for March etc.).
>
> I'll just be deleting records rather than dropping tables, the same way I do
> in my current setup. I delete a week's worth every time.

You are missing out on one of the best aspects of partitioning. Compared to dropping or truncating a child table, deleting is far slower and causes table bloat which may impact future queries.

> Second, when I delete (not drop!) from the mother table, are records deleted
> automatically from the child tables or do I need to create rules/triggers for
> that?

Yes, unless you use the keyword "ONLY": "If specified, deletes rows from the named table only. When not specified, any tables inheriting from the named table are also processed."

Cheers,
Steve
Re: [GENERAL] Table partitioning
Thanks. Assuming there is an index on the time_arrived column, and that there are about 10.5 million records in each child table, how bad will performance be if the query actually accesses all 12 tables? Will it be as bad as using the full table?

On 28/10/2013, at 18:31, Elliot wrote:

> On 2013-10-28 12:27, Herouth Maoz wrote:
>> I have a rather large and slow table in Postgresql 9.1. I'm thinking of
>> partitioning it by months, but I don't like the idea of creating and
>> dropping tables all the time.
>>
>> I'm thinking of simply creating 12 child tables, in which the check
>> condition will be, for example, date_part('month', time_arrived) = 1 (or 2
>> for February, 3 for March etc.).
>>
>> I'll just be deleting records rather than dropping tables, the same way I do
>> in my current setup. I delete a week's worth every time.
>>
>> So, I have two questions.
>>
>> First, is constraint exclusion going to work with that kind of condition? I
>> mean, if my WHERE clause says something like "time_arrived >= '2013-04-05'
>> and time_arrived < '2013-04-17'", will it be able to tell that
>> date_part('month', time_arrived) for all the records is 4, and therefore
>> avoid selecting from any partitions other than the April one?
>>
>> Second, when I delete (not drop!) from the mother table, are records deleted
>> automatically from the child tables or do I need to create rules/triggers
>> for that?
>
> 1. No - you'd need a condition like "where date_part('month', time_arrived) = 1"
>    in your select statements in order for the constraint exclusion to kick in.
> 2. Yes - there is no need to create rules or triggers for deletes on the
>    parent table (check out the syntax for "delete from <table>" versus
>    "delete from only <table>").
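[To make Elliot's first point concrete: constraint exclusion does work with the original range-style WHERE clause if the child's CHECK constraint is itself a range on time_arrived rather than a date_part() expression. A minimal sketch with assumed table names; with constraint_exclusion = partition (the default), the plan should include only the normally empty parent and the April child.]

    CREATE TABLE mytable_2013_04 (
        CHECK ( time_arrived >= DATE '2013-04-01'
            AND time_arrived <  DATE '2013-05-01' )
    ) INHERITS (mytable);

    -- Inspect which children the planner keeps for the original query shape
    EXPLAIN SELECT * FROM mytable
    WHERE time_arrived >= '2013-04-05' AND time_arrived < '2013-04-17';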
[GENERAL] Table partitioning
I have a rather large and slow table in Postgresql 9.1. I'm thinking of partitioning it by months, but I don't like the idea of creating and dropping tables all the time.

I'm thinking of simply creating 12 child tables, in which the check condition will be, for example, date_part('month', time_arrived) = 1 (or 2 for February, 3 for March etc.).

I'll just be deleting records rather than dropping tables, the same way I do in my current setup. I delete a week's worth every time.

So, I have two questions.

First, is constraint exclusion going to work with that kind of condition? I mean, if my WHERE clause says something like "time_arrived >= '2013-04-05' and time_arrived < '2013-04-17'", will it be able to tell that date_part('month', time_arrived) for all the records is 4, and therefore avoid selecting from any partitions other than the April one?

Second, when I delete (not drop!) from the mother table, are records deleted automatically from the child tables or do I need to create rules/triggers for that?

TIA,
Herouth
Re: [GENERAL] Index creation takes more time?
On 18/09/2012, at 20:19, Jeff Janes wrote: > I think the one below will show an even larger discrepancy. You are > doing 2 casts for each comparison, > so I think the casts overhead will dilute out the comparison. > > select count(distinct foo) from ( select cast(random() as varchar(14)) as foo > from generate_series (1,1)) asdf; Actually, it doesn't. I suspect that it doesn't actually do string comparison per se. I don't know how "distinct" is implemented in PostgreSQL, but if it was me, I'd implement it with a hash table, which means that you calculate the hash of the string rather than compare it. Even if it is done with actual comparison, I don't think it's a collation-based comparison, but rather a byte-by-byte comparison. > > >> Finally, I created a test table, as you asked: >> >> >>> create table foo as select msisdn,sme_reference from >>> sms.billing__archive limit 100; >> >> Then I created an index on the msisdn and sme_reference columns together. >> 99% of the data in the msisdn field consist of 11-digit phone numbers. >> Result: >> >> PC: 5792.641 ms >> Server: 23740.470 ms >> >> Huge discrepancy there. > > try: > create index ON foo (msisdn COLLATE "C", sme_reference) ; > > This can only be done on 9.1 server, as that feature is new to that > release. It should be much faster to create than the index with > default collation. > > (or change the collation of msisdn column definition, rather than just > in the index). > > This assumes you just need the index for equality, not for some > precise locale-specific ordering (which for phone numbers seems like a > safe bet). Yes, this certainly reduced the index creation time to within a reasonable margin. OK, now we have to decide whether to move the entire database to the 'C' collation (which would require, I suppose, a dump and restore) with the option of changing collation for specific columns that actually need it, or to just solve the current problem by changing the index creation commands where relevant. Thank you very much for your help with this issue, your input has been invaluable. Herouth -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
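[The column-level alternative Jeff mentions ("or change the collation of msisdn column definition") would look roughly like this on 9.1. Note that it rewrites the table and its indexes; the varchar width is taken from the table definition shown further down the thread.]

    -- Byte-wise comparison for this column only; other text columns keep the database default
    ALTER TABLE sms.billing__archive
        ALTER COLUMN msisdn TYPE varchar(16) COLLATE "C";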
Re: [GENERAL] Index creation takes more time?
I think you hit the nail right on the head when you asked: > I wonder if they have different encoding/collations. [headdesk]Of course. One of the requirements of the upgrade was to change the database encoding to unicode, because previously it was in an 8-bit encoding and we couldn't handle international text, which has become an absolute necessity. So when I restored the database, I took care to create it in unicode first: Old database (PC, 8.3.17): Name | reports Owner| reports Encoding | ISO_8859_8 (Locale is C) New database (server, 9.1.5): Name | reports Owner | reports Encoding | UTF8 Collate | he_IL.UTF-8 Ctype | he_IL.UTF-8 Apparently, string comparison is heavily CPU bound... Now, it seems the server is inferior to the PC in CPU-bound tasks. It's no wonder - the PC has a better and faster CPU, and each PostgreSQL process runs on a single core, so the 4 cores are not an advantage. So running the test you asked: > \timing on > set work_mem = 16384; > select count(distinct foo) from (select random() as foo from > generate_series(1,1)) asdf; I get PC: Time: 554994.343 ms Server: Time: 660577.789 ms Which is slightly better in favor of the PC, but still doesn't show as much of a discrepancy as in the creation of indexes. I must point out that the actual problem is not in comparison to this PC's hardware. The database originally ran on the server, and created the same indexes happily within reasonable time until the upgrade. The upgrade process involved shutting down PostgreSQL, moving all the database files and configuration over to the PC, and starting it there (running the PC under the old server's IP, so that all the clients work with the PC now as a production machine). Then we took the server, upgraded the system and PostgreSQL on it, created a dump from the PC, and restored it on the Server. So the situation is that the performance is 4 times worse w.r.t. the same hardware, which chugged happily when it was still the old operating system and the old PostgreSQL. And the PC is still chugging away happily during the archive, albeit a bit more slowly (for I/O reasons - it is inferior to the server there). Anything disk-bound is done better on the Server, while the PC has a slight CPU advantage. So, I must, at this point, draw the conclusion that string comparison is a much, much heavier task in utf-8 than it is in an 8-bit encoding - or that the collation is the problem. Running a different test, which involves string comparison, shows a bigger discrepancy: select count( foo ), foo from ( select cast(random() as varchar(14)) > cast(random() as varchar(14)) as foo from generate_series (1,1)) asdf group by foo; PC: Time: 308152.090 ms Server: Time: 499631.553 ms Finally, I created a test table, as you asked: > create table foo as select msisdn,sme_reference from > sms.billing__archive limit 100; Then I created an index on the msisdn and sme_reference columns together. 99% of the data in the msisdn field consist of 11-digit phone numbers. Result: PC: 5792.641 ms Server: 23740.470 ms Huge discrepancy there. Next, I dropped the index, and created an index on the sme_reference column alone (which is an integer). The result: PC: 2457.315 ms Server: 3722.920 ms Still a slight advantage for the PC, but not on the same order of magnitude as when strings were concerned. OK, if you agree with my conclusions, what should I do about it? I absolutely need this database to be able to support Unicode. Thanks a lot for the help so far! Herouth
Re: [GENERAL] Index creation takes more time?
Yes, thank you, I did notice it, but I decided to wait a week until the next archive schedule, to see if the problem persists, especially since the previous run had relatively low disk space because we kept the old database files around. We have removed them during the week. Unfortunately, the problem persists. So here is the information I could glean.

First, the variable "maintenance_work_mem" has not been changed between the old and new PostgreSQL. In fact, it is commented out, so I assume it's the default 16MB for both installations.

The server that runs the 9.1 is generally better and faster than the PC that runs the 8.3 (it does every other operation - inserts, updates, selects - much faster than the PC). More specifically:

Server running 9.1:
3373252k of memory.
Two hard disks, separate for system and database. The database disk is 15000RPM, 825G.
CPU: Xeon, 2.0GHz, 4 cores (or two CPUs with 2 cores, I'm not sure).

PC running 8.3:
3073344k of memory.
One SATA hard disk (used for both system and database), 7200RPM, 915G.
CPU: Pentium dual-core 2.80GHz.

On both machines PostgreSQL is set up with shared_buffers of 1800M.

Now, the table itself:

           Column            |            Type             | Modifiers
-----------------------------+-----------------------------+-----------
 service                     | smallint                    |
 billing_priority            | smallint                    |
 account_number              | integer                     |
 msisdn                      | character varying(16)       |
 sme_reference               | integer                     |
 smsc_reference              | numeric(21,0)               |
 gateway_id                  | smallint                    |
 user_reference              | numeric(21,0)               |
 user_time                   | timestamp without time zone |
 time_arrived                | timestamp without time zone |
 time_submitted              | timestamp without time zone |
 time_final_state            | timestamp without time zone |
 status                      | integer                     |
 time_notified               | timestamp without time zone |
 user_id                     | character varying(45)       |
 price                       | double precision            |
 sms_engine_id               | character varying(15)       |
 smsc_session_id             | character varying(64)       |
 external_billing_reference  | character varying(128)      |
 multipart_reference         | numeric(21,0)               |
 multipart_nr_segments       | integer                     |
 multipart_segment_nr        | integer                     |
 requested_target_network_id | character(1)                |
 actual_target_network_id    | character(1)                |
 sm_type                     | character(2)                |

There are no triggers, no foreign keys etc. The index definitions:

CREATE INDEX billinga_user_id ON sms.billing__archive(user_id);
CREATE INDEX billinga_status ON sms.billing__archive(status);
CREATE INDEX billinga_time_arrived ON sms.billing__archive(time_arrived);
CREATE INDEX billinga_msisdn_sme_reference ON sms.billing__archive(msisdn, sme_reference);
CREATE INDEX billinga_account ON sms.billing__archive(account_number);
CREATE INDEX billinga_user_ref ON sms.billing__archive(user_reference);
CREATE INDEX billinga_smsc_ref ON sms.billing__archive(smsc_reference);
CREATE INDEX billinga_time_submitted ON sms.billing__archive(time_submitted);

Statistics collection: for the sake of experimentation, I dropped and recreated billinga_msisdn_sme_reference on both machines, timed it, and ran vmstat, iostat and sar in the background at intervals of 1 minute. On the PC, the creation of the index took 40 minutes 35 seconds. The server (9.1) has not finished yet. I set up stats to run for an hour, and I'm sending this hour's worth of stats. I'm attaching the stats files in tarballs. I'm not sure what I'm supposed to look at.

Thanks for your time,
Herouth

-----Original Message-----
From: Craig Ringer [mailto:ring...@ringerc.id.au]
Sent: Mon 17/09/2012 06:56
To: Herouth Maoz
Cc: pgsql-general@postgresql.org; t...@fuzzy.cz
Subject: Re: [GENERAL] Index creation takes more time?

Herouth,

I don't know if you saw Tomas Vondra's follow-up, as it was only to the list and not CC'd to you. Here's the archive link:

http://archives.postgresql.org/message-id/e87a2f7a91ce1fca7143bcadc4553...@fuzzy.cz

The short version: "More information required".

On 09/09/2012 05:25 PM, Herouth Maoz wrote:
> We have tables which we archive and shorten every day. That is - the main
> table that has daily inserts and updates is kept small, and there is a
> parallel table with all the old data up to a year ago.
>
> In the past we noticed that the bulk transfer from the main table to the
> archive table takes a very long time
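[Since the thread establishes that maintenance_work_mem was left at the 16MB default on both machines, one low-risk experiment - not something the participants confirmed - is to raise it just for the archive session before rebuilding the indexes. The 512MB figure is an assumption, to be sized against the roughly 3GB of RAM described in the same message.]

    SET maintenance_work_mem = '512MB';   -- session-local; disconnect or RESET afterwards
    CREATE INDEX billinga_msisdn_sme_reference
        ON sms.billing__archive (msisdn, sme_reference);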
Re: [GENERAL] Is there a way to use "pack" in pl/perl without resorting to pl/perlu?
Thanks. I was hoping there was a way to enable individual operators through the postgresql configuration file or something. Anyway, I replaced the functionality with

$content =~ s/([a-fA-F0-9]{2})/chr(hex($1))/eg;

which seems to do the same thing as pack( "H*", $content ), which is basically what I needed for the time being. I suspect it's less efficient than pack, and I hope the function I created won't be too slow for use inside a trigger.

Thanks,
Herouth

On 12/09/2012, at 17:47, Tom Lane wrote:
> Herouth Maoz writes:
>> I created a function that does some heavy string manipulation, so I needed to use pl/perl rather than pl/pgsql.
>> I'm not experienced in perl, but the function works well when used as an independent perl subroutine - it depends only on its arguments. I use the Encode package (in postgresql configuration).
>
>> But my use of the "pack" function causes a Safe error. Apparently "pack" and "unpack" are in the operator mask.
>
>> Is there any way to remove them from that mask?
>
> My recollection is that they're intentionally excluded because they would otherwise be security holes --- there are some format codes that allow direct access to memory, or something like that.
>
> regards, tom lane

--
Herouth Maoz
Unicell Advanced Cellular Solutions
☎ 03-5181717 ext. 742
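For reference, a sketch of the kind of trusted pl/perl function that substitution could live in - assuming, as the poster does, that the Encode module is already loaded and reachable from the trusted interpreter (the function name is invented):

CREATE OR REPLACE FUNCTION hex_decode_text(hex text, charset text) RETURNS text AS $$
    my ($hex, $charset) = @_;
    # turn each pair of hex digits into a raw byte, then decode in the given charset
    (my $bytes = $hex) =~ s/([a-fA-F0-9]{2})/chr(hex($1))/eg;
    return Encode::decode($charset, $bytes);
$$ LANGUAGE plperl;

-- e.g. SELECT hex_decode_text('43414C4C5F494445', 'iso-8859-8');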
[GENERAL] Is there a way to use "pack" in pl/perl without resorting to pl/perlu?
I created a function that does some heavy string manipulation, so I needed to use pl/perl rather than pl/pgsql. I'm not experienced in perl, but the function works well when used as an independent perl subroutine - it depends only on its arguments. I use the Encode package (in postgresql configuration). But my use of the "pack" function causes a Safe error. Apparently "pack" and "unpack" are in the operator mask. Is there any way to remove them from that mask? I don't want to use pl/perlu - it's too unsafe, I think, and its running environment is different. I just want these two operators. This is what I'm using "pack" for: I convert a string in the format '43414C4C5F494445' into a binary string, and then, given a matching text encoding, use "decode" to convert it to proper unicode. TIA, Herouth -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Index creation takes more time?
We have tables which we archive and shorten every day. That is - the main table that has daily inserts and updates is kept small, and there is a parallel table with all the old data up to a year ago. In the past we noticed that the bulk transfer from the main table to the archive table takes a very long time, so we decided to do this in three steps: (1) drop indexes on the archive table, (2) insert a week's worth of data into the archive table. (3) recreate the indexes. This proved to take much less time than having each row update the index. However, this week we finally upgraded from PG 8.3 to 9.1, and suddenly, the archiving process takes a lot more time than it used to - 14:30 hours for the most important table, to be exact, spent only on index creation. The same work running on the same data in 8.3 on a much weaker PC took merely 4:30 hours. There are 8 indexes on the archive table. The size of the main table is currently (after archive) 7,805,009 records. The size of the archive table is currently 177,328,412 records. Has there been a major change in index creation that would cause 9.1 to do it this much slower? Should I go back to simply copying over the data or is the whole concept breaking down? TIA, Herouth -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
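For readers following the thread, a bare-bones sketch of that three-step cycle (table, column and index names are borrowed from other messages about the same system; the cutoff predicate is only illustrative):

-- 1. drop the indexes on the archive table (all eight of them)
DROP INDEX sms.billinga_time_arrived;
-- 2. move a week's worth of rows from the small daily table into the archive
INSERT INTO sms.billing__archive
    SELECT * FROM sms.billing WHERE time_arrived < current_date - interval '7 days';
-- 3. recreate the indexes
CREATE INDEX billinga_time_arrived ON sms.billing__archive (time_arrived);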
Re: [GENERAL] Maintaining a materialized view only on a replica
It's not an issue with the replication software. The reason the parts of the transaction are written out of order is that the original system that writes them in the first place makes no guarantees about the order of writing.

So basically my question is whether it is feasible for a trigger to run a full aggregate SQL query on the table that triggered it, join with another table, check the rows returned, and do the insert into the second table only when the data is complete - because that's basically what I need to do.

Herouth

On 05/09/2012, at 00:52, Craig Ringer wrote:
> Subject changed to describe the problem. Reply in-line.
>
> On 09/04/2012 07:57 PM, Herouth Maoz wrote:
>
>> The issue is that when an insert or an update is fired, I can't say whether all the segments of the same transaction have been written yet, and if only some of them were written, there is no guarantee on the order in which they are written.
>
> Does Slony-I provide stronger guarantees? If your replication doesn't guarantee ordering then you're going to have a very hard time doing this.
>
>> Is this feasible at all? How would you achieve it?
>
> I'd try to find a replication system that guaranteed ordering if at all possible.
>
> --
> Craig Ringer

--
Herouth Maoz
Unicell Advanced Cellular Solutions
☎ 03-5181717 ext. 742
[GENERAL] I want your opinion on how to do something.
Basically, I have several production databases with various data, and I have a reports database that grabs all necessary data once a day.

Now, there is a new requirement to have some of the data available in the reports database as soon as it is inserted into the production database. Specifically, the data in question is in two particular tables. However, in addition to just shipping the tables in, I also need to perform a bit of processing on the data as it comes. Basically, each transaction in production is represented by 1-3 rows in the two tables. One of the fields contains urlencoded data, which I need to decode and concatenate, so that the rows are represented by one row in another table. E.g.

Table A brought from production:

 key field 1 | key field 2 | num of segments | segment num | segment id
-------------+-------------+-----------------+-------------+------------
 abcde       | 134         | 3               | 1           | 999000
 abcde       | 567         | 3               | 3           | 999000
 abcde       | 890         | 3               | 2           | 999000
 fghij       | 987         | 2               | 1           | 999001
 fghij       | 654         | 2               | 2           | 999001
 abcde       | -11         | 1               | 1           | 999003

Table B from production:

 key field 1 | key field 2 | urlencoded data
-------------+-------------+-----------------
 abcde       | 134         | AAA
 abcde       | 567         | CCC
 abcde       | 890         | BBB
 fghij       | 987         | fff
 fghij       | 654         | ggg
 abcde       | -11         | XXX

Here we have basically three transactions - one with three segments, one with two, and one with a single segment. The data that identifies which rows belong to the same transaction, and what their order is, lives in table A. The actual data to decode is in table B. The result I need to produce is a new table like this:

 key field 1 | key field 2 | segment id | decoded concatenated data
-------------+-------------+------------+---------------------------
 abcde       | 134         | 999000     | AAABBBCCC
 fghij       | 987         | 999001     | fffggg
 abcde       | -11         | 999003     | XXX

Basically, a single row for each transaction, with the key fields taken from the original's first segment, and the data decoded and concatenated. But I need this to be done in (near) real time - as the rows are added. Because of the decoding and concatenation requirements, this can't be a view, because there is no SQL function that will do the required processing, and I'm not sure a view would be efficient for running reports anyway. So a new table it will be.

Basically, I can replicate these two tables from production to the reports database using Londiste, which allows me to replicate single tables and is also supposed to allow me to run triggers on the tables. What I thought of doing was to create a trigger on one of the tables, such that when data is inserted, it inserts data into the "result" table, and to write this trigger using PL/Perl or PL/Python, which I'm guessing will allow me to do the string manipulation required.

The issue is that when an insert or an update is fired, I can't say whether all the segments of the same transaction have been written yet, and if only some of them were written, there is no guarantee on the order in which they are written.

The question is whether such a trigger can be created at all. It needs to verify that all parts are in the database - run a query on the same table at the time the trigger is running. If not all the data is there, it shouldn't insert anything. Is this feasible at all? How would you achieve it?

Thank you,
Herouth
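The feasibility question largely boils down to whether the trigger can cheaply ask "is this transaction complete yet?". A minimal sketch of that check, using hypothetical snake_case versions of the column names in the example above:

SELECT count(*) = max(num_of_segments) AS complete
  FROM table_a                      -- hypothetical name for "Table A"
 WHERE segment_id = 999000;         -- the segment id of the row that fired the trigger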
Re: [GENERAL] Why is an ISO-8859-8 database allowing values not within that set?
Thanks. That makes sense. The default client encoding on the reports database is ISO-8859-8, so I guess when I don't set it using \encoding, it does exactly what you say.

OK, so I'm still looking for a way to convert illegal characters into something that won't collide with my encoding (asterisks or whatever).

Thank you,
Herouth

On 21/07/2012, at 15:36, Craig Ringer wrote:

> On 07/21/2012 04:59 PM, Herouth Maoz wrote:
>> I am using Postgresql 8.3.14 on our reporting system. There are scripts that collect data from many databases across the firm into this database. Recently I added tables from a particular database which has encoding UTF-8.
>
> First, I know there have been encoding and UTF-8 handling fixes since 8.3. It'd be interesting to see if this still happens on a more recent version.
>
> You're also missing five bug-fix point-releases in the 8.3 series, as the latest is 8.3.19. See:
>    http://www.postgresql.org/docs/8.3/static/release.html
> for fixes you're missing.
>
> Explanation for what I think is going on below:
>
>> But this puzzles me, because I then took the file
>
> ... which was created with a \copy with client encoding set to utf-8, right?
>
>> ran psql and \copy from file
>
> With which client encoding set? UTF-8 or ISO_8859_8? I bet you copied it in with ISO_8859_1.
>
>> And it worked. I tried it again now, and I can see the row with its Arabic content, even though it is not in the database encoding.
>
> It shows up correctly?
>
> If you \copy a dump in utf-8, then \copy it back in with ISO_8859_8, it should be mangled.
>
> If you set your client_encoding to utf_8 ("\encoding utf-8") does it still show up correctly? I suspect it's wrong in the database and you're just unmangling it on display.
>
> It would help if you would actually show the bytes of:
>
> - The chars in the \copy dump, using `xxd' or similar
> - The chars in the database before the copy out and copy in, using a CAST to `bytea`
> - The chars in the database AFTER the copy out and copy in, again with a CAST to `bytea`
>
> ... as well as the database encoding, NOT just the client encoding (see below):
>
>> I checked \encoding. It replies
>> ISO_8859_8
>
> That is the client encoding.
>
> Try:
>
>    \l+
>
> to list databases. You'll see the database encoding there. The same info is available from:
>
>    SELECT datname, encoding from pg_database WHERE datname = 'mydatabase';
>
> Maybe this demo will help enlighten you.
>
> regress=# select version();
>                               version
> ---------------------------------------------------------------------
>  PostgreSQL 9.1.4 on x86_64-redhat-linux-gnu, compiled by gcc (GCC) 4.7.0 20120507 (Red Hat 4.7.0-5), 64-bit
> (1 row)
>
> regress=# \l+
>                List of databases
>   Name   | Owner | Encoding |   Collate   |    Ctype    | Access privileges | Size  | Tablespace | Description
> ---------+-------+----------+-------------+-------------+-------------------+-------+------------+-------------
>  regress | craig | UTF8     | en_US.UTF-8 | en_US.UTF-8 |                   | 41 MB | pg_default |
>
> regress=# CREATE TABLE enctest (a serial, x text);
> CREATE TABLE
> regress=# -- Some text randomly pulled off Google News taiwan, since it was convenient and the exact text doesn't matter
> regress=# insert into enctest (x) values ('心情已平復很多」。 中國網絡電視台報導');
> INSERT 0 1
> regress=# \encoding UTF8
> regress=# \copy ( select x from enctest ) to enctest.csv
>
> Set a 1-byte non-utf encoding, doesn't really matter which one. Then import the data we dumped as utf-8.
>
> regress=# \encoding latin-1
> regress=# \copy enctest(x) from enctest.csv
>
> enctest now contains two rows. One is the correctly encoded original, one is the dumped and reloaded one. We can't view the whole table while we're in latin-1 encoding because the correct row won't translate right.
>
> regress=# select * from enctest;
> ERROR:  character 0xe5bf83 of encoding "UTF8" has no equivalent in "LATIN1"
>
> but we *CAN* view the second row we dumped as utf-8 then imported as latin-1:
>
> regress=# select * from enctest where a = 2;
>  a |
[GENERAL] Why is an ISO-8859-8 database allowing values not within that set?
I am using Postgresql 8.3.14 on our reporting system. There are scripts that collect data from many databases across the firm into this database. Recently I added tables from a particular database which has encoding UTF-8. My dump procedure says \encoding ISO-8859-8 \copy ( SELECT ... ) to file And this fails at a certain row because that row contains Arabic text and it cannot be mapped into ISO-8859-8 (which is 8 bit Hebrew). This is an expected behavior, but I was wondering why, when I tested the same setup manually, it all worked well. Turns out that when I did it manually, I did not specify the output encoding. I did the \copy straight. So the file was in UTF-8. But this puzzles me, because I then took the file, ran psql and \copy from file. And it worked. I tried it again now, and I can see the row with its Arabic content, even though it is not in the database encoding. I checked \encoding. It replies ISO_8859_8 but it then happily gives me the Arabic row when I select it. What's happening here? Why does the database accept input in the wrong encoding and doesn't shout when I then try to select that input? Secondly, suppose I want to get pure ISO-8859-8 output for now, and replace every incompatible character within the select statement into '*' or whatever. Is there any function that will help me detect such characters? Can I tell the psql conversion function to ignore bad characters? Thank you, Herouth
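On the "is there any function that will help me detect such characters" part, a hedged sketch: convert_to() raises an error for text that cannot be represented in the target encoding, so a small plpgsql wrapper can turn that into a boolean you can filter on (the function name is made up):

CREATE OR REPLACE FUNCTION fits_iso8859_8(t text) RETURNS boolean AS $$
BEGIN
    PERFORM convert_to(t, 'ISO_8859_8');   -- throws if t contains characters outside ISO-8859-8
    RETURN true;
EXCEPTION WHEN untranslatable_character OR character_not_in_repertoire THEN
    RETURN false;
END;
$$ LANGUAGE plpgsql;

-- e.g. SELECT * FROM some_table WHERE NOT fits_iso8859_8(some_column);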
Re: [GENERAL] Up-to-date reports database
On 23/05/2012, at 18:54, Bartosz Dmytrak wrote:
> hi,
> my suggestion is to redesign the reporting database to fit reporting specifics (e.g. break normal form of the database; in some cases this will speed up reports). Then you can use some ETL tool to sync production and reporting. A good thing is to use some OLAP software for multidimensional analysis - this will make queries easier (with the MDX language). I think this kind of discussion is a huge one :)
>
> there are some opensource ETL and BI suites available.

Thanks, I'll take that into consideration.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Up-to-date reports database
On 23/05/2012, at 17:20, Chris Ernst wrote: > I would have a look at slony. It is a trigger based replication system > that allows you to replicate only the tables you define and you can have > different indexing on the slave. The only requirement is that each > table you want to replicate has the same primary key or unique > constraint on the master and slave. Other than that, you can index the > tables on the slave however you want. Thanks for the pointer. I will read up about it. Herouth -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Up-to-date reports database
Hi guys,

I'm interested in a solution that will allow our customers to run reports - which may involve complicated queries - on data which is as up-to-date as possible.

One thing I don't want to do is let the reporting system connect to the production database. I want the indexes in production to be limited to what production needs, and not add indexes that are required only for reports, for instance. And basically, I don't want a customer to run a complicated report and degrade the performance of my production system.

A replication solution is not very good, either, because of course I can't define indexes differently, I don't want *all* transactions in all tables to be sent, and also because I may want to cross-reference data from different systems.

So ideally, I want to have a reporting database, where specific tables (or maybe even just specific columns) from various databases are collected, and have a reporting tool connect to this database. But I want to push the data into that database as close to real time as possible.

The most important data I am currently considering is two tables which have an average of 7,600 transactions per hour (standard deviation 10,000; the maximum in May was 62,000 transactions per hour). There may be similar pairs of tables collected from more than one database.

I assume this is not an uncommon scenario. What solutions would you recommend?

Herouth

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] How do clients failover in hot standby/SR?
We are looking at a replication solution aimed at high availability. So we want to use PostgreSQL 9's streaming replication/hot standby. But I seem to be missing a very basic piece of information: suppose the primary is host1 and the secondary is host2. Suppose that when host1 fails host2 detects that and creates the trigger file that causes the secondary to act as primary. How do all clients, which have connection strings aimed at host1 know to fail over and use host2? Is there a good Internet resource for reading on this? Thank you, Herouth -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Lengthy deletion
On 29/11/2011, at 09:13, Tom Lane wrote:
> "Herouth Maoz" writes:
>> I was instructed to delete old records from one of the tables in our production system. The deletion took hours and I had to stop it in mid-operation and reschedule it as a night job. But then I had to do the same when I got up in the morning and it was still running.
>
>> I got an interesting clue, though, when I canceled the deletion the second time around. I got the following error message:
>
>> Cancel request sent
>> ERROR: canceling statement due to user request
>> CONTEXT: SQL statement "SELECT 1 FROM ONLY "public"."sent_messages" x WHERE $1 OPERATOR(pg_catalog.=) "subscription_id" FOR SHARE OF x"
>
> Yup, that's a clue all right. I'll bet a nickel that you don't have an index on the foreign key's referencing column (ie, sent_messages.subscription_id). That means each delete in the referenced table has to seqscan the referencing table to see if the delete would result in an FK violation.

Makes sense. But shouldn't that be figured into the EXPLAIN plan?

--
Herouth Maoz
Unicell Advanced Cellular Solutions
☎ 03-5181717 ext. 742
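(The usual remedy for Tom's diagnosis, shown here only as a sketch with an invented index name, is simply to index the referencing column before running the big delete:)

CREATE INDEX sent_messages_subscription_id_idx ON sent_messages (subscription_id);
-- each row deleted from subscriptions can then probe this index instead of
-- seq-scanning sent_messages for the FK check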
[GENERAL] Lengthy deletion
Hi.

I was instructed to delete old records from one of the tables in our production system. The deletion took hours and I had to stop it in mid-operation and reschedule it as a night job. But then I had to do the same when I got up in the morning and it was still running.

The odd thing about it: there are 4720965 records in the table, of which I have to delete 3203485. This should not have taken too long, and the EXPLAIN estimate for it seemed to agree with me:

bcentral=> explain delete from subscriptions s
           where (date_registered < '2011-11-13' and operator <> 'P')
           and service_id not in ( select id from alerts_services );
                                                  QUERY PLAN
---------------------------------------------------------------------------------------------------------------
 Delete  (cost=38885.86..155212.37 rows=1630425 width=6)
   ->  Bitmap Heap Scan on subscriptions s  (cost=38885.86..155212.37 rows=1630425 width=6)
         Filter: ((date_registered < '2011-11-13 00:00:00'::timestamp without time zone) AND (operator <> 'P'::bpchar) AND (NOT (hashed SubPlan 1)))
         ->  Bitmap Index Scan on t_ind  (cost=0.00..38473.03 rows=2361115 width=0)
               Index Cond: ((date_registered < '2011-11-13 00:00:00'::timestamp without time zone) = true)
         SubPlan 1
           ->  Seq Scan on alerts_services  (cost=0.00..4.58 rows=258 width=4)
(7 rows)

I got an interesting clue, though, when I canceled the deletion the second time around. I got the following error message:

Cancel request sent
ERROR: canceling statement due to user request
CONTEXT: SQL statement "SELECT 1 FROM ONLY "public"."sent_messages" x WHERE $1 OPERATOR(pg_catalog.=) "subscription_id" FOR SHARE OF x"

As you can see in the EXPLAIN sentence, I'm trying to delete from a table called "subscriptions", and this context is in another table called "sent_messages" which is related to it by foreign key. Now, I'd say that it was waiting to get a lock on the "sent_messages" table (from which I duly removed the related records before running my delete), and that I should have known that. Only, I was using another connection to monitor pg_stat_activity while the delete was running, and the delete process had "false" in the "waiting" column!

bcentral=# SELECT usename, procpid, query_start, client_addr, client_port, current_query, waiting
           FROM pg_stat_activity
           WHERE query_start < now() - interval '3 seconds' AND xact_start is not null order by xact_start;
-[ RECORD 1 ]-+------------------------------
usename       | bcentral
procpid       | 20047
query_start   | 2011-11-29 02:01:28.968161+02
client_addr   | 192.168.34.34
client_port   | 55709
current_query | delete
              : from subscriptions s
              : where (date_registered < '2011-11-13' and operator <> 'P') and service_id not in ( select id from alerts_services )
              : ;
waiting       | f

Um... so what gives? What's happening here?

The server is PostgreSQL 9.0.4.

TIA,
Herouth
Re: [GENERAL] What's canceling autovacuum tasks?
on 06/02/11 18:16, quoting Tom Lane: Most likely, some other session requested an exclusive lock on the table. Autovacuum will quit to avoid blocking the other query. That's strange. During the day, only selects are running on that database, or at worst, temporary tables are being created and updated. And that particular table gets updated only on weekends (it's one of my archive tables). Besides, I assume that a simple update/insert/delete is not supposed to request an exclusive lock, or autovacuum would not work at all in an average database. Even backups don't run during the day, and I think backups also don't create an exclusive lock or I'd never see a vacuum process run more than a day. This is really inexplicable. Herouth -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
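One way to catch the culprit the next time it happens (column names as in 8.3) is to watch for conflicting lock requests on that table while the autovacuum worker is running:

SELECT l.pid, l.mode, l.granted, a.usename, a.current_query
  FROM pg_locks l
  JOIN pg_class c ON c.oid = l.relation
  JOIN pg_stat_activity a ON a.procpid = l.pid
 WHERE c.relname = 'smsq__archive';
-- a row with granted = f, or any ACCESS EXCLUSIVE request, points at whatever
-- made autovacuum step aside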
[GENERAL] What's canceling autovacuum tasks?
Hi there. During the weekend I've worked for hours on recovering table bloat. Now I was hoping that after the tables are properly trimmed, then after the next delete operation which created dead tuples, autovacuum will go into effect and do its job properly, and prevent the situation from recurring. Indeed autovacuum started working on some of the tables. At least one of these tables was one that I have trimmed up using CLUSTER. So I was watching that autovacuum process carefully. And then suddenly it was gone, after working for 20-odd hours. And I had even more dead tuples in pg_stat_user_tables for that table than before. Looking at the log, I get this: 2011-02-06 15:21:42 IST ERROR: canceling autovacuum task 2011-02-06 15:21:42 IST CONTEXT: automatic vacuum of table "reports.alerts.smsq__archive" Why? Postgres version is 8.3.11 Thank you, Herouth -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Book recommendation?
As a result of my recent encounter with table bloat and other tuning issues I've been running into, I'm looking for a good resource for improving my tuning skills. My sysadmin ran into the following book: PostgreSQL 9.0 High Performance, by Gregory Smith, ISBN 184951030X http://amzn.com/184951030X Which covers versions 8.1 through 9. Any opinions on this book? Other suggestions? Thank you, Herouth -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Adding more space, and a vacuum question.
On 31/01/2011, at 03:49, Craig Ringer wrote:
> For approaches to possibly fixing your problem, see:
>
> http://www.depesz.com/index.php/2010/10/17/reduce-bloat-of-table-without-longexclusive-locks/
>
> http://blog.endpoint.com/2010/09/reducing-bloat-without-locking.html

I'm not quite sure what this will do for me. How will Postgresql know that there is free space in low-numbered pages, if the bloat has not been vacuumed off and is not in the fsm?

--
Herouth Maoz
Unicell Advanced Cellular Solutions
☎ 03-5181717 ext. 742
Re: [GENERAL] Adding more space, and a vacuum question.
On 30/01/2011, at 12:27, Craig Ringer wrote: > > OK, so you're pre-8.4 , which means you have the max_fsm settings to play > with. Have you seen any messages in the logs about the free space map (fsm)? > If your install didn't have a big enough fsm to keep track of deleted tuples, > you'd face massive table bloat that a regular vacuum couldn't fix. Ouch. You're absolutely right. There are messages about max_fsm_pages in the postgres log. It's currently set to 153600. According to the documentation, I can increase it up to 20. Will that even help? How do I find out how many I need to set it to? > > You also don't have the visibility map, which means that (auto)vacuum can't > skip bits of the tables it knows don't need vacuuming. Your vacuums will be > slower. > > Autovacuum improved significantly in both 8.4 and 9.0; consider an upgrade. I will consider it. Thank you. Herouth
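As for how much to set it to: on 8.3 a database-wide VACUUM VERBOSE is the usual way to get the number. A sketch - the figure it reports is what matters, not the command itself:

VACUUM VERBOSE;
-- the final INFO/DETAIL lines report how many page slots are required to track all
-- free space versus the current max_fsm_pages limit; raise max_fsm_pages in
-- postgresql.conf comfortably above the "required" figure (a restart is needed
-- for the new value to take effect)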
Re: [GENERAL] Adding more space, and a vacuum question.
On 30/01/2011, at 13:03, Alban Hertroys wrote:
> On 28 Jan 2011, at 22:12, Herouth Maoz wrote:
>
>> 2. That database has a few really huge tables. I think they are not being automatically vacuumed properly. In the past few days I've noticed a vacuum process on one of them which has been running since January 14th. Unfortunately, it never finished, because we were informed of a scheduled power down in our building yesterday, and had to shut down the machine. The questions are:
>>
>> a. Is it normal for vacuum processes to take two weeks?
>
> For a 200M record table that's definitely on the long side. It was probably waiting on a lock by another transaction. In most cases that means that some transaction was kept open for that duration.
> If that transaction came into existence by accident, then vacuum should be fine now that the server has restarted - that transaction is gone now. You may want to keep an eye out for long-running transactions though, that's usually a programming error - it's sometimes done deliberately, but it's still a bad idea from the point of the database.

Unless my eyes were deceiving me, this was not the case. Sure, there have been heavy transactions during that time (e.g. the daily backup of the database, and the daily inserts into other tables, which take a long time, and a few selects which I haven't been able to find an optimal index for). But this is the query I use to see these processes (run from a superuser):

SELECT usename, procpid, query_start, client_addr, client_port, current_query, waiting
FROM pg_stat_activity
WHERE query_start < now() - interval '3 seconds' AND xact_start is not null
ORDER BY xact_start

Any long transactions should be caught by it, but most of the time, all I see are vacuum workers.

By the way, the auto vacuum on that table has started again - but only after more records were deleted from it. It has now been running since yesterday at 17:00. Here is the pg_stat_user_tables record for this table (which has also updated after the deletes):

relid            | 17806
schemaname       | sms
relname          | billing__archive
seq_scan         | 9
seq_tup_read     | 2053780855
idx_scan         | 2553
idx_tup_fetch    | 8052678
n_tup_ins        | 11437874
n_tup_upd        | 0
n_tup_del        | 7987450
n_tup_hot_upd    | 0
n_live_tup       | 218890768
n_dead_tup       | 33710378
last_vacuum      |
last_autovacuum  |
last_analyze     | 2011-01-29 15:29:37.059176+02
last_autoanalyze |

> In older PG versions autovacuum could get stuck like that on large tables. It keeps starting over trying to vacuum that same table, but never reaches the end of it. Since it's only a single worker process (in those versions), it also will never vacuum any tables beyond the table it got stuck on.

How old? Mine is 8.3.11.

> If you don't delete or update tuples a lot, then the tables are probably just that big. If you do delete/update them regularly, try if a normal vacuum will shrink them enough (probably not) and if not, schedule a VACUUM FULL and a REINDEX at some time the database isn't too busy. Both are quite heavy operations that take exclusive locks on things (tables, indices).

Yes, I do delete many tuples from that table. My mode of usage is like this: I have a small table called billing which receives new data every night. I want to keep that table small so that those nightly updates don't take an overly long time, because all the data (several such tables) has to be ready in the database by the next morning.
Therefore, once a week on the weekend, I move a week's worth of data to billing__archive (the table we are discussing), and delete a week's worth from its end. Now, the indexes on that table would make this impossible to do within the weekend, so what I do is drop all the indexes before I do the inserts, then recreate them, and then do the deletes.

What you are saying is that in this mode of operation, there's basically no hope that autovacuum will ever salvage the deleted records? Does removing and recreating the indexes have any effect on the vacuuming process?

If a vacuum takes me several days (let alone over a week!) then a VACUUM FULL is out of the question. VACUUM FULL locks the table completely and that table is essential to our customer care. If push comes to shove, I think I'd rather dump that table, drop it, and restore it over the weekend, which I believe will be faster than a VACUUM FULL.

One other important question: a tuple marked by VACUUM as reusable (as opposed to VACUUM FULL, which returns the space to the operating system) - can its space ever be used by another table, or can it only be used for new inserts into the same table?

> >> d. After restartin
Re: [GENERAL] Adding more space, and a vacuum question.
On 29/01/11 13:57, Craig Ringer wrote:
> On 01/29/2011 05:12 AM, Herouth Maoz wrote:
>> The machine has no additional room for internal disks. It is a recent purchase and not likely to be replaced any time soon.
>
> Newly acquired or not, it sounds like it isn't sized correctly for the load and needs an upgrade if it can't be shifted into a more suitable role and replaced.

Sigh. Budget considerations, you know.

>> Now, my position is that the best solution would be to add an external hard disk, via USB/firewire,
>
> eSATA? Via a PCI or PCIe add-in SATA controller if there's no existing eSATA.

Oh, yes, I forgot about eSATA. I meant basically a real local connection rather than a network one.

> FireWire is usable for a database. USB is too ... kind of. Performance will be poor because of the high latency, CPU-heavy non-DMA access done by the USB stack. For something read-only, that might be OK.
>
>> and use it for the archive tables. My sysadmin, on the other hand, wants to mount a storage machine remotely and use it for the extra tablespace, as the storage machine is more reliable hardware.
>
> If you have iSCSI or ATA-over-Ethernet disk volumes you can mount, that might be a good idea. I'd personally avoid NFS or SMB.

OK.

> That said, again if it's read-only you might be fine.

The question is - if the read-only tablespace gets stuck/frozen, what happens to the read-write part of the database, which is absolutely essential to have in good responsive working order?

>> a. Is it normal for vacuum processes to take two weeks?
>
> Define "really huge" and describe the hardware; without numbers it's hard to know. What version of Pg are you using?

Pg 8.3.11. The tables have more than 200,000,000 records. About the hardware, I'm not entirely in the loop, but it has two dual-core Intel Xeon 5130 CPUs, 4G of memory, and its system disk (111G) is separate from the database disk (825G). The disks are hardware RAID, but I'm not sure which level, and I think they are 10,000 RPM but I could be wrong.

> Was it a standalone VACUUM or was it an autovacuum worker?

Autovacuum worker.

TIA,
Herouth

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Adding more space, and a vacuum question.
Hello. We have two problems (which may actually be related...) 1. We are running at over 90% capacity of the disk at one of the servers - a report/data warehouse system. We have ran out of disk space several times. Now we need to make some file-archived data available on the database to support our legal team. This means two huge tables to be added to the database. The only solution that I see is to add more space by means of another tablespace. The two tables are static - after loading them and creating indexes they will not be changed. The machine has no additional room for internal disks. It is a recent purchase and not likely to be replaced any time soon. Now, my position is that the best solution would be to add an external hard disk, via USB/firewire, and use it for the archive tables. My sysadmin, on the other hand, wants to mount a storage machine remotely and use it for the extra tablespace, as the storage machine is a more reliable hardware. I think that remote mounted volumes are not a proper device for a database, as the network is subject to load and I've ran into frozen mounts in both NFS and SMB in the past. Never mind being slower. Which solution would you advise and which one of us is right? 2. That database has a few really huge tables. I think they are not being automatically vacuumed properly. In the past few days I've noticed a vacuum process on one of them which has been running since January 14th. Unfortunately, it never finished, because we were informed of a scheduled power down in our building yesterday, and had to shut down the machine. The questions are: a. Is it normal for vacuum processes to take two weeks? b. What happens if the vacuum process is stopped? Are the tuples partially recovered, or are they only recovered if the process completes properly? c. Is there anything I can do to make vacuums shorter? d. After restarting the server, all the data in pg_stat_user_tables seem to have been reset. What does this mean and how does this affect vacuum scheduling? Thank you in advance, Herouth
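Whichever disk wins the argument, wiring it in is the easy part - a sketch with an invented mount point and table name:

CREATE TABLESPACE archive_space LOCATION '/mnt/archive_disk/pgdata';
-- the directory must already exist, be empty, and be owned by the postgres user
ALTER TABLE legal_archive SET TABLESPACE archive_space;
-- or CREATE TABLE ... TABLESPACE archive_space when loading the two static tables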
Re: [GENERAL] auto vacuum
Quoting Bill Moran:
> In response to Herouth Maoz:
>
> Did I understand the original problem correctly? I thought you were saying that _lack_ of analyzing was causing performance issues, and that running vacuum analyze was taking too long and causing the interval between analyze runs to be too long. If that is the case, then I still think manually scheduling vacuum and analyze to run in separate threads is the best approach.

Yes, I get the worst impact if autovacuum decides to do a vacuum analyze rather than a separate vacuum and a separate analyze. However, a normal vacuum does cause a slowdown, though not as much as vacuum analyze.

> If the problem is that overall performance slows too much when vacuum is running, then you'll probably have to get more/faster hardware. Vacuum has to run occasionally or your table will bloat. Bloated tables perform lousy and waste a lot of space, and a table that is getting updates and inserts without vacuuming will grow without bound, even if you delete records. It's kind of like the trash bin on many desktop OSes ... when you DELETE a record from the DB, it goes into the trash bin; when you run VACUUM, the trash is emptied. (Yes, I know that's not _exactly_ how vacuum works, but I'm just drawing a parallel here.)

I understand. Assuming that I have enough disk space, if I vacuum once a day instead of every 20 minutes, does that cause a deterioration in performance?

Thank you,
Herouth

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] auto vacuum
First, I'd like to thank Bill and Alvaro as well as you for your replies. Quoting Tom Lane: Hmm. Given the churn rate on the table, I'm having a very hard time believing that you don't need to vacuum it pretty dang often. Maybe the direction you need to be moving is to persuade autovac to vacuum it *more* often, not less often, so that the time needed to finish each vacuum is small enough. Other than reclaiming disk space, is there any advantage to vacuum? Is a vacuumed table more efficient? So far, every time it vacuums - which is around every 15-20 minutes under load conditions - it slows down processing. I think perhaps Bill's suggestion of just scheduling the vacuums myself (e.g. 1-2am, off peak) coupled with cost-based vacuuming might be a good answer? Unless I'm missing an important point about vacuuming. Alvaro and Bill both suggested scheduling analyzes on a minute-by-minute cron. Would this be no different than automatic analyze? No extra overhead for connection, perhaps? Thanks, Herouth -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
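If the cron route wins out, the statements themselves are tiny - for example (table name taken from the original post), one job could run the ANALYZE every minute or two via psql and a separate job could run the VACUUM off-peak:

ANALYZE public.transient;   -- cheap; keeps planner statistics fresh between bursts
VACUUM public.transient;    -- heavier; schedule off-peak, e.g. nightly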
[GENERAL] auto vacuum
Hi all.

We had a crisis this week that was resolved by tuning pg_autovacuum for a particular table.

The table is supposed to contain a small number of items at any given point in time (typically around 10,000-30,000). The items are inserted when we send out a message, and are selected, then deleted, when a reply to the message arrives. This may be done at a rather high rate - sometimes a thousand a minute or around that.

We found out that the table's response depends on the rate of ANALYZE being performed. We have tuned the values in pg_autovacuum so that we have around one analyze per minute.

What is bothering me is that sometimes the auto vacuum daemon decides to perform a vacuum analyze rather than just an analyze. If it just does a vacuum independent of the analyze, we don't see much impact on performance. But if it does vacuum analyze, it means that until the vacuum is over, it doesn't do another analyze, and this may take about five minutes, during which our performance under load conditions might deteriorate.

Is there any way to cause pg_autovacuum not to choose vacuum analyze? I thought of changing the vacuum frequency to be rare - but then it might take even longer to vacuum, and if a long vacuum analyze falls on a high-load time, although the chances are smaller, the risk is higher. We can't afford a slowdown in that table.

# select * from pg_stat_user_tables where relname = 'transient';
-[ RECORD 1 ]----+------------------------------
relid            | 17866
schemaname       | public
relname          | transient
seq_scan         | 49633
seq_tup_read     | 1388557648
idx_scan         | 9200950
idx_tup_fetch    | 9960245
n_tup_ins        | 6572067
n_tup_upd        | 0
n_tup_del        | 6466085
n_tup_hot_upd    | 0
n_live_tup       | 81060
n_dead_tup       | 10097
last_vacuum      | 2010-04-11 00:29:52.266617+03
last_autovacuum  | 2010-04-14 11:47:43.13062+03
last_analyze     | 2010-04-11 00:29:52.266617+03
last_autoanalyze | 2010-04-14 12:04:39.090055+03

Thank you,
Herouth
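For reference, the per-table knobs being tuned here live in the pg_autovacuum catalog on 8.3-era servers - a hedged sketch (column names assumed from that catalog; the threshold values are purely illustrative, and -1 would mean "use the global default"):

UPDATE pg_autovacuum
   SET anl_base_thresh = 500, anl_scale_factor = 0.005   -- analyze after roughly 500 changed rows
 WHERE vacrelid = 'public.transient'::regclass;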
Re: [GENERAL] stopping processes, preventing connections
Quoting Scott Marlowe:
> On Sat, Mar 20, 2010 at 11:44 AM, Herouth Maoz wrote:
>> The server version is 8.3.1. Migration to a higher version might be difficult as far as policies go, if there isn't a supported debian package for it, but if you can point out a version where this has been fixed I might be able to persuade my boss and sysadmin.
>
> Most of the time it is more dangerous to NOT update PostgreSQL to the latest minor point version than to stay on an older minor point version. The occasions when a minor point upgrade comes out that is dangerous are rare, and the next minor point version to fix it shows up the next day while the broken one is pulled. I think that's happened 1 or 2 times during the time I've been using postgresql. So, if it's 48 hours old and no alarm bells have gone off that it's being pulled and replaced, a pg update is the right thing to do. Backup beforehand, etc.
>
> The danger of a change making your application stop is very low, while the danger of leaving some unpatched bit of nastiness in the backend is a much greater possible problem. I.e. data loss / corruption, things like that. And something as mature as 8.3 is now shouldn't be running in production missing two years of patches.
>
> Start with the release notes for 8.3.2 and move forward and see if anything there looks like a problem for your app. Behaviour-changing changes rarely get into production releases, they get saved for the next major version. If they do they are well noted in the release notes.

The problem is not so much danger in upgrading, but the fact that doing so without using the system's usual security/bugfix update path means non-standard work for the sysadmin: he has to upgrade every package on the system using a different upgrade method, be notified about it from a different source, and check each one under different conditions, which makes his work impossible. So the policy so far has been "use the packages available through Debian". So I'll need to check whether an upgrade is available through that path - and the question is whether it's worthwhile (i.e. whether the bug in question has indeed been fixed).

Herouth
Re: [GENERAL] stopping processes, preventing connections
quoth Greg Smith: Herouth Maoz wrote: Aren't socket writes supposed to have time outs of some sort? Stupid policies notwithstanding, processes on the client side can disappear for any number of reasons - bugs, power failures, whatever - and this is not something that is supposed to cause a backend to hang, I would assume. As a general commentary on this area, in most cases where I've seen an unkillable backend, which usually becomes noticed when the server won't shutdown, have resulted from bad socket behavior. It's really a tricky area to get right, and presuming the database backends will be robust in the case of every possible weird OS behavior is hard to guarantee. However, if you can repeatably get the server into this bad state at will, it may be worth spending some more time digging into this in hopes there is something valuable to learn about your situation that can improve the keepalive handling on the server side. Did you mention your PostgreSQL server version and platform? I didn't see the exact code path you're stuck in during a quick look at the code involved (using a snapshot of recent development), which makes me wonder if this isn't already a resolved problem in a newer version. The server version is 8.3.1. Migration to a higher version might be difficult as far as policies go, if there isn't a supported debian package for it, but if you can point out a version where this has been fixed I might be able to persuade my boss and sysadmin. Thank you for referring me to that entry in the FAQ. By the way, the situation repeated itself today as well. Thanks, Herouth -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] stopping processes, preventing connections
On Mar 17, 2010, at 14:56 , Craig Ringer wrote: > On 17/03/2010 8:43 PM, Herouth Maoz wrote: >> >> On Mar 17, 2010, at 13:34 , Craig Ringer wrote: >> >>> On 17/03/2010 6:32 PM, Herouth Maoz wrote: >>>> >>>> On Mar 3, 2010, at 18:01 , Josh Kupershmidt wrote: >>>> >>>>> Though next time you see a query which doesn't respond to >>>>> pg_cancel_backend(), try gathering information about the query and >>>>> what the backend is doing; either you're doing something unusual (e.g. >>>>> an app is restarting the query automatically after getting canceled) >>>>> or perhaps you've stumbled on a bug in Postgres. >>>> >>>> Hi. A long time has passed since you made that suggestion, but today we >>>> stumbled again on a query that wouldn't be canceled. Not only does it >>>> not respond to pg_cancel_backend(), it also doesn't respond to kill >>>> -SIGTERM. >>> >>> Interesting. If you attach gdb to the backend and run "backtrace", what's >>> the output? >> >> (gdb) backtrace >> #0 0x8dfcb410 in ?? () >> #1 0xbff10a28 in ?? () >> #2 0x083b1bf4 in ?? () >> #3 0xbff10a00 in ?? () >> #4 0x8db98361 in send () from /lib/tls/i686/cmov/libc.so.6 >> #5 0x08195d54 in secure_write () >> #6 0x0819dc7e in pq_setkeepalivesidle () >> #7 0x0819ddd5 in pq_flush () >> #8 0x0819de3d in pq_putmessage () >> #9 0x0819fa63 in pq_endmessage () >> #10 0x08086dcb in printtup_create_DR () >> #11 0x08178dc4 in ExecutorRun () >> #12 0x08222326 in PostgresMain () >> #13 0x082232c0 in PortalRun () >> #14 0x0821e27d in pg_parse_query () >> #15 0x08220056 in PostgresMain () >> #16 0x081ef77f in ClosePostmasterPorts () >> #17 0x081f0731 in PostmasterMain () >> #18 0x081a0484 in main () > > OK, so it seems to be stuck sending data down a socket. The fact that strace > isn't reporting any new system calls suggests the backend is just blocked on > that send() call and isn't doing any work. > > Is there any chance the client has disconnected/disappeared? Yes, certainly. In fact, I mentioned in the past that the product we use for our reports, which is an application built on top of Crystal Reports, when told to cancel a report or when a report times out, instead of telling Crystal to cancel queries properly, simply kills Crystal's processes on the Windows machine side - which leaves us with orphan backends. It's stupid, but it's not under our control. But most of the time the backends respond to cancel requests. Aren't socket writes supposed to have time outs of some sort? Stupid policies notwithstanding, processes on the client side can disappear for any number of reasons - bugs, power failures, whatever - and this is not something that is supposed to cause a backend to hang, I would assume. Is there anything I can do about it? Herouth -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] stopping processes, preventing connections
On Mar 17, 2010, at 13:34 , Craig Ringer wrote: > On 17/03/2010 6:32 PM, Herouth Maoz wrote: >> >> On Mar 3, 2010, at 18:01 , Josh Kupershmidt wrote: >> >>> Though next time you see a query which doesn't respond to >>> pg_cancel_backend(), try gathering information about the query and >>> what the backend is doing; either you're doing something unusual (e.g. >>> an app is restarting the query automatically after getting canceled) >>> or perhaps you've stumbled on a bug in Postgres. >> >> Hi. A long time has passed since you made that suggestion, but today we >> stumbled again on a query that wouldn't be canceled. Not only does it >> not respond to pg_cancel_backend(), it also doesn't respond to kill >> -SIGTERM. > > Interesting. If you attach gdb to the backend and run "backtrace", what's the > output? (gdb) backtrace #0 0x8dfcb410 in ?? () #1 0xbff10a28 in ?? () #2 0x083b1bf4 in ?? () #3 0xbff10a00 in ?? () #4 0x8db98361 in send () from /lib/tls/i686/cmov/libc.so.6 #5 0x08195d54 in secure_write () #6 0x0819dc7e in pq_setkeepalivesidle () #7 0x0819ddd5 in pq_flush () #8 0x0819de3d in pq_putmessage () #9 0x0819fa63 in pq_endmessage () #10 0x08086dcb in printtup_create_DR () #11 0x08178dc4 in ExecutorRun () #12 0x08222326 in PostgresMain () #13 0x082232c0 in PortalRun () #14 0x0821e27d in pg_parse_query () #15 0x08220056 in PostgresMain () #16 0x081ef77f in ClosePostmasterPorts () #17 0x081f0731 in PostmasterMain () #18 0x081a0484 in main () > > If you strace the backend, what do you see? All I get is this: send(9, "00:00\0\0\0\0011\377\377\377\377\0\0\0\0011\0\0\0\0041"..., 1541, 0 I waited about 20 minutes after receiving that, but nothing further was output. Thank you, Herouth -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] stopping processes, preventing connections
On Mar 3, 2010, at 18:01 , Josh Kupershmidt wrote:

> Though next time you see a query which doesn't respond to pg_cancel_backend(), try gathering information about the query and what the backend is doing; either you're doing something unusual (e.g. an app is restarting the query automatically after getting canceled) or perhaps you've stumbled on a bug in Postgres.

Hi. A long time has passed since you made that suggestion, but today we stumbled again on a query that wouldn't be canceled. Not only does it not respond to pg_cancel_backend(), it also doesn't respond to kill -SIGTERM. The query is:

select date_trunc('day',rb.time_stamp), count(*), rb.category, channels.channel_id, channels.name as channel,
       platforms.platform_id, platforms.name, rb.operator, item, delivered, msisdn
from public.rb__view as rb, channels, platforms
where rb.channel_id=channels.channel_id
  and rb.platform_id=platforms.platform_id
  and rb.time_stamp>='2010-03-14'::date
  and rb.time_stamp<'2010-03-14'::date + interval '1 day'
  and platforms.platform_id=262
  and channels.channel_id=1
group by date_trunc('day',rb.time_stamp), rb.category, channels.channel_id, channel, operator, item, delivered,
         msisdn, platforms.platform_id, platforms.name

This is nothing too fancy - just an aggregate with group by. And the application on the other side is Crystal Reports, connecting using ODBC. I don't believe the application does anything like restart after cancel, because most of our queries can easily be cancelled and I don't think Crystal has different behaviors for different queries.

rb__view is a union all between two tables (rb and rb__archive) which have the same schema - one holds data from the past 7 weeks and the other holds older data. The channels and platforms tables are basically lookup tables. The fields item, delivered and msisdn all belong to rb__view.

There is nothing in the PostgreSQL log. If it helps any, this is the EXPLAIN output for the above query. Note that at this time, the query has been running for over an hour and a half.

HashAggregate  (cost=221312.77..221318.08 rows=354 width=94)
  ->  Nested Loop  (cost=8078.83..221215.50 rows=3537 width=94)
        ->  Seq Scan on channels  (cost=0.00..3.81 rows=1 width=16)
              Filter: (channel_id = 1::numeric)
        ->  Nested Loop  (cost=8078.83..221167.48 rows=3537 width=85)
              ->  Index Scan using platforms_pkey on platforms  (cost=0.00..6.27 rows=1 width=19)
                    Index Cond: (platform_id = 262::numeric)
              ->  Append  (cost=8078.83..221125.84 rows=3537 width=73)
                    ->  Bitmap Heap Scan on rb  (cost=8078.83..221115.42 rows=3536 width=72)
                          Recheck Cond: ((public.rb.time_stamp >= '2010-03-14'::date) AND (public.rb.time_stamp < '2010-03-15 00:00:00'::timestamp without time zone))
                          Filter: ((public.rb.channel_id = 1::numeric) AND (public.rb.platform_id = 262::numeric))
                          ->  Bitmap Index Scan on rb_timestamp_ind  (cost=0.00..8077.94 rows=104502 width=0)
                                Index Cond: ((public.rb.time_stamp >= '2010-03-14'::date) AND (public.rb.time_stamp < '2010-03-15 00:00:00'::timestamp without time zone))
                    ->  Index Scan using rba_timestamp_ind on rb__archive  (cost=0.00..10.42 rows=1 width=73)
                          Index Cond: ((rb__archive.time_stamp >= '2010-03-14'::date) AND (rb__archive.time_stamp < '2010-03-15 00:00:00'::timestamp without time zone))
                          Filter: ((rb__archive.channel_id = 1::numeric) AND (rb__archive.platform_id = 262::numeric))

I don't know what other information I may glean for this. Any thoughts?

Thank you,
Herouth
Re: [GENERAL] stopping processes, preventing connections
On Mar 3, 2010, at 18:01 , Josh Kupershmidt wrote: > > On Wed, Mar 3, 2010 at 8:31 AM, Herouth Maoz wrote: > > First, the easy part - regarding allowing/disallowing queries. Is it possible > to GRANT or REVOKE access to tables based on the originating IP? > > I'd suggest separating out access to your tables by roles, and then > restricting those roles to certain IP ranges in pg_hba.conf. Thank you. I guess I will go for something simple - I'll give the lady in charge of the reports machine a new user/password to use, and revoke that user's access. I was hoping to avoid her needing to change settings in Windows, but it seems to be the easiest way. > > Second, and the more complicated one - what do I do about rogue queries that > are running when my process starts? Today we had a query that ran since > yesterday. I called pg_cancel_backend() on it several times and waited for > almost two hours - to no avail. Eventually I had to ask our sysadmin to > shutdown PostgreSQL, which took some five minutes, but eventually worked. Is > there a way to do the same thing to a single process without shutting down > the whole server, and without causing any harm to the database or memory > corruption? Something I can call from within SQL? I run the nightly script > from a linux user which is not "postgres", so I'd prefer a way that doesn't > require using "kill". > > > On 8.4, you can use pg_terminate_backend(), which sends a SIGTERM instead of > pg_cancel_backend's SIGINT. If you're not on 8.4, you can just do this > manually with a "kill -SIGTERM backend_pid". If that doesn't work either, you > might have to resort to a "kill -SIGKILL backend_pid". Killing a single > backend should be much better for you than restarting Postgres entirely. > These operations shouldn't result in database corruption. > > You have to be database superuser to use pg_cancel_backend() or > pg_terminate_backend(), or have a shell login as the database user to use > "kill". No way around that for now. Unfortunately, we only have 8.3 - is SIGTERM safe on 8.3? I guess I'll have to sudo or use local ssh. > > Though next time you see a query which doesn't respond to > pg_cancel_backend(), try gathering information about the query and what the > backend is doing; either you're doing something unusual (e.g. an app is > restarting the query automatically after getting canceled) or perhaps you've > stumbled on a bug in Postgres. I'd appreciate it if you tell me what to look for. It was running a join on several tables, but nothing too complicated. It may be that the query is not optimized (one of the tables is not indexed properly) but it still should respond to cancel - shouldn't it? Thank you very much, Herouth
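For the record, once on 8.4 or later the whole cleanup can stay inside SQL - a sketch, with the role name standing in for whatever the reporting system connects as:

SELECT pg_terminate_backend(procpid)
  FROM pg_stat_activity
 WHERE usename = 'crystal_user'          -- hypothetical reporting role
   AND procpid <> pg_backend_pid();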
[GENERAL] stopping processes, preventing connections
Hi. I'm continuing on with the problems I have in our reports/data warehouse system. Basically, the system brings in tables from our various production systems (sybase, postgresql, mssql, different servers) every night. Some tables are brought in whole, and some are brought in based on a date field, and only the relevant interval is imported. For tables which are brought whole, I first truncate the local table, then copy in the up-to-date data. For the ones that are brought partially, I delete partially first, and then copy in the same way. The trouble is that sometimes there is a stray select which has been initiated and then abandoned (without cancellation) by the crystal reports system. When these queries happen to last into the night, they lock some of the tables which are supposed to be truncated. Then the whole process hangs until the query quits or dies, which, we have seen in the past, can take several hours sometimes. What I want to do is write a script that kills any queries or connections from the crystal system, and then prevents new queries from being ran, until I finish loading all the tables, at which point I want to allow queries again. First, the easy part - regarding allowing/disallowing queries. Is it possible to GRANT or REVOKE access to tables based on the originating IP? Second, and the more complicated one - what do I do about rogue queries that are running when my process starts? Today we had a query that ran since yesterday. I called pg_cancel_backend() on it several times and waited for almost two hours - to no avail. Eventually I had to ask our sysadmin to shutdown PostgreSQL, which took some five minutes, but eventually worked. Is there a way to do the same thing to a single process without shutting down the whole server, and without causing any harm to the database or memory corruption? Something I can call from within SQL? I run the nightly script from a linux user which is not "postgres", so I'd prefer a way that doesn't require using "kill". Thank you, Herouth Maoz
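On the access-control half of the question, a hedged sketch of the role-per-origin approach (role name, database name and addresses are invented, and it assumes CONNECT on the database is not separately granted to PUBLIC):

-- give the Crystal machine its own login role
CREATE ROLE crystal_reports LOGIN PASSWORD 'secret';
-- in pg_hba.conf, pin that role to the reporting host, e.g.:
--   host  reports  crystal_reports  192.168.34.0/24  md5
-- around the nightly load, shut the door and reopen it afterwards
-- (existing sessions are unaffected; those still need to be killed separately):
REVOKE CONNECT ON DATABASE reports FROM crystal_reports;
GRANT CONNECT ON DATABASE reports TO crystal_reports;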
Re: [GENERAL] Questions about connection clean-up and "invalid page header"
Greg Stark wrote: On Mon, Jan 25, 2010 at 11:37 AM, Herouth Maoz wrote: The tcp_keepalive setting would only come into play if the remote machine crashed or was disconnected from the network. That's the situation I'm having, so it's OK. Crystal, being a Windows application, obviously runs on a different server than the database itself, so the connection between them is TCP/IP, not Unix domain sockets. The unix socket api is used for both unix domain sockets and internet domain sockets. The point is that in the api there's no way to find out about a connection the other side has closed except for when you write or read from it or when you explicitly check. And furthermore, that was exactly the problem as I described it - the fact that the third party software, instead of somehow instructing Crystal to send a cancel request to PostgreSQL, instead just kills the client process on the Windows side. Killing the client process doesn't mean the machine has crashed or been disconnected from the network. I'm assuming Crystal isn't crashing the machine just to stop the report... And even if it did and tcp_keepalives kicked in the server *still* wouldn't notice until it checked or tried to read or write to that socket. Well, I assume by the fact that eventually I get an "Unexpected end of file" message for those queries, that something does go in and check them. Do you have any suggestion as to how to cause the postgresql server to do so earlier? Herouth
Re: [GENERAL] Questions about connection clean-up and "invalid page header"
Greg Stark wrote: On Mon, Jan 25, 2010 at 8:15 AM, Scott Marlowe wrote: Is there a parameter to set in the configuration or some other means to shorten the time before an abandoned backend's query is cancelled? You can shorten the tcp_keepalive settings so that dead connections get detected faster. This won't help. The TCP connection is already being closed (or I think only half-closed). The problem is that in the Unix socket API you don't find out about that unless you check or try to read or write to it. The tcp_keepalive setting would only come into play if the remote machine crashed or was disconnected from the network. That's the situation I'm having, so it's OK. Crystal, being a Windows application, obviously runs on a different server than the database itself, so the connection between them is TCP/IP, not Unix domain sockets. And furthermore, that was exactly the problem as I described it - the fact that the third party software, instead of somehow instructing Crystal to send a cancel request to PostgreSQL, instead just kills the client process on the Windows side. Herouth
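For reference, the server-side keepalive settings Scott refers to have been exposed as postgresql.conf parameters since 8.1. The values below are purely illustrative (zero means "use the operating system default"), and, as noted above, a backend that never touches its socket still won't notice a dead connection until it does:

    # postgresql.conf -- illustrative values only
    tcp_keepalives_idle = 300       # seconds of idle time before the first keepalive probe
    tcp_keepalives_interval = 30    # seconds between probes
    tcp_keepalives_count = 3        # unanswered probes before the connection is declared dead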
Re: [GENERAL] Questions about connection clean-up and "invalid page header"
Scott Marlowe wrote: You can shorten the tcp_keepalive settings so that dead connections get detected faster. Thanks, I'll ask my sysadmin to do that. Might be, but not very likely. I and many others run pgsql in production environments where it handles thousands of updates / inserts per minute with no corruption. We run on server class hardware with ECC memory and large RAID arrays with no corruption. Someone pointed out to me, though, that comparing data warehouse systems to production systems is like Apples and Oranges - we also have a production system that, as you say, makes millions of inserts and updates per hour. It works very well with PostgreSQL - a lot better than with Sybase with which we worked previously. But the reports system on which I work makes bulk inserts using calculations based on complicated joins and each transaction is long and memory-consuming, as opposed to the production system, where each transaction takes a few milliseconds and is cleared immediately. So far this only happened to me in the development server, and if it really is a matter of hardware, I'm not worried. What I am worried is if there really is some sort of bug that may carry to our production reports system. Have you run something as simple as memtest86+ on your machine to see if it's got bad memory? I'll tell my sysadmin to do that. Thank you. We are currently using PostgreSQL v. 8.3.1 on the server side. You should really update to the latest 8.3.x version (around 8.3.8 or so). It's simple and easy, and it's possible you've hit a bug in an older version of 8.3. OK, I'll also try to get that done. Thanks for your help, Herouth
[GENERAL] Questions about connection clean-up and "invalid page header"
Hi Everybody. I have two questions.

1. We have a system that is accessed by Crystal Reports, which is in turn controlled by another (3rd party) system. Now, when a report takes too long or the user cancels it, it doesn't send a cancel request to Postgres. It just kills the Crystal process that works on it. As a result, the query is left alive on the Postgres backend. Eventually I get the message "Unexpected End of file" and the query is cancelled. But this doesn't happen soon enough for me - these are usually very heavy queries, and I'd like them to be cleaned up as soon as possible if the client connection has ended. Is there a parameter to set in the configuration or some other means to shorten the time before an abandoned backend's query is cancelled?

2. I get the following message in my development database:

vacuumdb: vacuuming of database "reports" failed: ERROR: invalid page header in block 6200 of relation "rb"

I had this already a couple of months ago. Looking around the web, I saw this error is supposed to indicate a hardware error. I informed my sysadmin, but since this is just the dev system and the data was not important, I did a TRUNCATE TABLE on the "rb" relation, and the errors stopped... But now the error is back, and I'm a bit suspicious. If this is a hardware issue, it's rather suspicious that it returned in the exact same relation after I did a "truncate table". I have many other relations in the system, ones that fill up a lot faster. So I suspect this might be a PostgreSQL issue after all. What can I do about this?

We are currently using PostgreSQL v. 8.3.1 on the server side.

TIA,
Herouth
-- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Slow update
Alban Hertroys wrote: > On Feb 9, 2009, at 2:07 PM, Grzegorz Jaśkiewicz wrote: > >> On Mon, Feb 9, 2009 at 12:50 PM, Herouth Maoz >> wrote: >>> I hope someone can clue me in based on the results of explain analyze. >> >> Did you have a chance to run vmstat on it, and post it here ? Maybe - >> if db resides on the same disc with everything else, something >> (ab)uses that much io, and it has to wait. >> Also, I don't know - but personaly I didn't like the line in explain: >> >> -> Bitmap Index Scan on billing_msisdn_sme_reference >> (cost=0.00..24.70 rows=389 width=0) (actual time=2 >> 1.418..21.418 rows=252 loops=151332) >>Index Cond: ((b.msisdn)::text = >> (("substring"((rb.msisdn)::text, 1, 0) || '972'::text) || "substrin >> g"((rb.msisdn)::text, 2))) >> >> But the cost is next to none, so that's not it. > > > Actually, it's inside a nested loop and if I read correctly it gets > looped over 151332 times. That means it takes 151332 * (21.418 - > 1.418) = 3026640 ms, which is almost 12% of the total time. > > The biggie seems to be the bitmap heap scan on rb though. The row > estimates for that one are way off (estimated 549 rows vs actual 151332). > > Alban Hertroys > To be quite honest, I'm not sure exactly how to read this plan, and what the row values mean. The issue here is that sometimes the query works in reasonable time, and sometimes it takes half a day. Of course, this may be because the data size is different, but I don't know which part of the plan tells me that. What do rows vs. loops signify? How can the estimate be so far off if I'm running analyze on all the tables right after I make any big updates to them? I find it hard to believe that the problem is with the complex comparison caused by the different formats of the fields in rb and in billing. This should add a constant multiplier to the time it takes to run the query, but not cause the query to run one time in 5 minutes, and the next day in 12 hours! Thanks for the assistance. Herouth Here is the plan collected tonight, which took a reasonable amount of time (5 minutes). 
It seems to be a totally different plan, isn't it?: QUERY PLAN - Merge Join (cost=1157750.08..1167132.31 rows=1 width=210) (actual time=238247.983..239980.264 rows=111676 loops=1) Merge Cond: ((rb.reference = b.user_reference) AND ((rb.sms_user)::text = (b.user_id)::text) AND "substring"((rb.msisdn)::text, 1, 0) || '972'::text) || "substring"((rb.msisdn)::text, 2))) = (b.msisdn)::text)) -> Sort (cost=31137.76..31141.31 rows=1423 width=198) (actual time=117858.431..117932.544 rows=111676 loops=1) Sort Key: rb.reference, rb.sms_user, ((("substring"((rb.msisdn)::text, 1, 0) || '972'::text) || "substring"((rb.msisdn)::text, 2))) Sort Method: quicksort Memory: 30941kB -> Bitmap Heap Scan on rb (cost=26968.95..31063.23 rows=1423 width=198) (actual time=113615.187..116935.502 rows=111676 loops=1) Recheck Cond: ((delivered = 0) AND (time_stamp >= '2009-02-12 00:00:00'::timestamp without time zone)) Filter: ((NOT mo_billed) AND (system_id <> 6)) -> BitmapAnd (cost=26968.95..26968.95 rows=1423 width=0) (actual time=113454.761..113454.761 rows=0 loops=1) -> Bitmap Index Scan on rb_delivered_ind (cost=0.00..2522.46 rows=69896 width=0) (actual time=9358.397..9358.397 rows=150651 loops=1) Index Cond: (delivered = 0) -> Bitmap Index Scan on rb_timestamp_ind (cost=0.00..24445.53 rows=213475 width=0) (actual time=104091.620..104091.620 rows=303308 loops=1) Index Cond: (time_stamp >= '2009-02-12 00:00:00'::timestamp without time zone) -> Sort (cost=1117952.26..1120779.49 rows=1130889 width=50) (actual time=119485.709..120263.045 rows=756135 loops=1) Sort Key: b.user_reference, b.user_id, b.msisdn Sort Method: external sort Disk: 60976kB -> Bitmap Heap Scan on billing b (cost=36754.98..1004246.88 rows=1130889 width=50) (actual time=24409.448..101034.765 rows=896474 loops=1) Recheck Cond: (time_arrived >= '2009-02-12 00:00:00'::timestamp without time zone) -> Bitmap Index Scan on billing_time_arrived (cost=0.00..36472.26 rows=1130889 width=0) (actual time=23936.245..23936.245 rows=1166881 loops=1) Index Cond: (time_arrived >= '2009-02-12 00:00:00'::timestamp without time zone) Total runtime: 307958.152 ms (21 rows) -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Slow update
Grzegorz Jaśkiewicz wrote: > On Mon, Feb 9, 2009 at 12:50 PM, Herouth Maoz wrote: > >> I hope someone can clue me in based on the results of explain analyze. >> > > Did you have a chance to run vmstat on it, and post it here ? Maybe - > if db resides on the same disc with everything else, something > (ab)uses that much io, and it has to wait. I haven't ran it, but it's worth mentioning that the data is on a separate device than the system/users. Also, the system is used only for the reporting system. Other than PostgreSQL and the cron job that runs this query, nothing runs on the system at night. Reports start being produced around 7:00 AM which should give the query ample time to finish (almost 3 hours) before any contention starts. Herouth
Re: [GENERAL] Slow update
Filip Rembiałkowski wrote: > > 2009/1/21 Herouth Maoz <mailto:hero...@unicell.co.il>> > > Hello. > > I have a daily process that synchronizes our reports database from > our production databases. In the past few days, it happened a > couple of times that an update query took around 7-8 hours to > complete, which seems a bit excessive. This is the query: > > UPDATE rb > SET service = b.service, > status = b.status, > has_notification = gateway_id NOT IN (4,101,102), > operator = COALESCE( > b.actual_target_network_id, > b.requested_target_network_id > ) > FROM sms.billing b > WHERE b.time_arrived >= :date_start > AND rb.time_stamp >= :date_start > AND rb.delivered = 0 > AND rb.sms_user = b.user_id > AND rb.reference = b.user_reference > AND OVERLAY( rb.msisdn placing '972' from 1 for 1 ) = b.msisdn > AND NOT mo_billed > AND system_id <> 6 -- Exclude Corporate, as it aleady has > service/status > ; > > The variable ":date_start" is set to a date 3 days ago. > > I ran explain for this query and it gave me this: > > > > -- > Nested Loop (cost=21567.12..854759.82 rows=1 width=210) >Join Filter: (((rb.sms_user)::text = (b.user_id)::text) AND > (rb.reference = b.user_reference)) >-> Bitmap Heap Scan on rb (cost=21546.02..23946.16 rows=819 > width=198) > Recheck Cond: ((delivered = 0) AND (time_stamp >= > '2009-01-18 00:00:00'::timestamp without time zone) AND > (time_stamp < '2009-01-21 00:00:00'::timestamp without time zone)) > Filter: ((NOT mo_billed) AND (system_id <> 6)) > -> BitmapAnd (cost=21546.02..21546.02 rows=819 width=0) >-> Bitmap Index Scan on rb_delivered_ind > (cost=0.00..1419.99 rows=45768 width=0) > Index Cond: (delivered = 0) >-> Bitmap Index Scan on rb_timestamp_ind > (cost=0.00..20125.37 rows=188994 width=0) > Index Cond: ((time_stamp >= '2009-01-18 > 00:00:00'::timestamp without time zone) AND (time_stamp < > '2009-01-21 00:00:00'::timestamp without time zone)) >-> Bitmap Heap Scan on billing b (cost=21.10..1004.77 > rows=351 width=49) > Recheck Cond: ((b.msisdn)::text = > (("substring"((rb.msisdn)::text, 1, 0) || '972'::text) || > "substring"((rb.msisdn)::text, 2))) > Filter: ((b.time_arrived >= '2009-01-18 > 00:00:00'::timestamp without time zone) AND (b.time_arrived < > '2009-01-21 00:00:00'::timestamp without time zone)) > -> Bitmap Index Scan on billing_msisdn_sme_reference > (cost=0.00..21.10 rows=351 width=0) >Index Cond: ((b.msisdn)::text = > (("substring"((rb.msisdn)::text, 1, 0) || '972'::text) || > "substring"((rb.msisdn)::text, 2))) > > I'm not an expert on reading plans, but it seems to me that it > uses indices on both tables that participate in this query, so it > shouldn't take such a long time. > > The number of records in the table rb for the past three days is > 386833. On the sms.billing table it seems to select the index on > the msisdn and sme_reference fields and use it partially (only > using the msisdn field). Looking at that table, the frequency of > each value in the msisdn field is at most 17678 for the current > data, where mostly it's a couple of thousands. How can this take > so long? > > > 1. which postgres version? > 2. can you post results of EXPLAIN ANALYZE (please note it actually > executes the query)? I'm sorry it took some time to answer these questions - as I explained, I needed an opportunity to make the change in our production machine since the data in the development machine wouldn't do at all. So the answer is PostgreSQL v. 8.3.1. The output of explain analyze is at the end of this message. 
Note that the run started at 04:20:50, and finished at 11:29:30. Also, a full vacuum was ran on the entire database a day before, and I run analyze on each table whenever there is a bulk insert or update to it - and the only way data comes in is in bulks. Could the delay have
Re: [GENERAL] Slow update
Grzegorz Jaśkiewicz wrote: > On Wed, Jan 21, 2009 at 12:55 PM, Herouth Maoz wrote: > >> Well, if it executes the query it's a problem. I might be able to do so >> during the weekend, when I can play with the scripts and get away with >> failures, but of course there is less data in the tables then. >> >> > > you should seirously think about having test machine > > I have a test machine - but the data in there is test data, and it's a slower machine. A testing environment is good for development, but can hardly be used to really simulate the production machine for performance. Herouth
Re: [GENERAL] Slow update
Filip Rembiałkowski wrote: > > 1. which postgres version? 8.3.1 > 2. can you post results of EXPLAIN ANALYZE (please note it actually > executes the query)? > Well, if it executes the query it's a problem. I might be able to do so during the weekend, when I can play with the scripts and get away with failures, but of course there is less data in the tables then. Thanks, Herouth -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Slow update
Marc Mamin wrote: > Hello, > > - did you vacuum your tables recently ? > > - What I miss in your query is a check for the rows that do not need > to be udated: > > AND NOT (service = b.service >AND status = b.status > AND has_notification = gateway_id NOT IN (4,101,102) > AND operator = COALESCE( b.actual_target_network_id, > b.requested_target_network_id ) > > > depending on the fraction of rows that are already up to date, the > might fasten your process quite a lot... I don't see why it would. As far as I know, the high saving in update time is done by using the indices. All the other conditions that are not on indices are all checked using a sequential scan on the rows that were brought from the index, so adding more conditions wouldn't make this a lot faster - maybe even slower because more comparisons are made. In any case, the logic of the database is that the records that have delivered = 0 are always a subset of the records that are changed in this query, so querying on delivered=0 - which is an indexed query - actually make the above redundant. Thanks for your response, Herouth
[GENERAL] Slow update
Hello. I have a daily process that synchronizes our reports database from our production databases. In the past few days, it happened a couple of times that an update query took around 7-8 hours to complete, which seems a bit excessive. This is the query: UPDATE rb SET service = b.service, status = b.status, has_notification = gateway_id NOT IN (4,101,102), operator = COALESCE( b.actual_target_network_id, b.requested_target_network_id ) FROM sms.billing b WHERE b.time_arrived >= :date_start AND rb.time_stamp >= :date_start AND rb.delivered = 0 AND rb.sms_user = b.user_id AND rb.reference = b.user_reference AND OVERLAY( rb.msisdn placing '972' from 1 for 1 ) = b.msisdn AND NOT mo_billed AND system_id <> 6 -- Exclude Corporate, as it aleady has service/status ; The variable ":date_start" is set to a date 3 days ago. I ran explain for this query and it gave me this: -- Nested Loop (cost=21567.12..854759.82 rows=1 width=210) Join Filter: (((rb.sms_user)::text = (b.user_id)::text) AND (rb.reference = b.user_reference)) -> Bitmap Heap Scan on rb (cost=21546.02..23946.16 rows=819 width=198) Recheck Cond: ((delivered = 0) AND (time_stamp >= '2009-01-18 00:00:00'::timestamp without time zone) AND (time_stamp < '2009-01-21 00:00:00'::timestamp without time zone)) Filter: ((NOT mo_billed) AND (system_id <> 6)) -> BitmapAnd (cost=21546.02..21546.02 rows=819 width=0) -> Bitmap Index Scan on rb_delivered_ind (cost=0.00..1419.99 rows=45768 width=0) Index Cond: (delivered = 0) -> Bitmap Index Scan on rb_timestamp_ind (cost=0.00..20125.37 rows=188994 width=0) Index Cond: ((time_stamp >= '2009-01-18 00:00:00'::timestamp without time zone) AND (time_stamp < '2009-01-21 00:00:00'::timestamp without time zone)) -> Bitmap Heap Scan on billing b (cost=21.10..1004.77 rows=351 width=49) Recheck Cond: ((b.msisdn)::text = (("substring"((rb.msisdn)::text, 1, 0) || '972'::text) || "substring"((rb.msisdn)::text, 2))) Filter: ((b.time_arrived >= '2009-01-18 00:00:00'::timestamp without time zone) AND (b.time_arrived < '2009-01-21 00:00:00'::timestamp without time zone)) -> Bitmap Index Scan on billing_msisdn_sme_reference (cost=0.00..21.10 rows=351 width=0) Index Cond: ((b.msisdn)::text = (("substring"((rb.msisdn)::text, 1, 0) || '972'::text) || "substring"((rb.msisdn)::text, 2))) I'm not an expert on reading plans, but it seems to me that it uses indices on both tables that participate in this query, so it shouldn't take such a long time. The number of records in the table rb for the past three days is 386833. On the sms.billing table it seems to select the index on the msisdn and sme_reference fields and use it partially (only using the msisdn field). Looking at that table, the frequency of each value in the msisdn field is at most 17678 for the current data, where mostly it's a couple of thousands. How can this take so long? Thanks, Herouth
Re: [GENERAL] Copy/delete issue
Adrian Klaver wrote:
> On Sunday 21 December 2008 1:49:18 am Herouth Maoz wrote:
>> Adrian Klaver wrote:
>>> Are you sure the problem is not in "$datefield" = "*" . That the script
>>> that formats the data file is not correctly adding "*" to the right file.
>>> Seems almost like sometimes the second CMD is being run against the table
>>> that the first CMD should be run on. In other words it is not doing a
>>> complete delete, but a date based one, and you then import duplicate
>>> records.
>>
>> Thanks for your reply. The file containing the tables list is static -
>> it doesn't change from one run to the next (unless I edit it personally).
>
> Well something is not static :) You mentioned this happens only with one
> table. Have you tried running your procedure against that table only?

Well, every time this happens, I re-run the procedure, with all the lines in the data files up to the given table deleted. And it works. Then I restore the original data file. And the next day it works. It only happens once in a while.

> Just because a DELETE did not error does not mean it succeeded in the way you
> wanted. You might want to throw a count() in the mix to see if you are really
> clearing out the table the way you want to.

I wonder if there is a way to use the result of "count()" in \echo...

> Also is the actual data file static from one run to the next?

If you mean the data file that contains the list of tables, then yes. If you mean the data in the table itself, then no, the data changes - new records are added and old ones are updated.

> Would also help to see the schema for the table involved and maybe a sample
> of the data, if that is possible.

A sample of the data would be a bit tricky, as this is customers' private information. But the table schema is:

CREATE TABLE web1010.users (
    user_id                     CHAR(32)     PRIMARY KEY NOT NULL,
    whitelabel_id               NUMERIC(21)  NOT NULL,
    username                    VARCHAR(30)  NOT NULL,
    password                    CHAR(32)     NOT NULL,
    perms                       VARCHAR(255) NOT NULL,
    first_name                  VARCHAR(40)  NULL,
    last_name                   VARCHAR(40)  NULL,
    total_points                INTEGER      DEFAULT 0 NOT NULL,
    date_created                TIMESTAMP    NOT NULL,
    date_birth                  TIMESTAMP    NULL,
    gender                      INTEGER      NULL,
    city_id                     NUMERIC(21)  NULL,
    is_active                   SMALLINT     NOT NULL,
    email                       VARCHAR(255) NULL,
    subscriptin_id              NUMERIC(21)  NULL,
    subscriptin_num_of_msg      INTEGER      NULL,
    subscriptin_date_start      TIMESTAMP    NULL,
    subscriptin_sent_datetime   TIMESTAMP    NULL,
    subscriptin_credit_left     INTEGER      NULL,
    subscriptin_status          INTEGER      NULL,
    subscriptin_sent_reference  NUMERIC(21)  NULL,
    first_time_subscribed       VARCHAR(10)  NULL,
    sms_credit                  INTEGER      NULL,
    reg_pid                     NUMERIC(21)  NULL,
    spam_fl                     SMALLINT     NULL,
    constraint PK_USERS unique (whitelabel_id, username)
);

I suppose this doesn't happen with other tables in the process, because most other tables don't have two unique constraints in them - most only have the primary key. But still, if everything is deleted from the table, this should not be an issue... I might take Dennis Brakhane's advice and replace the DELETE command with TRUNCATE, as I see no harm in doing so. Nevertheless, DELETE should either work or fail saying "could not delete because...". Otherwise PostgreSQL is not very reliable...

Thanks,
Herouth
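One way to act on the count() suggestion without psql variable tricks is to capture the count at the shell level. This is only a sketch, and it assumes $PSQLCMD, errexit and $LOAD_LOG behave as in the load script quoted later in this thread:

    # Hypothetical check: abort the load if the table still has rows after the DELETE.
    count=$($PSQLCMD -Atc "SELECT count(*) FROM web1010.users")
    if [ "$count" != "0" ]
    then
        errexit "web1010.users still has $count rows after DELETE" 1 >> $LOAD_LOG
    fi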
Re: [GENERAL] Copy/delete issue
Adrian Klaver wrote: > > > Are you sure the problem is not in "$datefield" = "*" . That the script that > formats the data file is not correctly adding "*" to the right file. Seems > almost like sometimes the second CMD is being run against the table that the > first CMD should be run on. In other words it is not doing a complete > delete , but a date based one, and you then import duplicate records. > > Thanks for your reply. The file containing the tables list is static - it doesn't change from one run to the next (unless I edit it personally). Herouth -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Copy/delete issue
I have a strange situation that occurs every now and again. We have a reports system that gathers all the data from our various production systems during the night, where we can run heavy reports on it without loading the production databases. I have two shell scripts that do this nightly transfer of data. The production database is Sybase.

So I have a shell script that scans a list of tables and databases and dumps them into a format suitable for postgres COPY. After it dumps everything, another shell script scans the same list, and loads each dump file into the proper table.

The shell script first runs psql with a DELETE command. For transaction tables (ones where data accumulates by date) the records for two days are deleted, and for non-transaction tables (ones that have records that might change but don't accumulate based on time) it's DELETE without WHERE. I run psql with ON_ERROR_STOP and check the exit status. If the DELETE failed, I should get an error status, so I do not proceed to the copy.

Then I run psql again, with ON_ERROR_STOP, and run a \copy command that loads the data to the same table. For some reason, once in a while, that fails. Always on the same table - violating the unique constraint of the primary key. Now, this is impossible because there was a successful delete beforehand, as I said, and the data comes from a database where that same primary key is enforced. Moreover, when I re-run the script, everything runs fine. This happens at least once a week - always with the same table.

Can anybody think of a reason why psql will not report an error on deletion? Or why it would tell me that a constraint has been violated when loading the same data 5 minutes later works fine?

Thanks,
Herouth

Here is the relevant shell code (the relevant table has "*" in the file for datefield):

# The names of the tables are stored in a text file
exec 4<$TABLES_FILE
dstamp N "Starting postgres load" >> $LOAD_LOG
while read -u 4 ignored1 ignored2 local_table datefield
do
    dstamp N "Now loading $local_table" >> $LOAD_LOG
    filename="$DUMPDIR/$local_table.tsv"

    # Stop if the dump file does not exist.
    if [ ! -f "$filename" ]
    then
        errexit "Dump file not found for table: $local_table" 1 >> $LOAD_LOG
    fi

    # If the datefield contains "*", it means the table contents are fully
    # replaced, otherwise use this as the field on which to limit the deletion.
    if [ "$datefield" = "*" ]
    then
        CMD="DELETE FROM $local_table"
    else
        CMD="DELETE FROM $local_table WHERE $datefield >= current_date - 2"
    fi

    # Run the deletion command
    echo -e "set ON_ERROR_STOP\\n$CMD;" | $PSQLCMD -q -f - > $TMPFILE 2>&1

    # Report errors and stop the loop if any occurred
    rc=$?
    if [ "$rc" != "0" ]
    then
        # Copy the error output, properly formatted, to the log file
        sed "s/^/$(date +%Y-%m-%d%t%T)E/" $TMPFILE >> $LOAD_LOG
        # Send mail message about the failure
        rm -f $TMPFILE
        errexit "Deletion failed with status $rc on table: $local_table" $rc >> $LOAD_LOG
    fi

    # Now run the load command
    echo -e "set ON_ERROR_STOP\\ncopy $local_table from $filename" | $PSQLCMD -q -f - > $TMPFILE 2>&1
    rc=$?

    # Check for errors and report
    if [ "$rc" != "0" ]
    then
        # Copy the error output, properly formatted, to the log file
        sed "s/^/$(date +%Y-%m-%d%t%T)E/" $TMPFILE >> $LOAD_LOG
        # Send mail message about the failure
        rm -f $TMPFILE
        errexit "Copy failed with status $rc on table: $local_table" $rc >> $LOAD_LOG
    fi

    # Remove the dump file, as well as the output file from the psql command
    rm -f "$filename"

    # Update statistics with the ANALYZE command
    dstamp N "Updating statistics for $local_table" >> $LOAD_LOG
    echo -e "set ON_ERROR_STOP\\nANALYZE $local_table;" | $PSQLCMD -q -f - > $TMPFILE 2>&1

    # Report errors and stop the loop if any occurred
    rc=$?
    if [ "$rc" != "0" ]
    then
        # Copy the error output, properly formatted, to the log file
        sed "s/^/$(date +%Y-%m-%d%t%T)E/" $TMPFILE >> $LOAD_LOG
        # Send mail message about the failure
        rm -f $TMPFILE
        errexit "ANALYZE failed with status $rc on table: $local_table" $rc >> $LOAD_LOG
    fi
done
Re: [GENERAL] char(xx) problem
At 4:02 +0200 on 17/12/1999, Gene Selkov, Jr. wrote:

> I'm just wondering: are there any alternatives to blank padding? Why
> is it done in the first place?

That's how the fixed-length char type works, since the early days of SQL. You come to expect it, which means that if you use legacy code that has a fixed-width char type, or you decided to use it for its time-saving possibilities, it should behave according to a convention which was established long ago.

What I don't get is why, given two bpchar arguments, Postgres doesn't just pad the shorter one to the length of the other and then compare, select and whatnot.

Herouth
--
Herouth Maoz, Internet developer.
Open University of Israel - Telem project
http://telem.openu.ac.il/~herouth/personal/
Re: [GENERAL] Auto Ordering
At 20:41 +0200 on 27/10/1999, Stuart Rison wrote:

> In the example you give, you could do the changes with two UPDATE
> commands:
>
> 1) UPDATE questions SET order=0 WHERE order=5;
> 2) UPDATE questions SET order=order+1 WHERE order<5;
>
> It becomes more tricky when you try and move a question to a position
> other than the first one (e.g. question #6 to move to position #3 and all
> other questions to be shifted accordingly).
>
> This would take three UPDATEs:
>
> 1) UPDATE questions SET order=0 WHERE order=6;
> 2) UPDATE questions SET order=order+1 WHERE order>=3 and order<6;
> 3) UPDATE questions SET order=3 WHERE order=0;

Here is an alternative method of thinking which I used in the past - it depends on other factors whether this is good or not.

If only the order of the questions is important, and not the actual number, then you can use fractions. You can use a floating point field, or a fixed point (numeric) one, or just an int field that normally gets the numbers 100, 200, 300... Changing order then becomes very easy:

UPDATE questions SET the_order=50 WHERE the_order=600;

will change questions 100,200,300,400,500,600,700 to 50,100,200,300,400,500,700.

From time to time, though, you will have to renumber your questions, to make sure you don't run out of fraction precision. You can do that with something like:

SELECT the_order INTO TABLE temp_numbers FROM questions ORDER BY the_order;
CREATE SEQUENCE new_seq INCREMENT 100 START 100;
UPDATE questions SET the_order = nextval( 'new_seq' )
WHERE questions.the_order = temp_numbers.the_order;
DROP SEQUENCE new_seq;
DROP TABLE temp_numbers;

The idea is to do the renumbering in batch, and have a small penalty in "real time".

Herouth
--
Herouth Maoz, Internet developer.
Open University of Israel - Telem project
http://telem.openu.ac.il/~herutma
Re: [HACKERS] Re: [GENERAL] Postgres INSERTs much slower thanMySQL?
At 17:08 +0200 on 22/10/1999, Tom Lane wrote: > In the meantime, the conventional wisdom is still that you should use > COPY, if possible, for bulk data loading. (If you need default values > inserted in some columns then this won't do...) Yes it would - in two steps. COPY to a temp table that only has the non-default columns. Then INSERT ... SELECT ... from that temp table to your "real" table. Herouth -- Herouth Maoz, Internet developer. Open University of Israel - Telem project http://telem.openu.ac.il/~herutma
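A minimal sketch of that two-step load, with hypothetical table and column names (the columns that should take their defaults are simply left out of the staging table):

    -- Staging table holds only the externally supplied column(s).
    CREATE TABLE items_load (name text);

    -- Server-side COPY needs superuser rights (or use psql's \copy from the client side).
    COPY items_load FROM '/tmp/items.txt';

    -- The defaulted columns of the real table are filled in here.
    INSERT INTO items (name)
    SELECT name FROM items_load;

    DROP TABLE items_load;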
Re: [GENERAL] Startup Script
At 20:13 +0200 on 12/10/1999, Duncan Kinder wrote:

> pg:2345:respawn:/bin/su - Postgres -c
> "/usr/local/pgsql/bin/postmaster -D/usr/local/pgsql/data >>
> /usr/local/pgsql/server.log 2>&1
>
> I would like to know how to edit this language so that Postgres will
> automatically start with the -i flag.
>
> This would enable me to work with the very interesing looking kpsql add-on,
> which will not work unless I can set this -i flag.

Just add "-i" before the "-D" in the postmaster command:

pg:2345:respawn:/bin/su - Postgres -c "/usr/local/pgsql/bin/postmaster -i -D/usr/local/pgsql/data >> /usr/local/pgsql/server.log 2>&1"

http://telem.openu.ac.il/~herutma
Re: [GENERAL] stored procedure revisited
At 09:33 +0200 on 10/10/1999, Yin-So Chen wrote: > I think I am missing something very obvious here. Can someone please > kind enough explain to me, is there SP for postgresql and where I can > find more information about it? If there isn't, is there any plan for > implementation? Seems there needs to be an extra table that would hold > the parsed query tree and a rewrite of the parser to make SP work. AFAIK, there are no stored procedures in PostgreSQL. Maybe they are in a low priority, or the developers are just trying to avoid the problems of doing them. I'm not a developer, but I get the impression that sets of rows are not well-abstracted in PostgreSQL. I think this is also the reason why there are no subselects in target lists yet (correct me if I'm wrong). Anyway, sometimes the proper solution for things you stated (i.e. return a set of rows without using the entire query every time) are more correctly done with views. Other things (procedural things that don't result in sets of rows) are handled by functions. So you have a rather small niche for which only stored procedures are the most proper tool, and which is not covered in PostgreSQL. If you think this niche is important, maybe you should convince the rest of us here (I never needed to use a stored procedure so far, and I don't remember many people using them five years ago when I was in an Oracle environment). Or you could prioritize it with money... Herouth -- Herouth Maoz, Internet developer. Open University of Israel - Telem project http://telem.openu.ac.il/~herutma
Re: [GENERAL] btree index on a char(8) field (fwd)
At 15:30 +0200 on 06/10/1999, Frank Mandarino wrote: > main=> create index ven_code_idx on vendor using btree (ven_code char_ops); > CREATE I didn't have time to check this, but the problem may be caused by the incorrect ops you are using. For char(N) it should be bpchar_ops, not char_ops. Herouth -- Herouth Maoz, Internet developer. Open University of Israel - Telem project http://telem.openu.ac.il/~herutma
Re: [GENERAL] Foreign Key
At 01:10 +0200 on 06/10/1999, Howie wrote: > for now, refint ( $PGSQL_SRC_ROOT/contrib/spi/ ) is what one should be > using for foreign keys. requires two triggers, one on the parent and one > on the child. works nicely. Does it? I was under the impression that it supported cascading deletes but not cascading updates. Herouth -- Herouth Maoz, Internet developer. Open University of Israel - Telem project http://telem.openu.ac.il/~herutma
[GENERAL] Re: [INTERFACES] Q: Mac: Openlink->Linux: Openlink-PostgreSQL
At 09:12 +0200 on 22/09/1999, Jelle Ruttenberg wrote: > - password-file: PGDATA/pg_pwd > - users in the password-file: added with CREATE USER > - the passwords aren't encrypted > - connecting the database with 'psql -u' works > - OpenLink uses the generic database-agent for PostgreSQL (95?) to > connect to the database I am a bit on shaky ground here, but I think the frontend-backend protocol changed sometime between version 6.2 and 6.3. The specific change was in fact related to password authentication. (Gurus, correct me if I'm mistaken in this). The bottom line of all this is that if you want to use passwords, you have to have a frontend-backend agent/driver/module which is compatible with the new protocol. If you mentioned Postgre 95, it's probably an old, old agent. Herouth -- Herouth Maoz, Internet developer. Open University of Israel - Telem project http://telem.openu.ac.il/~herutma
Re: [GENERAL] Problem connecting NT-psqlODBC to Linux-PostgreSQL
At 12:37 +0200 on 07/09/1999, Teodor Cimpoesu wrote: > Second, hmm, your port is 113 but I know the default is 5432. > If nobody is listening to a port where you are trying to connect, you > get the same respone (connection refused). No, you got that wrong. Port 113 is on the CLIENT side. It goes like this: * NT tries to connect to port 5432 on Postgres server * The pg_hba.conf defines the authentication for this client as "ident" * Therefore Postgres server tries to connect to the ident server on the NT. This is done through port 113 on the NT * Postgres doesn't find anybody on the NT listening to that port. That means no ident server is running on the NT. * Connection is therefore refused because user could not be authenticated. Herouth -- Herouth Maoz, Internet developer. Open University of Israel - Telem project http://telem.openu.ac.il/~herutma
Re: [GENERAL] Problem connecting NT-psqlODBC to Linux-PostgreSQL
At 11:31 +0200 on 07/09/1999, Jelle Ruttenberg wrote: > Unable to connect to Ident server on the host which is trying to connect > to Postgres (IP address 212.52.7.241, Port 113). errno = Connection > refused (111) This seems to indicate that your Postgres authentication method is "ident", rather than "trust" or "password". It needs an ident server to be running on the client computer, to authenticate that the user making the connection is indeed who he claims to be. That is, on your NT. Either choose a different authentication method, or run an ident server on the NT (is there such a beast?). Herouth -- Herouth Maoz, Internet developer. Open University of Israel - Telem project http://telem.openu.ac.il/~herutma
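For illustration, a pg_hba.conf entry using password authentication instead of ident might look like the sketch below. The client address is hypothetical, and the column layout shown is the old one used by 6.x releases; later releases add a separate user column and accept CIDR notation instead of a netmask:

    # host  database  client-address  netmask          auth-method
    host    all       192.0.2.5       255.255.255.255  password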
Re: [GENERAL] GEQO and KSQO problem.
At 17:31 +0300 on 06/09/1999, Natalya S. Makushina wrote: > > I can put the part of query outside the parentheses. ... > But if i increased number of "OR" in query, the server was down and >worked very,very slowly. I can't see any rezult from optimizer. > It's very strange thing! The postgres optimizer was never very good with OR clauses. That's why I suggested the format with the regular expressions in the end of my message. I wonder if you can get a better result by using a union: SELECT . WHERE AND lower(SOTRUD.EMAIL) LIKE '[EMAIL PROTECTED]%' UNION SELECT . WHERE AND lower(SOTRUD.EMAIL) LIKE '[EMAIL PROTECTED]%' ... etc. Also try UNION ALL. Herouth -- Herouth Maoz, Internet developer. Open University of Israel - Telem project http://telem.openu.ac.il/~herutma
Re: [GENERAL] Get TRANSACTION LEVEL ?
At 11:56 +0300 on 05/09/1999, Alois Maier wrote: >I know that I can set the transaction level with the SET TRANSACTION >LEVEL statement. How can I get the transaction level from SQL ? Normally, the counterpart of SET is SHOW. Did you try SHOW TRANSACTION LEVEL? Herouth -- Herouth Maoz, Internet developer. Open University of Israel - Telem project http://telem.openu.ac.il/~herutma
Re: [GENERAL] GEQO and KSQO problem.
At 11:45 +0300 on 02/09/1999, Natalya S. Makushina wrote:

> where CLIENTS.CLIENTID=SOTRUD.CLIENTID and
> ( CLIENTS.PRINADL=PRINADLEG.PRINADL and CLIENTS.FLG_MY and (NOT CLIENTS.ARH) and lower(SOTRUD.EMAIL) LIKE lower('[EMAIL PROTECTED]%')
> OR CLIENTS.PRINADL=PRINADLEG.PRINADL and CLIENTS.FLG_MY and (not CLIENTS.ARH) and lower(SOTRUD.EMAIL) LIKE lower('[EMAIL PROTECTED]%')
> OR CLIENTS.PRINADL=PRINADLEG.PRINADL and CLIENTS.FLG_MY and (not CLIENTS.ARH) and lower(SOTRUD.EMAIL) LIKE lower('[EMAIL PROTECTED]%')
> OR CLIENTS.PRINADL=PRINADLEG.PRINADL and CLIENTS.FLG_MY and (not CLIENTS.ARH) and lower(SOTRUD.EMAIL) LIKE lower('[EMAIL PROTECTED]%')
> OR CLIENTS.PRINADL=PRINADLEG.PRINADL and CLIENTS.FLG_MY and (not CLIENTS.ARH) and lower(SOTRUD.EMAIL) LIKE lower('[EMAIL PROTECTED]%')
> )
> order by CLIENTS.NEW_F, CLIENTS.NAME_1"

I wonder if this is all necessary? Can't you take the part

CLIENTS.PRINADL=PRINADLEG.PRINADL and CLIENTS.FLG_MY and (not CLIENTS.ARH)

outside the parentheses and leave only the LIKE comparisons inside?

Also, there is no point in running "lower" on a string which is known in advance to contain only lowercase letters, which is true for most literal strings (if your application creates this, you can always do the conversion on the client side before putting it into the query). It only leaks memory.

Thus, if you try to rewrite the WHERE clause as follows, do you get any improvement?

where CLIENTS.CLIENTID=SOTRUD.CLIENTID
  and CLIENTS.PRINADL=PRINADLEG.PRINADL
  and CLIENTS.FLG_MY
  and not CLIENTS.ARH
  and (   lower(SOTRUD.EMAIL) LIKE '[EMAIL PROTECTED]%'
       or lower(SOTRUD.EMAIL) LIKE '[EMAIL PROTECTED]%'
       or lower(SOTRUD.EMAIL) LIKE '[EMAIL PROTECTED]%'
       or lower(SOTRUD.EMAIL) LIKE '[EMAIL PROTECTED]%'
      )
  ...

I think the optimizer would be most happy if you avoid the OR altogether by using alternatives in a regular expression instead of LIKE. This will also allow you to use case insensitive comparison and give up the 'lower':

where CLIENTS.CLIENTID=SOTRUD.CLIENTID
  and CLIENTS.PRINADL=PRINADLEG.PRINADL
  and CLIENTS.FLG_MY
  and not CLIENTS.ARH
  and SOTRUD.EMAIL ~* 'ruslanmr@hotmail\\.com|matukin@hotmail\\.com|knirti@kaluga\\.ru|avk@vniicom\\.vsu\\.ru';

Note that you have to put two slashes before each period in the string, because a period is special in regular expressions.

Herouth
--
Herouth Maoz, Internet developer.
Open University of Israel - Telem project
http://telem.openu.ac.il/~herutma
Re: [GENERAL] CVS Import/Export
At 17:14 +0300 on 18/08/1999, Bruce Tong wrote:

> How do I import/export comma delimited tables?
>
> I thought a combination of pg_dump and psql might do it, but if so I must
> have missed it. I saw a mention of it for pgaccess, but I'm looking for
> something I can put in a shell script.

It has nothing to do with pgaccess. The way to import/export any tables is using either the COPY command in PostgreSQL's SQL dialect, or the \copy command in psql. The difference between them is in where they look for the file to convert to/from. The COPY command is executed by the backend, and looks for a file on the backend's machine. The \copy looks on the client machine that runs the psql.

Since, more often than not, this is the same machine, the best way to remember is that COPY is executed by the backend and therefore the file must be readable to the postgres superuser (or writable for an export), and \copy runs in the client, so it should be readable/writable to the one who runs the psql.

COPY has an option to read the standard input instead of a file, which is how clients like psql are able to implement things like \copy. You can use COPY FROM STDIN in shell scripts. COPY is better than \copy in that it allows you to set a delimiter, which \copy does not - it always expects tabs.

Anyway, this imports data from a file named "stam.txt" into the table "test5" of the database "testing":

psql -c 'COPY test5 FROM stdin' testing < stam.txt

The following exports the same table:

psql -qc 'COPY test5 TO stdout' testing > stam.txt

Herouth
--
Herouth Maoz, Internet developer.
Open University of Israel - Telem project
http://telem.openu.ac.il/~herutma
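Since the original question was about comma-delimited files, here is a sketch of the delimiter option (USING DELIMITERS is the historical spelling; recent releases also accept WITH (DELIMITER ',')). File paths are hypothetical:

    -- Import a comma-delimited file; server-side COPY runs as the backend, so the file
    -- must be readable to the postgres user.
    COPY test5 FROM '/tmp/stam.csv' USING DELIMITERS ',';

    -- Export the same table, comma-delimited.
    COPY test5 TO '/tmp/stam.csv' USING DELIMITERS ',';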
Re: [GENERAL] storing a tree-like structure and selecting pathfrom leaf to root
At 09:44 +0300 on 15/08/1999, Jan Vicherek wrote: > Q1: What is a good way to store this tree in ? (This is somewhat generic > question, so it may be a good FAQ candidate.) I want SELECTs to be fast, > and INSERTs/UPDATEs I don't care. Would making custom datatype help ? How? About a year and a half ago, there was a book recommendation about this issue. The book discusses advanced data structures representation with SQL. Now that PostgreSQL has subqueries and unions, it becomes more relevant. The book was: Joe Celko's SQL for Smarties Advanced SQL Programming The publisher: Morgan Kaufmann Publishers, Inc 340 Pine St 6th Floor San Francisco CA 94104-33205 USA 415-392-2665 [EMAIL PROTECTED] http://www.mkp.com The original poster of this recommendation was Terry Harple, and it was on the (now defunct) QUESTIONS list. Herouth -- Herouth Maoz, Internet developer. Open University of Israel - Telem project http://telem.openu.ac.il/~herutma
Re: Fw: [GENERAL] uppercase of char16
At 14:30 +0300 on 10/08/1999, =?iso-8859-9?Q?Safa_Pilavc=FD?= wrote: > Please help Char16 has little support. Any possibility of changing the definition to char(16)? Herouth -- Herouth Maoz, Internet developer. Open University of Israel - Telem project http://telem.openu.ac.il/~herutma
Re: [GENERAL] Search
At 03:43 +0300 on 01/08/1999, Gilles Darold wrote: > I don't know about windows-1251. Perhaps this can't help you. But if > you have japanese in you database, you can proceed a search on it so > why not windows-1251 ? Because the backend has to know that the lowercase for char NNN in this codepage is MMM. This is different from one locale to the next. If it were windows-1255 (Hebrew), there would be no lowercase at all for any character above 224. So of course it's different than Japanese. There can be two solutions to this problem: 1) Write a function using SPI, install it on the backend, and use it for the comparison. 2) Create the new locale, or at least the LC_CTYPE part of the locale, on the unix you are using. Herouth -- Herouth Maoz, Internet developer. Open University of Israel - Telem project http://telem.openu.ac.il/~herutma
Re: [GENERAL] Large Object questions...
At 08:31 +0300 on 30/07/1999, John Huttley wrote: > I'm busy writing a faxserver application where all the fax page data is > stored as a blob. > > Its just so easy to use... How will you be backing it up? pg_dump never dumped large objects. IMO, if you need a specialized backup script, plus a non-standard interface for writing into them and reading from them, and they are not deleted when you drop the row referring to them, then you may as well use files, and store only the path in Postgres for easy lookup. Herouth -- Herouth Maoz, Internet developer. Open University of Israel - Telem project http://telem.openu.ac.il/~herutma
Re: [GENERAL] int2/int4 Failure with 6.5.1 and SlackWare 4.0
At 19:16 +0300 on 28/07/1999, Charles Tassell wrote:

> Here is the error from int2.out:
>
> QUERY: CREATE TABLE INT2_TBL(f1 int2);
> QUERY: INSERT INTO INT2_TBL(f1) VALUES ('0');
> QUERY: INSERT INTO INT2_TBL(f1) VALUES ('1234');
> QUERY: INSERT INTO INT2_TBL(f1) VALUES ('-1234');
> QUERY: INSERT INTO INT2_TBL(f1) VALUES ('34.5');
> ERROR: pg_atoi: error in "34.5": can't parse ".5"
> QUERY: INSERT INTO INT2_TBL(f1) VALUES ('32767');
> QUERY: INSERT INTO INT2_TBL(f1) VALUES ('-32767');
> QUERY: INSERT INTO INT2_TBL(f1) VALUES ('100000');
> ERROR: pg_atoi: error reading "100000": Math result not representable
> QUERY: INSERT INTO INT2_TBL(f1) VALUES ('asdf');
> ERROR: pg_atoi: error in "asdf": can't parse "asdf"
> QUERY: SELECT '' AS five, INT2_TBL.*;
>
> Any ideas what is causing this? It seems to be a problem with the pg_atoi
> function giving a fatal error on any data that is not formatted exactly
> right, or too large for its return type.

You are looking in the wrong direction. Some of the regression tests deliberately produce errors, and when you run the same test, it is supposed to produce the same errors. In this test, 34.5 is not a valid integer. 100000 is an integer, but outside the 2-byte range (which goes -32768 to 32767). And 'asdf' is simply not a number. So, all these error messages are good and right.

In a regression test, always look at the *diff* files. They tell you what the differences were between the expected output and your test, whether the expected result was a normal one or a deliberate error. My guess is that in your case, you will simply find that the difference results from some different error message. Perhaps your system would say "out of range" instead of "Math result not representable". Check the diff. If this is true, then you have nothing to worry about.

Herouth
--
Herouth Maoz, Internet developer.
Open University of Israel - Telem project
http://telem.openu.ac.il/~herutma
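For reference, a source-tree regression run collects all mismatches into one file in the regression directory (the path below is where it lands in recent source trees):

    # Inspect the collected differences between expected and actual output
    cat src/test/regress/regression.diffs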
Re: [GENERAL] escaping wildcard chars
At 08:16 +0300 on 26/07/1999, Dan Wilson wrote:

> SELECT typname from pg_type WHERE NOT LIKE '_%'
>
> It gives me an empty set.
>
> Is there any way to escape the underscore. I tried to use '\_%', but that
> didn't help. Any suggestions?

Yes. Use '\\_%' instead.

You see, a backslash is interpreted immediately as "take the next char literally". This passes the underscore literally to LIKE, which is the same as '_%'. If you put a double backslash, the first backslash takes the second one literally, thus passing '\_%' to LIKE. And then LIKE knows that it should treat the underscore as non-special.

Ugly, ugly. I think we had a discussion either here or in one of the other lists regarding the ESCAPE clause to LIKE. This behavior means that even if we write ESCAPE '\', it won't work (actually, it should be '\\').

I really hate those backslashes. They are blatantly incompatible with SQL92 and will cause standard SQL to fail on Postgres (and of course, Postgres code to fail on other databases). There should be a setting, either in the backend or in a SET command, such as "BACKSLASH_BEHAVIOR", which will be either "literal" or "escape". It can default to the current behavior (namely "escape") so that current code won't fail, but will enable people to write sane standard code.

Herouth
--
Herouth Maoz, Internet developer.
Open University of Israel - Telem project
http://telem.openu.ac.il/~herutma
Re: [GENERAL] Installation of postgresql-6.5.1 data missing ?
At 14:37 +0300 on 24/07/1999, Jesper K. Pedersen wrote: > I have been using a much older version of postgresql and decided to > "trash" it and go to the 6.5.1beta1 > > All installation of the server/client/devel went fine - but i am unable > to install the default database - postgresql-data-X.X.X as i cant find > any for 6.5.1 > > Anyone have any help that will get me through this ? > I cant find any help in the doc's for this problem - so I am sure its > just that there is some basic "thing" i overlooked. Were you installing from an RPM perhaps? Common PostgreSQL RPMs were somehow separated into three packages, though for the life of me I can't understand why the data package is needed. You should be able to create the default database using initdb - unless they didn't RPM the initdb executable? Herouth -- Herouth Maoz, Internet developer. Open University of Israel - Telem project http://telem.openu.ac.il/~herutma
Re: [GENERAL] Howto convert arrays 2 query results
At 10:13 +0300 on 09/06/1999, Jeroen Schaap wrote:

> Do you know of any way to generally convert arrays into query results?
>
> I know it is better to implement arrays as tables, but that results in
> unreadable tables (with 10 rows with id=1, 15 with id=2, 2 with id=3 ad
> infundum...).
>
> So is there any way to convert an array into a table? Should I
> write a function or a C-function?

It's not entirely clear what you want. The reason to keep arrays together in a separate table is organizational. The way you want to present the arrays should not affect the way they are organized.

If it bothers you that a query returns something like

id   person   child
===  ======   =====
1    Susan    Tom
1    Susan    Merry
1    Susan    Donna
2    George   Ben
2    George   Peggy
3    Morris   Elias

And you want it to show something like:

Person   Children
======   ========
Susan    Tom, Merry, Donna
George   Ben, Peggy
Morris   Elias

What you do is write it this way in the frontend. It depends on your favourite frontend language, but the general algorithm should be something along the lines of:

last_id = 0;
while ( still_more_tuples )
    get_next_tuple;
    if ( tuple.id = last_id )
        print( "," + tuple.child )
    else
        print( newline + tuple.person + tab + tuple.child )
    end if
    last_id = tuple.id;
end while

Herouth
--
Herouth Maoz, Internet developer.
Open University of Israel - Telem project
http://telem.openu.ac.il/~herutma
Re: [GENERAL] ownership of tables. . .
At 01:06 +0300 on 29/05/1999, JT Kirkpatrick wrote: > can i change the owner of tables, indexes, etc in postgres??? from a bash > prompt of course postgres owns everything. but when in psql you type \dt, > i'd like for another person to own the tables. any way to do so?? Log into psql as that person, and create the tables. Herouth -- Herouth Maoz, Internet developer. Open University of Israel - Telem project http://telem.openu.ac.il/~herutma
Re: [GENERAL] pg_database corrupted(?) If so, what do I do????
At 18:03 +0300 on 11/05/1999, Jonny Hinojosa wrote: > No, but I have tried to figure out how to do just that. I have found no way > to effect these updates. So which part of the docs did I sleep through??? > > All help is GREATLY appreciated. I can't test any advice I give on my own system, as it is used for production. But let's see if we can do something. If you tried to update the pg_database with UPDATE pg_database SET datpath=datname WHERE datpath <> datname; and it didn't work, you may try the following: make a copy of the hom and cdmwhere directories in the data directory, for backup. Try to DROP DATABASE on the above databases. See if they have disappeared from the database list (you can use select * from pg_database instead of psql -l if you are already in psql). If they did, remove or rename the hom and cdmwhere directories, and re-create them, using the user. You are now supposed to have two new empty databases with the old names. Remove the newly created directories and rename the old ones back. Check to see if you can connect. It may be advisable to shut down the postmaster when you are changing things in the data directory, and restart it for the next psql session. Now, supposed the DROP DATABASE didn't work, and complains that it doesn't find the directories. Then, if it were me, I would create copies of the original hom and cdmwhere directories, but give them the names that appear in the pg_database table (check to see if there are extra spaces there, though). You can give names that contain spaces in unix, it's no problem. And then I'd try the drop again. I hope any of these suggestions helps. Just make sure you have a backup copy of the directories somewhere safe. Herouth -- Herouth Maoz, Internet developer. Open University of Israel - Telem project http://telem.openu.ac.il/~herutma
RE: [GENERAL] pg_database corrupted(?) If so, what do I do????
At 22:26 +0300 on 10/05/1999, Jonny Hinojosa wrote: > The last 2 entries have been corrupted. How do I (can I) correct these > entries ?? Have you tried logging into psql (template1) as postgres and updating the pg_database table? Herouth -- Herouth Maoz, Internet developer. Open University of Israel - Telem project http://telem.openu.ac.il/~herutma
Re: [GENERAL] Restore from dump file: parse error
At 10:37 +0300 on 10/05/1999, darold wrote: > Hi, > > Well to quote that single quote, you have to quote that single quote :-) > ex : s_field='suivre l'actualite' > => s_field='suivre l''actualite' > > pg_dump -D mydatabase > recover.sql or other options will never solve > this pg_dump feature. > > You can make a little program with perl to process your entire file by > search single quote not after an = and not before a , and space and ) and (. > > Regards, > > Gilles Darold Just a minute. I tried dumping and restoring fields with and without single quotes in them in 6.4.2, and there was no problem, and I didn't need to change anything. Single quotes should not be a problem for COPY, because they are not considered delimiters. The strings are not surrounded with quotes. Thus, the strings you see in the COPY commands are the actual strings that you have in the database, except for newlines and tabs that have a backslash before them. So, the question is what went wrong with the dump. Has it dumped correctly, that is, with a single, unescaped quote where a single quote is in the fields? In that case, the 6.3.2 psql is the culprit, and since your issue is with upgrading to 6.4.2, you need not worry. If, on the other hand, the dump makes the copy with single quotes doubled or backslashed, you will have to use some sed or perl script to remove that, because they will not be interpreted correctly. This is the 6.3.2 dump's fault. Herouth -- Herouth Maoz, Internet developer. Open University of Israel - Telem project http://telem.openu.ac.il/~herutma
Re: [GENERAL] Any ideas why this doesn't work or how to rewriteit?
At 21:04 +0300 on 29/04/1999, Brett W. McCoy wrote:

> I think, Aaron, you could get a count of distinct customer names like this:
>
> SELECT DISTINCT customer_username, COUNT(*) FROM customerdata
> GROUP BY customer_username;
>
> This will give you 2 columns, one with the distinct customer_usernames
> and the second with the count of each. The GROUP BY clause is important
> here. This looks like what you wanted in your original query.

No, Brett. COUNT( DISTINCT ...) is supposed to count the number of distinct names in a table. Here, I created a test table:

testing=> select * from test;
customer
--------
moshe
david
hanna
david
sarah
moshe
suzanne
moshe
moshe
(9 rows)

The distinct names are:

testing=> select distinct customer
testing-> from test;
customer
--------
david
hanna
moshe
sarah
suzanne
(5 rows)

So clearly, the datum he wanted was "5" - there are five distinct customers here. Your query, however, gives the following:

testing=> select distinct customer, count(*)
testing-> from test
testing-> group by customer;
customer|count
--------+-----
david   |    2
hanna   |    1
moshe   |    4
sarah   |    1
suzanne |    1
(5 rows)

Which shows him the number of REPETITIONS of each distinct name. My ugly query gives:

testing=> select count(*)
testing-> from test t1
testing-> where int( oid ) = (
testing->     SELECT min( int( t2.oid ) )
testing->     FROM test t2
testing->     WHERE t2.customer = t1.customer
testing-> );
count
-----
    5
(1 row)

And this is the exact number of distinct names in the table.

Herouth
--
Herouth Maoz, Internet developer.
Open University of Israel - Telem project
http://telem.openu.ac.il/~herutma
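For reference, on a release where COUNT(DISTINCT ...) is implemented, the same number comes back directly from the test table above:

    SELECT count(DISTINCT customer) FROM test;   -- returns 5 for the data shown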
Re: [GENERAL] COPY with default values won't work?
At 07:15 +0200 on 24/03/1999, Charles Tassell wrote:

> I'm trying to copy data into the following table:
>
> CREATE SEQUENCE seq_account_type_ndx;
>
> CREATE TABLE accounts (
>    Account_Type_NDX   int4 not null default nextval('seq_account_type_ndx'),
>    Account_Name       Text
> );
>
> Using this as a datafile:
> \N|Box
> \N|NetSurfer120
> \N|eMailer
> \N|eMailerLite
>
> I've tried writing the code in C using libpq, using the copy command as the
> postgres super user, or using \copy as my normal user. NONE will work with
> the "not null" in there, and if I remove it, it just inserts a null value
> into account_type_ndx, without using the default. I've also tried
> switching the default to a number (ie default 12) instead of the nextval of
> the sequence, with no better luck.
>
> Here is the copy command I tend to use:
> COPY accounts from stdin USING delimiters '|'
> or \COPY accounts from '/tmp/datafile.txt' USING delimiters '|'
>
> Any ideas?

I thought the above would work, too, but apparently it doesn't. So, two possible solutions:

A) Update with the sequence after you have copied.

1) Create the table without the NOT NULL.
2) Make the copy.
3) Use
   UPDATE accounts
   SET Account_Type_NDX = nextval( 'seq_account_type_ndx' );
4) Vacuum.

B) Copy into a separate table and insert.

1) Create the table, including the NOT NULL and everything.
2) Create a temporary table, with all the same fields, without NOT NULL.
3) Copy into the temporary table.
4) Use:
   INSERT INTO accounts ( Account_Name )
   SELECT Account_Name FROM temp_accounts;
5) Drop the temp_accounts table.

Variation: Create the temp_accounts table without the Account_Type_NDX field. It's null anyway. Have your copy files without the "\N|" part. Saves the transfer of three bytes per row and the insertion of a null value per row. Makes things a wee bit faster.

My personal favourite is plan (B), because it allows building the table with the "NOT NULL" constraint, and does not require you to remember the name of the sequence.

The general principle here is:

1) Look at your table and decide which fields should be inserted from an external data source, and which from an internal data source (these are usually the fields that have a default value).
2) Create a temporary table that contains only the fields that need to be fed externally.
3) Copy your data into that table. The copy files need not have any NULL value unless it truly stands for "no value here".
4) Insert into your real table using a SELECT statement. The INSERT clause should include only the names of "external source" fields. This will cause the internal ones to be filled from the default source.

This method allows also the use of functions and stuff when populating the table.

Herouth
--
Herouth Maoz, Internet developer.
Open University of Israel - Telem project
http://telem.openu.ac.il/~herutma
Re: [GENERAL] fork() bad
At 17:48 +0200 on 22/03/1999, Richi Plana wrote: > As some of you may know, I'm hacking Ascend RADIUS 2.01 to look up a > PostgreSQL database for authentication and log to PG for accounting. > Normally, RADIUS fork()s once for Accounting and fork()s for each > Authentication request. That's a lot of fork()ing and establishing > connections to the backend. It's slow, but it's better than junking > whatever code I've written so far. > > If anyone can give a better suggestion, I'm all ears. Also, if anyone > wants the code when it's done, try asking. ;^) Why don't you try to synchronize access to the connection between the various processes? You know, lock it in an exclusive lock, on an inter-process basis, such that when one process accesses it, the others have to wait. Or you can have a few connections open, so that the bottleneck is wider. You know, like you would treat any shared object in an inter-process environment? Herouth -- Herouth Maoz, Internet developer. Open University of Israel - Telem project http://telem.openu.ac.il/~herutma
Re: [GENERAL] The value returned by autoinc ?
At 11:59 +0200 on 15/3/99, Silvio Emanuel Barbosa de Macedo wrote:

> If there is an insert between my INSERT and SELECT, won't the counter be
> increased ? The only way I can understand this is the transaction locks
> inserts... so, in fact there could not exist another insert...

The counter is increased - but you get the last value *you* used, not the
other. It's more or less like this:

   counter = 5;   Process1 hidden variable = null;   Process2 hidden var = null;

-- Process1 inserts a tuple, calling nextval.

   counter = 6;   Process1 hidden variable = 6;      Process2 hidden var = null;

-- Second process inserts a tuple.

   counter = 7;   Process1 hidden variable = 6;      Process2 hidden var = 7;

-- Process1 now wants to know which number it entered, calling currval.
-- Currval takes the value in the hidden variable. It's 6.

-- Now suppose process1 makes another insertion. Then:

   counter = 8;   Process1 hidden variable = 8;      Process2 hidden var = 7;

Do you understand? Whenever you make a call to nextval, your process gets
the value thus retrieved and keeps it. The two operations are done
atomically (incrementing and recording what value was taken), so it doesn't
actually matter when you make the call to currval - the correct value is
already available to you.

Herouth

--
Herouth Maoz, Internet developer.
Open University of Israel - Telem project
http://telem.openu.ac.il/~herutma
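The same scenario expressed as SQL, with comments marking which session runs
each statement; the table and sequence names here are made up for the
illustration:

    -- Session 1:
    INSERT INTO t ( id ) VALUES ( nextval('seq') );   -- counter becomes 6
    -- Session 2:
    INSERT INTO t ( id ) VALUES ( nextval('seq') );   -- counter becomes 7
    -- Session 1:
    SELECT currval('seq');   -- returns 6, not 7: currval is per-session
    -- Session 2:
    SELECT currval('seq');   -- returns 7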
Re: [GENERAL] The value returned by autoinc ?
At 2:57 +0200 on 15/3/99, Silvio Emanuel Barbosa de Macedo wrote:

> When I insert data into a table with a sequence associated to a column
> (and the required trigger), how can I know the value the sequence has
> just generated ? (think in parallel accesses)
>
> Would this approach be the answer ?
>
>   begin work
>   insert...
>   select max...
>   commit

No, this approach is a waste of precious time... The correct approach is:

   INSERT...
   SELECT currval( 'seq_name' );

currval gives you the last value the sequence has given to the current
session. That is, it won't work if you use it before the insertion (because
the sequence didn't give you a number yet). It will also give you the
correct number even if, between the INSERT and the SELECT, another process
or another connection also made an insert.

Herouth

--
Herouth Maoz, Internet developer.
Open University of Israel - Telem project
http://telem.openu.ac.il/~herutma
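A self-contained sketch of that approach; the table and column names are
invented for the example, and the sequence is attached through a column
default rather than the trigger mentioned in the question:

    CREATE SEQUENCE seq_name;

    CREATE TABLE items (
        id   int4 DEFAULT nextval('seq_name'),
        name text
    );

    INSERT INTO items ( name ) VALUES ( 'first item' );

    -- The id just assigned in THIS session, regardless of what other
    -- connections have inserted in the meantime.
    SELECT currval( 'seq_name' );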
Re: [GENERAL] daily check for expired data ?
At 22:48 +0200 on 11/3/99, Ralf Weidemann wrote:

> how could I do an automatic daily check
> to delete some expired data ? I mean
> can I have a cron functionality in
> postgresql ?

You don't need to have cron functionality in postgresql when you have cron
functionality in cron. :)

What you have to do is make a simple script, more or less like this:

   #!/usr/bin/sh
   PGHOST=...
   PGPORT=...
   PGUSER=...
   export PGHOST PGPORT PGUSER    # psql picks these up from the environment
   psql my_database
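The SQL such a script would feed to psql might look like this; the table and
column names are placeholders, since the original question doesn't describe
the expired-data table:

    -- Placeholder names: adapt "sessions" and "expiry_date" to the real schema.
    DELETE FROM sessions WHERE expiry_date < CURRENT_DATE;
    VACUUM sessions;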
Re: [GENERAL] slow inserts and updates on large tables
At 16:47 +0200 on 17/2/99, Jim Mercer wrote:

> i will test this with my insertama program, but i see some problems with
> this.
>
> firstly, it assumes that all of your application programs are updated each
> time you modify the structure of the table.

This is true. That's the sacrifice you make for COPY's fast transfers.

> also, it doesn't seem to address the issue of updates, which suffer
> from worse performance than inserts.

Did you try my trick, but without removing the indices? Move the data over
to a temporary table, delete from the original, and insert the updated data
back? (Assuming you don't have a separate update for each line.)

Herouth

--
Herouth Maoz, Internet developer.
Open University of Israel - Telem project
http://telem.openu.ac.il/~herutma
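A sketch of that bulk-update trick; the table name, columns and the "update"
being applied (negating amount) are invented for the example - the point is
one bulk delete plus one bulk insert instead of many row-by-row UPDATEs:

    -- Stage the rows that need changing.
    SELECT id, amount INTO TABLE tmp_rows
    FROM big_table
    WHERE amount < 0;

    BEGIN;
    DELETE FROM big_table WHERE amount < 0;
    INSERT INTO big_table
        SELECT id, -amount FROM tmp_rows;   -- re-insert with the change applied
    COMMIT;

    DROP TABLE tmp_rows;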