Re: [GENERAL] Alternative to Multi-Master Replication with 2 Data centers??
On Sat, 2014-03-29 at 12:50 -0700, John R Pierce wrote:
> On 3/29/2014 11:35 AM, ethode wrote:
>> Currently we're considering several options, of which Magic replication appears to be the top option.
>
> fixed that for you, hope that helps!
>
> kidding aside, there's no magic bullet here that does what you describe without introducing its own complex set of problems and performance issues, not the least of which is transaction conflict resolution.

Check out xDB Replication Server v5.0 Multi-Master:
http://www.enterprisedb.com/products-services-training/products-overview/xdb-replication-server-multi-master
Maybe it fits your requirements.

--
john r pierce  37N 122W  somewhere on the middle of the left coast

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [ADMIN][GENERAL] openvz and shared memory trouble
On Sat, Mar 29, 2014 at 6:17 PM, Adrian Klaver <adrian.kla...@aklaver.com> wrote:
> On 03/29/2014 08:19 AM, Willy-Bas Loos wrote:
>> The error that shows up is a Bus error. That's on the replication slave. Here's the log about it:
>>
>> 2014-03-29 12:41:33 CET db: ip: us: FATAL: could not receive data from WAL stream: server closed the connection unexpectedly
>> This probably means the server terminated abnormally before or while processing the request.
>> cp: cannot stat `/data/postgresql/9.1/main/wal_archive/00010072000A': No such file or directory
>> 2014-03-29 12:41:33 CET db: ip: us: LOG: unexpected pageaddr 71/E9DA in log file 114, segment 10, offset 14286848
>> cp: cannot stat `/data/postgresql/9.1/main/wal_archive/00010072000A': No such file or directory
>> 2014-03-29 12:41:33 CET db: ip: us: LOG: streaming replication successfully connected to primary
>> 2014-03-29 12:41:48 CET db: ip: us: LOG: startup process (PID 17452) was terminated by signal 7: Bus error
>> 2014-03-29 12:41:48 CET db: ip: us: LOG: terminating any other active server processes
>> 2014-03-29 12:41:48 CET db:wbloos ip:[local] us:wbloos WARNING: terminating connection because of crash of another server process
>> 2014-03-29 12:41:48 CET db:wbloos ip:[local] us:wbloos DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
>> 2014-03-29 12:41:48 CET db:wbloos ip:[local] us:wbloos HINT: In a moment you should be able to reconnect to the database and repeat your command.
>
> Well, what I am seeing are WAL log errors. One saying no file is present, the other pointing at possible file corruption.

Those are normal notices, nothing to worry about.

> Shared memory problems are offered as a possible cause only. Right now I would say we are seeing only half the picture.
> The Postgres logs from the same time period for the primary server, as well as the system logs for the openvz container, would help fill in the other half of the picture.

Here's the log from the primary postgres server:

2014-03-29 12:41:29 CET db:wbloos ip:[local] us:wbloos NOTICE: ALTER TABLE will create implicit sequence test_x_seq for serial column test.x
2014-03-29 12:41:33 CET db:[unknown] ip:xxx.xxx.xxx.xxx us:replication LOG: SSL renegotiation failure
2014-03-29 12:41:33 CET db:[unknown] ip:xxx.xxx.xxx.xxx us:replication LOG: SSL error: unexpected record
2014-03-29 12:41:33 CET db:[unknown] ip:xxx.xxx.xxx.xxx us:replication LOG: could not send data to client: Connection reset by peer
2014-03-29 12:41:48 CET db:[unknown] ip:xxx.xxx.xxx.xxx us:replication LOG: could not receive data from client: Connection reset by peer
2014-03-29 12:41:48 CET db:[unknown] ip:xxx.xxx.xxx.xxx us:replication LOG: unexpected EOF on standby connection

(the SSL renegotiation failure happens all the time, without the crash)

And here's the syslog from the container:

Mar 29 12:41:01 mycontainer snmpd[8819]: Connection from UDP: [xxx.xxx.xxx.xxx]:59090-[xxx.xxx.xxx.xxx]
Mar 29 12:42:30 mycontainer snmpd[8819]: Connection from UDP: [xxx.xxx.xxx.xxx]:35949-[xxx.xxx.xxx.xxx]

The log on the host doesn't say anything interesting either.

> A cursory look at memory management in openvz shows it is different from other virtualization software and physical machines. Whether that is a problem would seem to be dependent on where you are on the learning curve :)

That sounds like there is a solution to the problem, and all you have to do is find out what it is. But there doesn't seem to be a variable in the beancounters or anywhere else that can prevent the bus error from happening. There seems to be no separate way of guaranteeing shared memory. There's no OOM killer active either, nor is the host or server running short of memory.
I'm still worried that it's like Tom Lane said in another discussion:

> So basically, you've got a broken kernel here: it claimed to give PG circa (135MB) of memory, but what's actually there is only about (128MB). I don't see any connection between those numbers and the shmmax/shmall settings, either --- so I think this must be some busted implementation of a VM-level limitation.

(here: http://www.postgresql.org/message-id/CAK3UJREBcyVBtr8D7vMfU=uddkjxkrpngcuy8eryb0tmfke...@mail.gmail.com )

And it makes me wonder what other issues may arise from that. But especially, what I can do about it.

Cheers,
WBL

--
"Quality comes from focus and clarity of purpose" -- Mark Shuttleworth
[GENERAL] PSQL log file
Hi,

Is there any option in psql version 9.2.4 to include the date and time in the psql log files (in /var/log/postgresql)?
Re: [GENERAL] PSQL log file
> Is there any option in psql version 9.2.4, to insert the date time also inside the psql log files (in path /var/log/postgresql)?

If you look in postgresql.conf for your installation, you can change the log prefix by setting log_line_prefix. If you add %t to that string, you'll get a date + time value with every entry. The postgresql.conf file should also list all of the honored prefixes, or you can use this URL for more guidance:

http://www.postgresql.org/docs/9.2/static/runtime-config-logging.html

--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd | Suite 400 | Chicago IL, 60604
312-676-8870
stho...@optionshouse.com
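For instance, a minimal setting along these lines (the exact prefix is your choice; %t is the timestamp, %p the backend PID):

```
# postgresql.conf -- prefix every server log line with a timestamp and PID
log_line_prefix = '%t [%p] '
```

After changing it, reload the server configuration (e.g. with pg_ctl reload or SELECT pg_reload_conf()) for the new prefix to take effect.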
Re: [ADMIN][GENERAL] openvz and shared memory trouble
On 03/31/2014 04:12 AM, Willy-Bas Loos wrote:
> On Sat, Mar 29, 2014 at 6:17 PM, Adrian Klaver <adrian.kla...@aklaver.com> wrote:
>> On 03/29/2014 08:19 AM, Willy-Bas Loos wrote:
>>> The error that shows up is a Bus error. That's on the replication slave. Here's the log about it:
>>> [slave log snipped; quoted in full in the previous message -- it ends with: LOG: startup process (PID 17452) was terminated by signal 7: Bus error]
>>
>> Well, what I am seeing are WAL log errors. One saying no file is present, the other pointing at possible file corruption.
>
> Those are normal notices, nothing to worry about.

Well, other than that they cause the standby to reconnect to the primary, during which a crash occurs.

>> Shared memory problems are offered as a possible cause only.
>> Right now I would say we are seeing only half the picture. The Postgres logs from the same time period for the primary server, as well as the system logs for the openvz container, would help fill in the other half of the picture.
>
> Here's the log from the primary postgres server:
> [primary log snipped; quoted in full in the previous message: SSL renegotiation failure, SSL error: unexpected record, Connection reset by peer, unexpected EOF on standby connection]
> (the SSL renegotiation failure happens all the time, without the crash)
>
> And here's the syslog from the container:
> [snmpd connection lines snipped]
>
> The log on the host doesn't say anything interesting either.
>
>> A cursory look at memory management in openvz shows it is different from other virtualization software and physical machines. Whether that is a problem would seem to be dependent on where you are on the learning curve :)
>
> That sounds like there is a solution to the problem, all you have to do is find out what it is. There doesn't seem to be a variable in the beancounters or anywhere else that can prevent the bus error from happening. There seems to be no separate way of guaranteeing shared memory. There's no OOM killer active either, nor is the host or server running short of memory.
At this point I am not sure it is even obvious what is causing the error, so finding a solution would be a hit-or-miss affair at best.

> I'm still worried that it's like Tom Lane said in another discussion: "So basically, you've got a broken kernel here: it claimed to give PG circa (135MB) of memory, but what's actually there is only about (128MB). I don't see any connection between those numbers and the shmmax/shmall settings, either --- so I think this must be some busted implementation of a VM-level limitation." (here:
Re: [ADMIN][GENERAL] openvz and shared memory trouble
On 03/31/2014 04:12 AM, Willy-Bas Loos wrote:
> And it makes me wonder what other issues may arise from that. But especially, what I can do about it.

Had to go through it a couple of times, and look up the acronyms, but the thread below seems to deal with the whole shared memory allocation and setting process in openvz quite well:

http://forum.openvz.org/index.php?t=msggoto=12061srch=postgresql+shared#msg_12061

In particular (where UBC is User Beancounters, and the sysctls are the ones in the container):

"UBC shmpages parameter controls all the shared memory which is possible to allocate either via IPC or shmem (e.g. tmpfs). sysctls above are only for IPC SYSv5 shared memory."

Combine that with (where VE is Virtual Environment, i.e. the container):

"check the same in VE please. AFAICS, by default kernel sets: #define SHMMAX 0x2000000 which is 32Mb. So you have to increase it in VE /etc/sysctl.conf file or in /proc."

and there may be a path to a solution.

--
Adrian Klaver
adrian.kla...@aklaver.com
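For reference, the adjustment inside the container would look something like this (the values are illustrative only -- size them to your shared_buffers; and per the quote above, the UBC shmpages limit must allow the allocation as well):

```
# /etc/sysctl.conf inside the VE -- raise the SysV shared memory limits
kernel.shmmax = 268435456    # largest single segment, in bytes (256 MB here)
kernel.shmall = 2097152      # total shared memory, in pages
```

Apply without a reboot with `sysctl -p`, then restart PostgreSQL so it re-requests its shared memory segment.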
Re: [ADMIN][GENERAL] openvz and shared memory trouble
Adrian Klaver <adrian.kla...@aklaver.com> writes:
> On 03/31/2014 04:12 AM, Willy-Bas Loos wrote:
>> I'm still worried that it's like Tom Lane said in another discussion: "So basically, you've got a broken kernel here: it claimed to give PG circa (135MB) of memory, but what's actually there is only about (128MB). I don't see any connection between those numbers and the shmmax/shmall settings, either --- so I think this must be some busted implementation of a VM-level limitation." (here: http://www.postgresql.org/message-id/CAK3UJREBcyVBtr8D7vMfU=uddkjxkrpngcuy8eryb0tmfke...@mail.gmail.com)
>> And it makes me wonder what else may be issues that arise from that. But especially, what I can do about it.

FWIW, I went back and re-read that message while perusing this thread, and this time it struck me that there was a significant bit of evidence I'd overlooked: namely, that the buffer block array is by no means the last thing in Postgres' shared memory segment. There are a bunch of other shared data structures allocated after it, some of which almost certainly had to have been touched by the startup subprocess. The gdb output makes it clear that the kernel stopped providing memory at 0xb6c4b000; but either it resumed doing so further on, or the whole shared memory segment *had* been provisioned originally, and then part of it got unmapped again while the startup process was running.

So it's still clearly a kernel bug, but it seems less likely that it is triggered by some static limit on shared memory size. Perhaps instead, the kernel had been filling in pages for the shared segment on-demand, and then when it got to some limit it refused to do so anymore and allowed a SIGBUS to happen instead.
> I do not use openvz so I do not have a test bed to try out, but this page seems to be related to your problem: http://openvz.org/Resource_shortage or, if you want more detail and a link to what looks to be a replacement for beancounters: http://openvz.org/Setting_UBC_parameters

If this software's idea of resource management is to allow SIGBUS to happen upon attempting to use memory that had been successfully granted, then it's a piece of junk that you should get rid of ASAP. (No, I don't like Linux's OOM-kill solution to resource overcommit either.)

regards, tom lane
[GENERAL] char array overhead
I'm angling toward using a very wide char(1) array. Is the one-byte overhead for char(n < 127) applied to each element or to the array?
Re: [ADMIN][GENERAL] openvz and shared memory trouble
On 03/31/2014 08:01 AM, Tom Lane wrote:
> Adrian Klaver <adrian.kla...@aklaver.com> writes:
>> On 03/31/2014 04:12 AM, Willy-Bas Loos wrote:
>>> I'm still worried that it's like Tom Lane said in another discussion: [quote snipped] And it makes me wonder what else may be issues that arise from that. But especially, what I can do about it.
>
> FWIW, I went back and re-read that message while perusing this thread, and this time it struck me that there was a significant bit of evidence I'd overlooked: namely, that the buffer block array is by no means the last thing in Postgres' shared memory segment. There are a bunch of other shared data structures allocated after it, some of which almost certainly had to have been touched by the startup subprocess. The gdb output makes it clear that the kernel stopped providing memory at 0xb6c4b000; but either it resumed doing so further on, or the whole shared memory segment *had* been provisioned originally, and then part of it got unmapped again while the startup process was running. So it's still clearly a kernel bug, but it seems less likely that it is triggered by some static limit on shared memory size. Perhaps instead, the kernel had been filling in pages for the shared segment on-demand, and then when it got to some limit it refused to do so anymore and allowed a SIGBUS to happen instead.
>> I do not use openvz so I do not have a test bed to try out, but this page seems to be related to your problem: http://openvz.org/Resource_shortage or, if you want more detail and a link to what looks to be a replacement for beancounters: http://openvz.org/Setting_UBC_parameters
>
> If this software's idea of resource management is to allow SIGBUS to happen upon attempting to use memory that had been successfully granted, then it's a piece of junk that you should get rid of ASAP. (No, I don't like Linux's OOM-kill solution to resource overcommit either.)
>
> regards, tom lane

At this point the memory allocation as a problem is as much conjecture as anything else, at least to me. So what is causing SIGBUS is an open question in my mind.

--
Adrian Klaver
adrian.kla...@aklaver.com
Re: [ADMIN][GENERAL] openvz and shared memory trouble
Adrian Klaver <adrian.kla...@aklaver.com> writes:
> At this point the memory allocation as a problem is as much conjecture as anything else, at least to me. So what is causing SIGBUS is an open question in my mind.

Agreed, it's unproven what's causing the SIGBUS in the case at hand. However, in the case I investigated for Josh Kupershmidt, it was provably the fault of the kernel for failing to supply memory to back the shared memory segment that it had previously agreed to allocate. The fact that both users are trying to run their standby server under OpenVZ is, shall we say, merely suggestive.

regards, tom lane
Re: [GENERAL] char array overhead
On Mar 31, 2014, at 8:08 AM, Rob Sargent <robjsarg...@gmail.com> wrote:
> I'm angling toward using a very wide char(1) array. Is the one-byte overhead for char(n < 127) applied to each element or to the array?

Each element; it's a variable-length type. There's probably a better way of storing your data, but if you end up really needing a one-byte-long character type, there is "char" (with the quotes).

Cheers,
Steve
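If you want to measure the difference yourself, a sketch like this (untested here) makes the per-element cost visible -- char(1) elements each carry a varlena header, while "char" elements are a single byte:

```sql
-- Compare on-disk datum sizes for the same four characters.
SELECT pg_column_size('{a,b,c,d}'::char(1)[])  AS char1_array,
       pg_column_size('{a,b,c,d}'::"char"[])   AS quote_char_array,
       pg_column_size('abcd'::text)            AS plain_text;
```

The array types also pay a fixed header (plus element type info), so for very wide rows a single text or bytea value can be cheaper still.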
Re: [GENERAL] Alternative to Multi-Master Replication with 2 Data centers??
> We are load balancing 2 data centers.

Chapter 8 of "Scalable Internet Architectures" has a good discussion of running master-master setups in separate data centers. I'd read that whole chapter for some of the challenges you'll face.

> If DC1 goes down our LB is failing over to DC2.

This sounds like it will bring down both databases. In general, using the same machine for both load balancing and failover means that in practice you have no failover, because if one box goes down, doubling the traffic will overwhelm the other one. If you want high availability you should have a separate warm standby in each data center, for four machines total. Otherwise you're just spending lots of time and money for the appearance of failover but not the reality. Or at least test it and make sure one failure won't cascade to the whole system.

Good luck!
Paul

On Sat, Mar 29, 2014 at 11:35 AM, ethode <jos...@ethode.com> wrote:
> We are load balancing 2 data centers. Our current approach was using a software layer in our CMS to send data between data centers, but write/update frequency made this approach difficult and bug-laden. Currently we're considering several options, of which multi-master replication appears to be the top option. BOTH data centers need to be writable, otherwise we could use master/slave. If DC1 goes down our LB is failing over to DC2. The failure causing failover could be DB related OR web server related. It doesn't appear to be realistic to keep both DCs updated on inserts and/or updates without using multi-master or some other 3rd-party software that appears to do the same thing as multi-master. Any other solutions I should be considering?
>
> --
> View this message in context: http://postgresql.1045698.n5.nabble.com/Alternative-to-Multi-Master-Replication-with-2-Data-centers-tp5797886.html
> Sent from the PostgreSQL - general mailing list archive at Nabble.com.
--
Pulchritudo splendor veritatis.
Re: [GENERAL] char array overhead
On 03/31/2014 09:48 AM, Steve Atkins wrote:
> On Mar 31, 2014, at 8:08 AM, Rob Sargent <robjsarg...@gmail.com> wrote:
>> I'm angling toward using a very wide char(1) array. Is the one-byte overhead for char(n < 127) applied to each element or to the array?
>
> Each element, it's a variable length type. There's probably a better way of storing your data, but if you end up really needing a one-byte long character type, there is "char" (with the quotes).

Thank you! First for the confirmation on the overhead (how disappointing) and secondly for "char" -- that had not crossed my radar. Just so you know, I'm toying with a rather large number of small-valued datapoints per sample. I'm trying text and smallint as well.

Thanks again,
rjs
Re: [GENERAL] Complex query
> -----Original Message-----
> From: pgsql-general-ow...@postgresql.org [mailto:pgsql-general-ow...@postgresql.org] On Behalf Of Leonardo M. Ramé
> Sent: Monday, March 31, 2014 2:38 PM
> To: PostgreSql-general
> Subject: [GENERAL] Complex query
>
> Hi, I'm looking for help with this query. Table Tasks:
>
> IdTask  StatusCode  StatusName
> ------------------------------
> 1       R           Registered
> 1       S           Started
> 1       D           Dictated
> 1       F           Finished
> 1       T           Transcribed
> ------------------------------
> 2       R           Registered
> 2       S           Started
> 2       T           Transcribed
> 2       F           Finished
>
> As you can see, I have a table containing tasks and statuses. What I would like to get is the list of tasks, including all of its steps, for only those tasks where the StatusCode sequence was S followed by T. In this example, the query should only return task Nº 2. Can anybody help me with this?
>
> Regards,
> --
> Leonardo M. Ramé
> Medical IT - Griensu S.A.

Leonardo,

Unless you add one more column to your Tasks table, specifically StatusTimestamp, as in:

IdTask  StatusCode  StatusName  StatusTimestamp

you cannot find which record in the table follows which, because the order in which records are returned from the database is not guaranteed unless you add an ORDER BY clause to your SELECT statement.

Regards,
Igor Neyman
Re: [GENERAL] Complex query
On 31/03/2014 19:38, Leonardo M. Ramé wrote:
> Hi, I'm looking for help with this query. Table Tasks:
>
> IdTask  StatusCode  StatusName
> ------------------------------
> 1       R           Registered
> 1       S           Started
> 1       D           Dictated
> 1       F           Finished
> 1       T           Transcribed
> ------------------------------
> 2       R           Registered
> 2       S           Started
> 2       T           Transcribed
> 2       F           Finished
>
> As you can see, I have a table containing tasks and statuses. What I would like to get is the list of tasks, including all of its steps, for only those tasks where the StatusCode sequence was S followed by T.

How do you know the sequence in which the statuses occurred? Is there another column with a timestamp or something?

Ray.

--
Raymond O'Donnell :: Galway :: Ireland
r...@iol.ie
[GENERAL] Wanted: ALTER TRIGGER ... OWNED BY EXTENSION
An extension I'm working on (http://pgxn.org/dist/cyanaudit) dynamically creates trigger functions and installs them on tables in public. The triggers are automatically created when one of the extension's config tables is populated. Even though I mark the trigger *functions* as owned by my extension, the trigger definitions themselves are still dumped by pg_dump and restored by pg_restore. This is a problem when pg_restore is using parallelism (-j), and one thread starts issuing CREATE TRIGGER commands before the other thread has finished populating the config table. What ends up happening is pg_restore throws a whole lot of errors saying that the function referenced by the CREATE TRIGGER command does not exist. However, the function and trigger are actually created later on, once the config table is populated.

Using pg_restore without -j is a workaround, but I'd really like to be able to mark my extension's triggers as owned by the extension, so that these errors will not show up when restoring. Is there a better workaround that I'm not aware of?

Thanks.

Moshe Jacobson
Manager of Systems Engineering, Nead Werx Inc. <http://www.neadwerx.com>
2323 Cumberland Parkway · Suite 201 · Atlanta, GA 30339

"Quality is not an act, it is a habit." -- Aristotle
Re: [GENERAL] Complex query
Leonardo M. Ramé-2 wrote:
> Hi, I'm looking for help with this query. Table Tasks:
>
> IdTask  StatusCode  StatusName
> ------------------------------
> 1       R           Registered
> 1       S           Started
> 1       D           Dictated
> 1       F           Finished
> 1       T           Transcribed
> ------------------------------
> 2       R           Registered
> 2       S           Started
> 2       T           Transcribed
> 2       F           Finished
>
> As you can see, I have a table containing tasks and statuses. What I would like to get is the list of tasks, including all of its steps, for only those tasks where the StatusCode sequence was S followed by T. In this example, the query should only return task Nº 2. Can anybody help me with this?

First you need to decide how to tell the database that R-S-T-F is ordered, and then maybe you can use window functions, specifically lag(col, -1) OVER (...), to determine what the next row's code is and act accordingly. Put that into a sub-query and return the IdTask to the outer query's WHERE clause.

David J.
Re: [GENERAL] Complex query
On 2014-03-31 18:48:58 +0000, Igor Neyman wrote:
> Unless you add one more column to your Tasks table, specifically StatusTimestamp, you cannot find which record in the table follows which, because the order in which records are returned from the database is not guaranteed unless you add an ORDER BY clause to your SELECT statement.

You are right, let's add the Id column. This is just an example; the real table (a view) contains both the Id and a timestamp:

Id  IdTask  StatusCode  StatusName
----------------------------------
1   1       R           Registered
2   1       S           Started
3   1       D           Dictated
4   1       F           Finished
5   1       T           Transcribed
----------------------------------
6   2       R           Registered
7   2       S           Started
8   2       T           Transcribed
9   2       F           Finished

After adding the Id column, can I use a window function to get what I need?

Regards,
--
Leonardo M. Ramé
Medical IT - Griensu S.A.
Av. Colón 636 - Piso 8 Of. A
X5000EPT -- Córdoba
Tel.: +54(351)4246924 +54(351)4247788 +54(351)4247979 int. 19
Cel.: +54 9 (011) 40871877
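With the Id column providing the ordering, a sketch along the lines David J. suggested should work (untested; it uses lead(), which is equivalent to lag(col, -1)):

```sql
-- Return every row of the tasks whose status sequence
-- has an S immediately followed by a T.
SELECT t.*
FROM Tasks t
WHERE t.IdTask IN (
    SELECT s.IdTask
    FROM (SELECT IdTask,
                 StatusCode,
                 lead(StatusCode) OVER (PARTITION BY IdTask
                                        ORDER BY Id) AS next_code
          FROM Tasks) AS s
    WHERE s.StatusCode = 'S'
      AND s.next_code  = 'T')
ORDER BY t.Id;
```

Against the sample data this would return only task 2's rows, since task 1 has S followed by D rather than T.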
[GENERAL] getting the current query from pg_stat_activity
Hello,

I have two different postgresql servers running slightly different versions. On one of them, if I try to use pg_stat_activity to get the current queries, I get:

$ psql
psql (9.0.13)
Type "help" for help.

postgres=# select procpid, query_start, waiting, current_query from pg_stat_activity;
 procpid |          query_start          | waiting | current_query
---------+-------------------------------+---------+---------------
     673 | 2014-03-31 11:45:45.38988-07  | f       | <IDLE>
     855 | 2014-03-31 11:45:45.478935-07 | f       | <IDLE>
...

This agrees with the results of:

$ ps auxw | grep postgres
postgres   673  0.3  0.3 243028 55348 ?  Ss  Mar30  2:25 postgres: pguser databasename 127.0.0.1(53931) idle
postgres   855  0.3  0.3 243304 57584 ?  Ss  Mar30  2:49 postgres: pguser databasename 127.0.0.1(53981) idle

which shows that the processes are idle. On the other one, though:

$ psql
psql (9.2.6)
Type "help" for help.

postgres=# select pid, query_start, waiting, query from pg_stat_activity;
  pid  |          query_start          | waiting | query
-------+-------------------------------+---------+--------
 12333 | 2014-03-31 14:32:30.810934-04 | f       | SELECT...
 12376 | 2014-03-31 14:48:08.338419-04 | f       | COMMIT
 12405 | 2014-03-31 14:52:22.903848-04 | f       | COMMIT
 12406 | 2014-03-31 14:32:48.150378-04 | f       | SELECT...

which is strange, because the processes show they are idle:

postgres 12333  0.0  1.8 3437696  279736 ?  Ss  14:31  0:00 postgres: opentaps databasename 127.0.0.1(37969) idle
postgres 12376  5.0  9.7 3473184 1491196 ?  Ss  14:32  1:05 postgres: opentaps databasename 127.0.0.1(38025) idle
postgres 12405  1.5  6.5 3467624 1007160 ?  Ss  14:32  0:19 postgres: opentaps databasename 127.0.0.1(38085) idle
postgres 12406  0.0  0.0 3432512   13024 ?  Ss  14:32  0:00 postgres: opentaps databasename 127.0.0.1(38100) idle

There is also a difference between the pg_stat_activity views of the two versions. 9.0.13:

postgres=# \d pg_stat_activity
          View "pg_catalog.pg_stat_activity"
      Column      |           Type           | Modifiers
------------------+--------------------------+-----------
 datid            | oid                      |
 datname          | name                     |
 procpid          | integer                  |
 usesysid         | oid                      |
 usename          | name                     |
 application_name | text                     |
 client_addr      | inet                     |
 client_port      | integer                  |
 backend_start    | timestamp with time zone |
 xact_start       | timestamp with time zone |
 query_start      | timestamp with time zone |
 waiting          | boolean                  |
 current_query    | text                     |

vs 9.2.6:

          View "pg_catalog.pg_stat_activity"
      Column      |           Type           | Modifiers
------------------+--------------------------+-----------
 datid            | oid                      |
 datname          | name                     |
 pid              | integer                  |
 usesysid         | oid                      |
 usename          | name                     |
 application_name | text                     |
 client_addr      | inet                     |
 client_hostname  | text                     |
 client_port      | integer                  |
 backend_start    | timestamp with time zone |
 xact_start       | timestamp with time zone |
 query_start      | timestamp with time zone |
 state_change     | timestamp with time zone |
 waiting          | boolean                  |
 state            | text                     |
 query            | text                     |

So which one is correct? Why does 9.0.13 show the processes as idle, while 9.2.6 shows a query even though the process is idle?

--
Si Chen
Open Source Strategies, Inc.
sic...@opensourcestrategies.com
http://www.OpenSourceStrategies.com
LinkedIn: http://www.linkedin.com/in/opentaps
Twitter: http://twitter.com/opentaps
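Both are behaving as documented for their versions: in 9.2 pg_stat_activity was reworked so that `query` reports the *last* statement a backend ran, even after it has gone idle, and the new `state` column tells you whether it is currently executing. So on 9.2 the old "is it busy?" check becomes a filter on `state` (a sketch):

```sql
-- 9.2+: show only backends actually executing a statement right now
SELECT pid, state, query_start, waiting, query
FROM pg_stat_activity
WHERE state = 'active';
```

Backends with state = 'idle' correspond to what 9.0 displayed as <IDLE>; their `query` column simply remembers the previous statement.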
Re: [GENERAL] Complex query
-----Original Message-----
From: Leonardo M. Ramé [mailto:l.r...@griensu.com]
Sent: Monday, March 31, 2014 2:56 PM
To: Igor Neyman
Cc: PostgreSql-general
Subject: Re: [GENERAL] Complex query

On 2014-03-31 18:48:58 +0000, Igor Neyman wrote:
> Leonardo,
> Unless you add one more column to your Tasks table, specifically StatusTimestamp, as in:
>
> IdTask StatusCode StatusName StatusTimestamp
>
> you cannot find which record in the table follows which, because the order in which records are returned from the database is not guaranteed until you add an ORDER BY clause to your SELECT statement.
> Regards, Igor Neyman

You are right; let's add the Id column. This is just an example: the real table (a view) contains both the Id and a timestamp:

Id  IdTask  StatusCode  StatusName
----------------------------------
1   1       R           Registered
2   1       S           Started
3   1       D           Dictated
4   1       F           Finished
5   1       T           Transcribed
----------------------------------
6   2       R           Registered
7   2       S           Started
8   2       T           Transcribed
9   2       F           Finished

After adding the Id column, can I use a window function to get what I need?

Regards,
-- Leonardo M. Ramé
Medical IT - Griensu S.A.
Av. Colón 636 - Piso 8 Of. A
X5000EPT -- Córdoba
Tel.: +54(351)4246924 +54(351)4247788 +54(351)4247979 int. 19
Cel.: +54 9 (011) 40871877

D. Johnston showed how to use window functions in this case.

Regards, Igor Neyman
Re: [GENERAL] char array overhead
Rob Sargent wrote:
> Jsyk, I'm toying with a rather large number of small valued datapoints per sample. I'm trying text and smallint as well.

You could try char[] ...

-- Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [GENERAL] getting the current query from pg_stat_activity
From: pgsql-general-ow...@postgresql.org [mailto:pgsql-general-ow...@postgresql.org] On Behalf Of Si Chen
Sent: Monday, March 31, 2014 2:57 PM
To: pgsql-general@postgresql.org
Subject: [GENERAL] getting the current query from pg_stat_activity

> Hello, I have two different postgresql servers running slightly different versions. On one of them, if I try to use pg_stat_activity to get the current queries, I get [...]
> So which one is correct? Why does 9.0.13 show the processes as idle, and 9.2.6 show a query, even though the process shows them as idle?

First, 9.0 and 9.2 are not slightly different; these are two different major releases. Second, both are right. It's just that for idle processes (see the state column) 9.2 shows the last query executed before the process became idle. 9.0
[GENERAL] Complex query
Hi, I'm looking for help with this query.

Table Tasks:

IdTask  StatusCode  StatusName
------------------------------
1       R           Registered
1       S           Started
1       D           Dictated
1       F           Finished
1       T           Transcribed
------------------------------
2       R           Registered
2       S           Started
2       T           Transcribed
2       F           Finished

As you can see, I have a table containing tasks and statuses. What I would like to get is the list of tasks, including all of their steps, but only for those tasks where the StatusCode sequence was S followed by T. In this example, the query should only return task Nº 2:

2       R           Registered
2       S           Started
2       T           Transcribed
2       F           Finished

Can anybody help me with this?

Regards,
-- Leonardo M. Ramé
Medical IT - Griensu S.A.
Av. Colón 636 - Piso 8 Of. A
X5000EPT -- Córdoba
Tel.: +54(351)4246924 +54(351)4247788 +54(351)4247979 int. 19
Cel.: +54 9 (011) 40871877
Re: [GENERAL] Wanted: ALTER TRIGGER ... OWNED BY EXTENSION
Moshe Jacobson wrote:
> Even though I mark the trigger *functions* as owned by my extension, the trigger definitions themselves are still dumped by pg_dump and restored by pg_restore. This is a problem when pg_restore is using parallelism (-j), and one thread starts issuing CREATE TRIGGER commands before the other thread has finished populating the config table.

ISTM that the usual locution for this is ALTER EXTENSION .. ADD. You could test whether this is going to work by manually inserting rows in pg_depend. It seems strange to me that the created trigger is part of the extension, however. Maybe it's the right fix, but hmm.

-- Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [GENERAL] Complex query
On 2014-03-31 11:46:28 -0700, David Johnston wrote:
> Leonardo M. Ramé-2 wrote
> > Hi, I'm looking for help with this query. [...] What I would like to get is the list of tasks, including all of their steps, but only for those tasks where the StatusCode sequence was S followed by T. [...]
>
> First you need to decide how to tell the database that R-S-T-F is ordered, and then maybe you can use window functions, specifically lag(col, -1) over (...), to determine what the prior row's code is and act accordingly. Put that into a sub-query and return the IdTask to the outer query's where clause.
> David J.

Thanks David. I hope I understood what you meant. After adding the Id column, I came up with this query:

ris=# select lag.id, lag.idtask, lag.code, lag.lg
      from (select idtask, code, id, lag(code, -1) over () as lg from tasks_test) as lag;

 id | idtask | code | lg
----+--------+------+----
  1 |      1 | R    | S
  2 |      1 | S    | D
  3 |      1 | D    | F
  4 |      1 | F    | T
  5 |      1 | T    | R
  6 |      2 | R    | S
  7 |      2 | S    | T
  8 |      2 | T    | F
  9 |      2 | F    |
(9 rows)

Row nº 7 meets the condition, but I don't want to show only that row; I would like to show this:

  6 |      2 | R    | S
  7 |      2 | S    | T
  8 |      2 | T    | F
  9 |      2 | F    |

Any hint?

-- Leonardo M. Ramé
Medical IT - Griensu S.A.
Av. Colón 636 - Piso 8 Of. A
X5000EPT -- Córdoba
Tel.: +54(351)4246924 +54(351)4247788 +54(351)4247979 int. 19
Cel.: +54 9 (011) 40871877
Re: [GENERAL] Complex query
Leonardo M. Ramé-2 wrote select lag.id, lag.idtask, lag.code, lag.lg from (select idtask, code, id, lag(code, -1) over () as lg from tasks_test) as lag First you want to include an ORDER BY in the OVER(...) clause, and probably a PARTITION BY as well. Then you move that to a sub-query (for example): SELECT * FROM tbl WHERE tbl.idtask IN ( SELECT lag.idtask FROM ( lag_query_here ) lag WHERE lag.code = 'T' and lag.lg = 'S' ); David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/Complex-query-tp5798061p5798087.html Sent from the PostgreSQL - general mailing list archive at Nabble.com. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
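David's lag-plus-subquery approach can be sanity-checked outside the database. Below is a minimal Python simulation of the same logic, using the example data from this thread (the table is hypothetical): sort by id, partition by idtask, pair each status code with its successor, and keep every row of any task where 'S' is immediately followed by 'T'.

```python
# Example rows as (id, idtask, code), matching the data in this thread.
rows = [
    (1, 1, 'R'), (2, 1, 'S'), (3, 1, 'D'), (4, 1, 'F'), (5, 1, 'T'),
    (6, 2, 'R'), (7, 2, 'S'), (8, 2, 'T'), (9, 2, 'F'),
]

def tasks_with_s_then_t(rows):
    # PARTITION BY idtask, ORDER BY id
    by_task = {}
    for _id, idtask, code in sorted(rows):
        by_task.setdefault(idtask, []).append(code)
    # lag(code, -1): compare each code with the next one in the task
    matching = set()
    for idtask, codes in by_task.items():
        for prev, nxt in zip(codes, codes[1:]):
            if prev == 'S' and nxt == 'T':
                matching.add(idtask)
    # outer query: return all rows of the matching tasks
    return [r for r in rows if r[1] in matching]

print(tasks_with_s_then_t(rows))
# -> [(6, 2, 'R'), (7, 2, 'S'), (8, 2, 'T'), (9, 2, 'F')]
```

Only task 2 survives, exactly as the SQL version should behave once the ORDER BY and PARTITION BY are in place.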
Re: [ADMIN][GENERAL] openvz and shared memory trouble
On 03/31/2014 08:28 AM, Tom Lane wrote:
> Adrian Klaver adrian.kla...@aklaver.com writes:
> > At this point the memory allocation as a problem is as much conjecture as anything else, at least to me. So what is causing SIGBUS is an open question in my mind.
>
> Agreed, it's unproven what's causing the SIGBUS in the case at hand. However, in the case I investigated for Josh Kupershmidt, it was provably the fault of the kernel for failing to supply memory to back the shared memory segment that it had previously agreed to allocate. The fact that both users are trying to run their standby server under OpenVZ is, shall we say, merely suggestive.

Yes, the fact that the problem seems to go away with a change in shared_buffers seems to point in that direction also. To get through the smoke to the fire, there are more questions to be answered:

1) The OP says the shared_buffers setting that failed was 4GB and that 'tuning' down the setting solved the problem. So it would be nice to know what setting worked.
2) In hand with 1), what are the memory settings for the virtualization host and the container(s) involved?
3) What is the host OS?
4) What version of OpenVZ?
5) What is the layout? Are the primary and standby on the same container, different containers, or some other arrangement? In other words, could there be some other resource conflict going on, say network?
6) Are the crashes random or do they follow some pattern?

> regards, tom lane

-- Adrian Klaver adrian.kla...@aklaver.com
Re: [GENERAL] Complex query
On 2014-03-31 12:16:53 -0700, David Johnston wrote: Leonardo M. Ramé-2 wrote select lag.id, lag.idtask, lag.code, lag.lg from (select idtask, code, id, lag(code, -1) over () as lg from tasks_test) as lag First you want to include an ORDER BY in the OVER(...) clause, and probably a PARTITION BY as well. Then you move that to a sub-query (for example): SELECT * FROM tbl WHERE tbl.idtask IN ( SELECT lag.idtask FROM ( lag_query_here ) lag WHERE lag.code = 'T' and lag.lg = 'S' ); David J. Great!, that's what I needed, thank you. -- Leonardo M. Ramé Medical IT - Griensu S.A. Av. Colón 636 - Piso 8 Of. A X5000EPT -- Córdoba Tel.: +54(351)4246924 +54(351)4247788 +54(351)4247979 int. 19 Cel.: +54 9 (011) 40871877 -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
[GENERAL] Getting sequence-generated IDs from multiple row insert
Hi folks, I've just run into a subtle but fairly serious race condition while using web.py's SQL library to insert multiple rows into the database and return their IDs (a serial primary key column). Specifically I'm using the multiple_insert() function that web.py defines here: https://github.com/webpy/webpy/blob/master/web/db.py#L793

This function runs a query like this:

INSERT INTO table (col1, col2) VALUES (col1_1, col2_1), (col1_2, col2_2), ...;
SELECT currval('table_id_seq');

Using the output of currval(), web.py tries to build a list of the most recent IDs by creating a range from currval - num_rows_inserted + 1 through currval. In Python:

out = range(out - len(values) + 1, out + 1)

This *looks* nice, and must have seemed fine to the developers who implemented it, but I've just hit a case where two sessions each doing a multiple insert don't use sequential IDs. For example, the range code above gave 2117552...2117829 for the first insert and 2117625...2117818 for the second, which obviously overlap and are a nasty bug waiting to happen. Thankfully it caused an IntegrityError further down in my code, so I didn't screw things up.

First of all, I presume this is expected, and is how a sequence with a multi-row insert is supposed to work? In other words, the sequence guarantees the IDs will be unique, but with a multi-row insert they won't necessarily be consecutive? If so, it's a fairly serious bug in web.py's multiple_insert(), which probably shouldn't return anything due to this issue.

Second, what's the right thing to do here? The first thing I found was PostgreSQL's RETURNING clause, but somewhat frustratingly for this use case, even that's not guaranteed to return the results in the order you specified. I need the IDs in insertion order so I can do further processing.
Tom Lane and others in this thread indicate that this is not a guarantee of the RETURNING clause, for future optimization reasons and due to how SQL handles sets: http://postgresql.1045698.n5.nabble.com/PATCH-Enforce-that-INSERT-RETURNING-preserves-the-order-of-multi-rows-td5728579.html So currently I've changed my code to use RETURNING and then I'm ordering the results based on a secondary column that I know the order of. This works, but seems clunky, so I'm wondering if there's a nicer way. Thanks, Ben
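The overlap Ben describes can be reproduced with a toy model; this is a simulation, not web.py's actual code. A shared counter stands in for the sequence, and a hook lets a second "session" grab IDs between our inserts, so the currval-based range reconstruction no longer matches the IDs actually assigned:

```python
import itertools

# A shared counter playing the role of the serial column's sequence.
seq = itertools.count(1)

def multiple_insert(n, schedule):
    """Insert n rows; schedule() simulates the other session running in between."""
    ids = []
    for _ in range(n):
        schedule()            # another session may consume sequence values here
        ids.append(next(seq))
    currval = ids[-1]
    # web.py-style reconstruction: assume the n ids ending at currval are ours
    guessed = list(range(currval - n + 1, currval + 1))
    return guessed, ids

# Interleave: a concurrent session grabs one id between each of our inserts.
guessed, actual = multiple_insert(3, schedule=lambda: next(seq))
print(guessed)  # what the currval arithmetic reports: [4, 5, 6]
print(actual)   # the ids this session actually received: [2, 4, 6]
```

The guessed range includes IDs that belong to the other session and misses one of our own, which is exactly the overlapping-range symptom reported above.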
Re: [GENERAL] Why does checkpointer is consumig ~1.2Gb of RAM?
Em 31/03/2014 00:38, Scott Marlowe escreveu: On Sun, Mar 30, 2014 at 8:43 PM, Edson Richter edsonrich...@hotmail.com wrote: I'm curious about the "checkpointer" process and its configuration. What are the configuration options that affects the checkpointer process? Currently, under load, this process goes up to about 1.2GB of RAM: -- No it really doesn't. VIRT is every thing it touches whether it uses it individually or with other processes. RES is what THIS process is using all by itself. SHR is what it's accessing of shared memory. Here's a short explanation of what those three values mean: http://linuxpoison.blogspot.com/2009/10/what-is-difference-among-virt-res-and.html Also for a more technical one run "man top" and search for RES, SHR, and VIRT Yes, I understand that. That's why I've asked: why is checkpointer process consuming 1.2Gb of RAM (1215M RES more precisely), and which parameter affects its memory consumption? Thanks, Edson -- Edson Carlos Ericksson Richter Sistemas para Bibliotecas, Escolinhas Infantis Projetos sob medida para sua empresa Celular: (51) 9318-9766 (51) 8585-0796 "A mente que se abre a uma nova ideia jamais voltar ao seu tamanho original" - Albert Einstein
Re: [GENERAL] Getting sequence-generated IDs from multiple row insert
On Mon, Mar 31, 2014 at 03:28:14PM -0400, Ben Hoyt wrote: , but I've just hit a case where two sessions each doing a multiple insert don't use sequential IDs. For example, the range code above for the first insert gave 2117552...2117829. And the second insert gave 2117625...2117818. Which are obviously overlapping and is a nasty bug waiting to happen. Thankfully it caused an IntegrityError further down in my code so I didn't screw things up. Good thing you caught it. But yes, just from the description it looked like an obvious race to me. Concurrency is hard. Second, what's the right thing to do here? The first thing I found was PostgreSQL's RETURNING clause, but somewhat frustratingly for this use case, even that's not guaranteed to return the results in the order you specified. In SQL, _nothing_ is guaranteed to return in the order you specified. This isn't really a Postgres thing; unless you use ORDER BY, SQL's sets are not ordered. I need the IDs in insertion order so I can do further processing. This sets off alarm bells for me. What further processing are you doing? Is it possible that you could move that into a single step in the database (maybe with a function or even a trigger) so that the result of your RETURNING really would provide you with what you need? So currently I've changed my code to use RETURNING and then I'm ordering the results based on a secondary column that I know the order of. This works, but seems clunky, so I'm wondering if there's a nicer way. This is probably what I'd do, assuming that further processing isn't more data transformation. If it _is_, then I'd do the whole thing in a single step (in the database, once I inserted). A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] getting the current query from pg_stat_activity
Si Chen-2 wrote
> I have two different postgresql servers running slightly [different] versions.

Versions 9.0 and 9.2 are NOT slightly different. These are two MAJOR RELEASES (which allow for API changes) apart (i.e., one major release, 9.1, in between).

The release notes for 9.2 note this particular change explicitly: http://www.postgresql.org/docs/9.2/interactive/release-9-2.html Section E.9.2.6

Note the presence of the state column in the 9.2 schema: you use this to determine whether a connection is idle, instead of looking for IDLE in the query column, which in turn allows the query column to report the last known query at all times.

David J.
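For client code that must work across this rename, one option is to branch on the server version number and emit the matching query. A sketch (the pre-9.2 idle marker `<IDLE>` in current_query is the old convention as this author recalls it; verify against your server):

```python
def activity_query(server_version_num):
    """Build a pg_stat_activity query for non-idle backends.

    server_version_num is PostgreSQL's integer version, e.g. 90013 for
    9.0.13 or 90206 for 9.2.6 (as returned by SHOW server_version_num).
    """
    if server_version_num >= 90200:
        # 9.2+: procpid -> pid, current_query -> query, new state column
        return ("SELECT pid, state, query_start, waiting, query "
                "FROM pg_stat_activity WHERE state <> 'idle'")
    # pre-9.2: idle backends show the marker text in current_query
    return ("SELECT procpid, query_start, waiting, current_query "
            "FROM pg_stat_activity WHERE current_query <> '<IDLE>'")

print(activity_query(90206))
print(activity_query(90013))
```

The same branching idea applies to monitoring scripts that parse these columns.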
Re: [GENERAL] Why does checkpointer is consumig ~1.2Gb of RAM?
It's not. If the RES shows 1215M and the SHR is 1214M, then checkpointer is only using 1M. The difference between the two is the shared memory. RES includes SHR.

On Mon, Mar 31, 2014 at 1:44 PM, Edson Richter edsonrich...@hotmail.com wrote:
> Yes, I understand that. That's why I've asked: why is checkpointer process consuming 1.2Gb of RAM (1215M RES more precisely), and which parameter affects its memory consumption? Thanks, Edson [...]

-- To understand recursion, one must first understand recursion.
Re: [GENERAL] Wanted: ALTER TRIGGER ... OWNED BY EXTENSION
Alvaro Herrera alvhe...@2ndquadrant.com writes: Moshe Jacobson wrote: Even though I mark the trigger *functions *as owned by my extension, the trigger definitions themselves are still dumped by pg_dump and restored by pg_restore. This is a problem when pg_restore is using parallelism (-j), and one thread starts issuing CREATE TRIGGER commands before the other thread has finished populating the config table. ISTM that the usual locution for this is ALTER EXTENSION .. ADD. You could test whether this is going to work by manually inserting rows in pg_depend. It seems strange to me that the created trigger is part of the extension, however. Maybe it's the right fix, but hmm. I'm suspicious that the problem is exactly lack of pg_depend records --- pg_dump/pg_restore relies on those for correct ordering of parallel operations. What method are you using to create these triggers (not the functions, the per-table pg_trigger records)? regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] getting the current query from pg_stat_activity
Thanks! That's very helpful and answers my question. On Mon, Mar 31, 2014 at 12:52 PM, David Johnston pol...@yahoo.com wrote: Si Chen-2 wrote I have two different postgresql servers running slightly [different] versions. Versions 9.0 and 9.2 are NOT slightly different. These are two MAJOR RELEASES (which allow for API changes) apart (i.e., one major release in between - 9.1) The release notes for 9.2 note this particular change explicitly: http://www.postgresql.org/docs/9.2/interactive/release-9-2.html Section E.9.2.6 Note the presence of the state column in the 9.2 schema - you use this to determine if a connection is idle instead of looking for IDLE in a query column which then allows the query column to be report the last known query at all times. David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/getting-the-current-query-from-pg-stat-activity-tp5798076p5798098.html Sent from the PostgreSQL - general mailing list archive at Nabble.com. -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general -- Si Chen Open Source Strategies, Inc. sic...@opensourcestrategies.com http://www.OpenSourceStrategies.com LinkedIn: http://www.linkedin.com/in/opentaps Twitter: http://twitter.com/opentaps
Re: [GENERAL] Wanted: ALTER TRIGGER ... OWNED BY EXTENSION
On Mon, Mar 31, 2014 at 4:20 PM, Tom Lane t...@sss.pgh.pa.us wrote: I'm suspicious that the problem is exactly lack of pg_depend records --- pg_dump/pg_restore relies on those for correct ordering of parallel operations. What method are you using to create these triggers (not the functions, the per-table pg_trigger records)? There is a trigger function on the extension's config table that creates the trigger functions and installs them (both from the same function). I am about to try playing with pg_depend to see if it has the desired effect. Moshe Jacobson Manager of Systems Engineering, Nead Werx Inc. http://www.neadwerx.com 2323 Cumberland Parkway · Suite 201 · Atlanta, GA 30339 Quality is not an act, it is a habit. -- Aristotle
Re: [GENERAL] Getting sequence-generated IDs from multiple row insert
Andrew Sullivan-8 wrote
> > So currently I've changed my code to use RETURNING and then I'm ordering the results based on a secondary column that I know the order of. This works, but seems clunky, so I'm wondering if there's a nicer way.
>
> This is probably what I'd do, assuming that further processing isn't more data transformation. If it _is_, then I'd do the whole thing in a single step (in the database, once I inserted).

If order is an implicit property of the source data then you need to explicitly encode that order during (or before) import. There are numerous ways to implement such, but except for extremely simple cases PostgreSQL will not do the appropriate thing automatically in the face of concurrency.

Also, do you need sequential IDs or just IDs that are ever increasing? If the latter, then tagging the input source will let you distinguish between two different datasets even if their sequences are overlapping.

David J.
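Explicitly encoding the order before import, as suggested here, can be as simple as numbering the rows client-side. A hypothetical sketch (table and column names invented; the inline string formatting is for illustration only, real code should use parameterized queries):

```python
# Rows to insert, in the order the client cares about.
rows = [{"name": "a"}, {"name": "b"}, {"name": "c"}]

# Tag each row with its position; insert_seq is an assumed extra column.
for seq, row in enumerate(rows):
    row["insert_seq"] = seq

# Build a multi-row INSERT whose RETURNING output can be re-sorted by
# insert_seq, regardless of the order the database returns the rows in.
values = ", ".join(
    "(%r, %d)" % (r["name"], r["insert_seq"]) for r in rows
)
sql = ("INSERT INTO items (name, insert_seq) VALUES " + values +
       " RETURNING id, insert_seq")
print(sql)
```

After executing, sorting the RETURNING rows by insert_seq recovers insertion order without relying on id being consecutive.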
Re: [GENERAL] Wanted: ALTER TRIGGER ... OWNED BY EXTENSION
On Mon, Mar 31, 2014 at 4:32 PM, Moshe Jacobson mo...@neadwerx.com wrote: There is a trigger function on the extension's config table that creates the trigger functions and installs them (both from the same function). I am about to try playing with pg_depend to see if it has the desired effect. I've added the requisite rows to pg_depend, and I know it was correct because \dx+ cyanaudit (my extension) now shows all of the triggers. However, pg_dump still dumps them, and pg_restore still restores them, causing the same errors as I had before. Moshe Jacobson Manager of Systems Engineering, Nead Werx Inc. http://www.neadwerx.com 2323 Cumberland Parkway · Suite 201 · Atlanta, GA 30339 Quality is not an act, it is a habit. -- Aristotle
Re: [GENERAL] Why does checkpointer is consumig ~1.2Gb of RAM?
Em 31/03/2014 17:06, Scott Marlowe escreveu: It's not. If the RES shows 1215M And the SHR is 1214M, then checkpointer is only using 1M. The difference between the two is the shared memory. RES includes SHR. Thanks, now I understand. Regards, Edson On Mon, Mar 31, 2014 at 1:44 PM, Edson Richter edsonrich...@hotmail.com wrote: Em 31/03/2014 00:38, Scott Marlowe escreveu: On Sun, Mar 30, 2014 at 8:43 PM, Edson Richter edsonrich...@hotmail.com wrote: I'm curious about the "checkpointer" process and its configuration. What are the configuration options that affects the checkpointer process? Currently, under load, this process goes up to about 1.2GB of RAM: -- No it really doesn't. VIRT is every thing it touches whether it uses it individually or with other processes. RES is what THIS process is using all by itself. SHR is what it's accessing of shared memory. Here's a short explanation of what those three values mean: http://linuxpoison.blogspot.com/2009/10/what-is-difference-among-virt-res-and.html Also for a more technical one run "man top" and search for RES, SHR, and VIRT Yes, I understand that. That's why I've asked: why is checkpointer process consuming 1.2Gb of RAM (1215M RES more precisely), and which parameter affects its memory consumption? Thanks, Edson -- Edson Carlos Ericksson Richter Sistemas para Bibliotecas, Escolinhas Infantis Projetos sob medida para sua empresa Celular: (51) 9318-9766 (51) 8585-0796 "A mente que se abre a uma nova ideia jamais voltar ao seu tamanho original" - Albert Einstein -- To understand recursion, one must first understand recursion. -- Edson Carlos Ericksson Richter Sistemas para Bibliotecas, Escolinhas Infantis Projetos sob medida para sua empresa Celular: (51) 9318-9766 (51) 8585-0796 "A mente que se abre a uma nova ideia jamais voltar ao seu tamanho original" - Albert Einstein
Re: [GENERAL] Why does checkpointer is consumig ~1.2Gb of RAM?
And yet I still got it wrong. Bad day for me. Should be: The difference between the two is the memory it's using. Bad day apparently.

On Mon, Mar 31, 2014 at 2:46 PM, Edson Richter edsonrich...@hotmail.com wrote:
> Em 31/03/2014 17:06, Scott Marlowe escreveu:
> > It's not. If the RES shows 1215M and the SHR is 1214M, then checkpointer is only using 1M. RES includes SHR.
> Thanks, now I understand. Regards, Edson [...]

-- To understand recursion, one must first understand recursion.
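The corrected arithmetic is simple enough to state directly: since top's RES figure includes SHR, a process's own (non-shared) footprint is roughly RES minus SHR. With the checkpointer figures from this thread:

```python
def private_mb(res_mb, shr_mb):
    # top's RES includes SHR, so the process's own memory use is
    # approximately RES - SHR; the rest is shared with other backends.
    return res_mb - shr_mb

# RES 1215M and SHR 1214M, as reported for the checkpointer above
print(private_mb(1215, 1214))  # -> 1
```

So despite the alarming 1.2GB RES, the checkpointer itself accounts for only about 1MB; the remainder is PostgreSQL's shared buffers touched by the process.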
Re: [GENERAL] Getting sequence-generated IDs from multiple row insert
On Mon, Mar 31, 2014 at 01:34:04PM -0700, David Johnston wrote: If order is an implicit property of the source data then you need to explicitly encode that order during (or before) import. Sure, but the problem the OP had I thought was that the RETURNING clause doesn't guarantee that the rows coming back are in the order they were inserted. This is just a SQL thing. (I guess you could ORDER BY the RETURNING clause, right?) There are numerous ways to implement such but except for extremely simple cases PostgreSQL will not do the appropriate thing automatically in the face of concurrency. It _is_ doing the appropriate thing, though: this is SQL. The rows aren't ordered unless you tell them to be. A -- Andrew Sullivan a...@crankycanuck.ca -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Wanted: ALTER TRIGGER ... OWNED BY EXTENSION
Moshe Jacobson mo...@neadwerx.com writes: I've added the requisite rows to pg_depend, and I know it was correct because \dx+ cyanaudit (my extension) now shows all of the triggers. However, pg_dump still dumps them, and pg_restore still restores them, causing the same errors as I had before. I don't think pg_dump believes that pg_trigger rows can belong to an extension; and I'm dubious of the concept too. What I was wondering about was whether those rows had proper dependencies on (a) the functions and (b) their owning tables. Basically what you need to end up with is * trigger function has a membership dependency on the extension * pg_trigger row has a normal dependency on the trigger function it uses * pg_trigger row has an auto dependency on the table it's for If you're using SQL commands to create the trigger then I'd expect the latter two to be handled automatically; but it sorta sounds like you're doing something pretty low-level and perhaps omitting these steps. Note: depending on what it is you're trying to accomplish, it might be saner for the pg_trigger rows to have auto dependencies on their trigger functions. Depends whether you'd like DROP EXTENSION to complain or just shut up and drop the triggers. regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Re: [GENERAL] Wanted: ALTER TRIGGER ... OWNED BY EXTENSION
On Mon, Mar 31, 2014 at 5:19 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> Basically what you need to end up with is
> * trigger function has a membership dependency on the extension

Yes, the dependency is set up when the trigger function is dynamically created, using ALTER EXTENSION ... ADD FUNCTION.

> * pg_trigger row has a normal dependency on the trigger function it uses

Aha, this is not present. I think it is due to some migration magic I did a while ago. Is there a way to clean up the dependencies, or at least list out the dependencies that seem suspicious?

> * pg_trigger row has an auto dependency on the table it's for

This is present.

> Note: depending on what it is you're trying to accomplish, it might be
> saner for the pg_trigger rows to have auto dependencies on their trigger
> functions. Depends whether you'd like DROP EXTENSION to complain or just
> shut up and drop the triggers.

I would love for DROP EXTENSION to shut up and drop the triggers, but I'm not sure how to accomplish that without manually manipulating pg_depend (and even then I'm not sure that will do it). Suggestions welcome. Thanks.

Moshe Jacobson
Manager of Systems Engineering, Nead Werx Inc. http://www.neadwerx.com
2323 Cumberland Parkway · Suite 201 · Atlanta, GA 30339

"Quality is not an act, it is a habit." -- Aristotle
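[For reference: following Tom's note, recording an auto ('a') dependency from the pg_trigger row to its trigger function would, in principle, make DROP EXTENSION silently drop the triggers along with the function. Direct catalog manipulation like this is unsupported and risky; this is only a sketch of the idea, and the trigger name is hypothetical:]

```sql
-- CAUTION: direct pg_depend manipulation is unsupported; sketch only.
-- Record an AUTO dependency of a trigger on its trigger function, so
-- dropping the function (e.g. via DROP EXTENSION) drops the trigger too.
INSERT INTO pg_depend
    (classid, objid, objsubid, refclassid, refobjid, refobjsubid, deptype)
SELECT 'pg_trigger'::regclass, t.oid, 0,
       'pg_proc'::regclass,    t.tgfoid, 0,
       'a'                      -- auto dependency
FROM pg_trigger t
WHERE t.tgname = 'my_audit_trigger';  -- hypothetical trigger name
```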
Re: [GENERAL] Postgres as In-Memory Database?
On Sunday, March 30, 2014, Stefan Keller sfkel...@gmail.com wrote:
> Hi Jeff
>
> 2013/11/20 Jeff Janes jeff.ja...@gmail.com:
>> I don't know what you mean about enhancements in the buffer pool. For an
>> in-memory database, there shouldn't be a buffer pool in the first place,
>> as it is *all* in memory.
>
> You are right: In-memory DBs are making buffer-pooling obsolete - except
> for making data persistent (see below).

I would be very reluctant to use any database engine which considered disk access obsolete. There will always be a risk that data grows to exceed RAM, and that it would be inconvenient to expand RAM fast enough to accommodate it. I've played those games enough with Perl and C in-memory systems. You fight and squeeze to fit the data into RAM, then the data size grows 3% and all of your work is for naught. You can buy more RAM, if you have the budget, and if the RAM isn't back-ordered for 3 months because the factory that makes it had a fire, and if more RAM fits on your motherboard, and...

>> Do you know why it is slow? I'd give high odds that it would be a
>> specific implementation detail in the code that is suboptimal, or maybe a
>> design decision of PostGIS, rather than some high level architectural
>> decision of PostgreSQL.
>
> Referring to the application is something you can always say - but that
> shouldn't prevent us from enhancing Postgres.

Postgres has been enhanced. Now we need to change osm2pgsql to take advantage of that. It defines indexes on the tables that are going to be bulk loaded with COPY, which defeats some recent optimizations made to COPY. The creation of the indexes should be delayed until after the bulk load is done.

A further enhancement to Postgres would be to automatically defer creation of the indexes when a table is truncated or created within a transaction, so that users get the benefit of the improvement.

> These enhancements to osm2pgsql seem reasonable to me. I hope somebody
> has time to take care of them.
I have a fork of osm2pgsql on github which delays the index build until the COPY is done. I'm not really motivated to convince anyone to merge it (as my interest is PostgreSQL, not osm itself), but if someone wants to pick it up, that is fine with me. It helps somewhat, but it is not a game-changer, because there are other, bigger bottlenecks, at least for HDD-based systems.

One of the bigger bottlenecks is building the GIN indexes on the way table at the end. Setting maintenance_work_mem to huge values helps a lot, if you can find a safe setting for it considering the multiple index builds it might be doing (at that point in the load, osm2pgsql's node cache has been released, so there is substantial RAM to re-purpose). It would be better for this use if PostgreSQL built the index using an external sort, rather than iterating over the table building maintenance_work_mem-sized chunks of red-black trees. The problem there is that osm uses the GIN index in an odd way (the vast majority of nodes occur in exactly one way, with a minority occurring in more than one), and a disk sort might not be ideal for the more common use cases of GIN, where a given token usually occurs in far more than one document. So an improvement that only helps osm2pgsql and degrades other uses is unlikely to be adopted.

Another bottleneck is just the raw COPY into the node table. When that is running against an unindexed table which was created in the same transaction, I see that osm2pgsql takes about 50% of a CPU to print a copy-stream, and PostgreSQL uses about 50% of a CPU to parse that stream and insert into the table. So they add up to about 1 CPU, despite the fact that this is a multiple-CPU machine. They seem to be playing ping-pong with the pipe buffer when in theory they should each be able to run at almost full speed. I don't know how to get them to stop playing ping-pong, but I have other use cases where this shows up, so a trade-off-free solution would be nifty.
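[For reference: the load pattern described above — COPY into an unindexed table, then build the indexes with a generous maintenance_work_mem — looks roughly like this. Table, column, file names, and the memory setting are all hypothetical:]

```sql
-- Bulk-load first, index afterwards (all names hypothetical).
BEGIN;
CREATE TABLE nodes (id bigint, lat integer, lon integer);
COPY nodes FROM '/tmp/nodes.copy';     -- hypothetical dump file
COMMIT;

-- Give the index build plenty of sort memory, then build the index
-- on the already-populated table in one pass.
SET maintenance_work_mem = '1GB';      -- size to taste, and to available RAM
CREATE INDEX nodes_id_idx ON nodes (id);
```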
I suspect that that is more of a kernel issue than either a PostgreSQL or an osm2pgsql one. You could do the COPY in parallel in multiple threads, but the problem there is that you can't use the created-in-same-transaction optimization to avoid WAL overhead. There is no fix to this without changing PostgreSQL to accommodate it, but I have no clear idea how one would do that. Importing a snapshot doesn't seem like it would be enough, as you can only import snapshots for reading, not for writing.

Also, I'm not too sure how seriously to take the goal of optimizing osm2pgsql. Development on it seems to be less than vigorous. And its purpose is to create a database to be used, so wouldn't it make more sense to optimize the use, not the creation? And if you do want to optimize the creation, the obvious way to do it is to create the export in a way more closely aligned to that need, rather than a generic export. In the meantime I
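[For reference: the created-in-same-transaction optimization mentioned above applies when the table is created or truncated in the same transaction as the COPY; with wal_level = minimal, PostgreSQL can then skip WAL-logging the load and just sync the table at commit. A sketch, with hypothetical names:]

```sql
-- Requires wal_level = minimal (and no archiving/streaming replication).
-- Creating or truncating the table in the same transaction as the COPY
-- lets PostgreSQL skip per-row WAL for the bulk load.
BEGIN;
TRUNCATE nodes;                        -- or: CREATE TABLE nodes (...)
COPY nodes FROM '/tmp/nodes.copy';     -- hypothetical dump file
COMMIT;
```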
Re: [GENERAL] Doubts on startup costs
Start-up cost is the cost required to fetch the first tuple. So yes, it is possible for the start-up cost of the innermost node/leaf node to be zero, as shown in the example below:

postgres=# explain select * from tbl, tbl2 where tbl2.id = tbl.id order by tbl.id;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Merge Join  (cost=809.81..1636.50 rows=11571 width=8)
   Merge Cond: (tbl.id = tbl2.id)
   ->  Index Only Scan using idx on tbl  (cost=0.42..3369.01 rows=11 width=4)
   ->  Sort  (cost=809.39..834.39 rows=1 width=4)
         Sort Key: tbl2.id
         ->  Seq Scan on tbl2  (cost=0.00..145.00 rows=1 width=4)
 Planning time: 0.672 ms

postgres=# explain select * from tbl order by id;
                               QUERY PLAN
------------------------------------------------------------------------
 Index Only Scan using idx on tbl  (cost=0.42..3369.01 rows=11 width=4)
 Planning time: 0.305 ms

Also, the start-up cost of an outer node need not be more than the total cost of its inner nodes. If the outer node can emit one tuple without waiting for the inner nodes to complete their operation, then the outer node's start-up cost will be much less than the total cost of the inner nodes. But the start-up cost of an outer node cannot be less than the start-up cost of its inner nodes. For example, in the plan above, the merge join can emit one tuple as soon as it finds one matching row, so its start-up cost is lower.

Thanks and Regards,
Kumar Rajeev Rastogi
From: pgsql-general-ow...@postgresql.org [mailto:pgsql-general-ow...@postgresql.org] On Behalf Of Jayadevan M
Sent: 30 March 2014 20:45
To: pgsql-general@postgresql.org
Subject: [GENERAL] Doubts on startup costs

Hi,

A few academic questions related to the PostgreSQL query planner and its output:

- In the output of EXPLAIN for the SQL statements I have seen so far, the startup cost for the innermost node/leaf has been 0. Are there situations where it may be non-zero?
- Will the startup cost for outer nodes always be equal to or greater than the total cost of the inner nodes? (My guess is no; there may be cases where the outer node can start processing before the inner node is completely done.)

Regards,
Jayadevan