[GENERAL] INSERT/UPDATE statements sometimes choosing not to use PK index?
I have a relatively simple data load script, which upserts (UPDATE existing rows, INSERT new rows) and should be supported by the primary key index, the only index on this table: UPDATE destination SET ... FROM staging WHERE staging.pk = destination.pk; INSERT INTO destination SELECT ... FROM staging WHERE NOT EXISTS(SELECT * FROM destination WHERE pk = staging.pk); In most cases, EXPLAIN and runtime tell me the index is utilized. However, sometimes on back-to-back runs (5 minute intervals) the runtime explodes and EXPLAIN shows me that the PK index is not used, as both the staging table and destination table are sequentially scanned. The source data maintains the same pattern, and this load job is the only write activity in this particular database. A left anti-join in the above pseudo-query explains the same, and seems to make the problem occur more frequently. What could cause PostgreSQL (9.1) to stop using an index in a case like this, when it normally and expectedly uses it?
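Formatted for readability, the upsert pair described above might look like this (the column names `col1`/`col2` are placeholders, not from the original post):

```sql
-- Step 1: update rows that already exist in the destination.
UPDATE destination d
SET    col1 = s.col1,
       col2 = s.col2
FROM   staging s
WHERE  s.pk = d.pk;

-- Step 2: insert rows not yet present in the destination.
INSERT INTO destination (pk, col1, col2)
SELECT s.pk, s.col1, s.col2
FROM   staging s
WHERE  NOT EXISTS (SELECT 1 FROM destination d WHERE d.pk = s.pk);
```

Both statements want an index scan on destination's PK for each staging row, which is the plan that intermittently disappears.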
[GENERAL] copy 5.3000000000e+01 to bigint column error while Insert ok
When I use the INSERT command, I can insert 5.30e+01 into the table, but when I use the COPY command, I get this: ERROR: invalid input syntax for integer: '5.30e+01' CONTEXT: COPY tdsort_metrics, line 1, column monitor_value: '5.30e+01' Any idea? Jov blog: http://amutu.com/blog
Re: [GENERAL] copy 5.3000000000e+01 to bigint column error while Insert ok
On 10/28/2013 11:34 PM, Jov wrote: When I use the INSERT command, I can insert 5.30e+01 into the table, but when I use the COPY command, I get this: ERROR: invalid input syntax for integer: '5.30e+01' CONTEXT: COPY tdsort_metrics, line 1, column monitor_value: '5.30e+01' Any idea? That number is clearly a floating-point number. INSERT takes expressions, and will cast the float to bigint; COPY doesn't and won't. -- john r pierce 37N 122W somewhere on the middle of the left coast
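A minimal sketch of the distinction (reusing the table name from the thread; the single-column definition is an assumption):

```sql
CREATE TABLE tdsort_metrics (monitor_value bigint);

-- INSERT evaluates expressions: the numeric literal is
-- implicitly cast to bigint, so this succeeds and stores 53.
INSERT INTO tdsort_metrics (monitor_value) VALUES (5.30e+01);

-- COPY feeds raw text straight to the column type's input
-- function, which does not accept float syntax for an integer,
-- so the equivalent COPY fails:
--   COPY tdsort_metrics FROM STDIN;
--   5.30e+01
--   \.
--   ERROR:  invalid input syntax for integer: "5.30e+01"
```

If the column must stay bigint, the data could also be cleaned to plain integer text before the COPY, rather than changing the column type.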
Re: [GENERAL] copy 5.3000000000e+01 to bigint column error while Insert ok
Thanks, that makes sense. Changing the column type from bigint to double precision solves the problem. Jov blog: http://amutu.com/blog 2013/10/29 John R Pierce pie...@hogranch.com On 10/28/2013 11:34 PM, Jov wrote: When I use the INSERT command, I can insert 5.30e+01 into the table, but when I use the COPY command, I get this: ERROR: invalid input syntax for integer: '5.30e+01' CONTEXT: COPY tdsort_metrics, line 1, column monitor_value: '5.30e+01' Any idea? That number is clearly a floating-point number. INSERT takes expressions, and will cast the float to bigint; COPY doesn't and won't. -- john r pierce 37N 122W somewhere on the middle of the left coast
Re: [GENERAL] json datatype and table bloat?
On Mon, Oct 28, 2013 at 4:17 PM, Gregory Haase haa...@onefreevoice.com wrote: I have a table that is triggering my nagios database bloat alert regularly. Usually, I have to give it the vacuum full more than once to get it under the threshold. Today I tried repeatedly and cannot get the alert to resolve. I had a discussion with one of the primary developers about how the table is utilized, and it turns out they are basically only ever inserting into it. This sort of flies in the face of conventional wisdom about bloat being caused by frequent updates and deletes. As I understand it, vacuuming only removes the tuples removed by delete and update operations. The question is how this is being inserted and if there is anything that ever updates the rows in any way. Maybe this is a manual process? But for 5x bloat, you have to have it be a repeated process. Maybe there was bad data that had to be corrected? We were looking at it, and one of the things that struck me is that this table has a column with a json datatype. I looked through information_schema.columns and there is only one other table with a json datatype, and I recall having bloat issues with this table in the past as well. I'm wondering if the json datatype is just naturally more bloated than other types, or if the query in the check_postgres.pl nagios script is not accurate, or if maybe my thresholds are simply too low? Next time this happens it would be worth seeing what VACUUM FULL VERBOSE output is for that table. -- Best Wishes, Chris Travers Efficito: Hosted Accounting and ERP. Robust and Flexible. No vendor lock-in. http://www.efficito.com/learn_more.shtml
Re: [GENERAL] INSERT/UPDATE statements sometimes choosing not to use PK index?
Matt bsg...@gmail.com writes: In most cases, EXPLAIN and runtime tell me the index is utilized. However, sometimes on back-to-back runs (5 minute intervals) the runtime explodes and EXPLAIN shows me that the PK index is not used, as both the staging table and destination table are sequentially scanned. You haven't explained what you do with the staging table, but maybe you need to ANALYZE it after you've loaded fresh data into it. The stats for the main table are presumably somewhat stable, but the stats for the staging table maybe not. The apparent correlation to consecutive runs lends some support to this theory, as that would reduce the time window for auto-ANALYZE to fix it for you. regards, tom lane -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
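One way to apply this suggestion is to make the statistics refresh an explicit step in the load script, between the COPY and the upsert (the pipe delimiter and text format here are assumptions about the feed):

```sql
-- Each 5-minute cycle:
TRUNCATE staging;

-- Load the fresh batch.
COPY staging FROM STDIN WITH (FORMAT text, DELIMITER '|');

-- Refresh planner statistics for the staging table immediately,
-- instead of waiting for auto-ANALYZE to catch up, so the upsert
-- statements that follow are planned against current row counts.
ANALYZE staging;
```

An explicit ANALYZE on a freshly loaded staging table is cheap relative to a mis-planned sequential-scan join, so running it every cycle is usually a safe default.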
[GENERAL] Increasing CPU usage of PostgreSQL
Rahila, if you want to saturate the CPU and don't care about the particular benchmark, try to use read-only transactions. Either just add -S at the pgbench command line, or write something SELECT-only on your own. Anyway, use '-j' in such cases. Thanks a lot for your suggestion. Using select-only queries helped me lower the % CPU time spent on IO. Also, increasing the number of threads helped load the CPU to around 87 to 90%. -Rahila On Sat, Oct 26, 2013 at 7:53 PM, Tomas Vondra t...@fuzzy.cz wrote: On 25.10.2013 19:04, Scott Marlowe wrote: On Fri, Oct 25, 2013 at 8:29 AM, Rahila Syed rahilasye...@gmail.com wrote: Configuration of my machine: Processors: Xeon E5-2650 Processor Kit Intel® Xeon® Processor E5-2650 (2 GHz, 8C/16T, 20 MB) * 2 nos RAM: 32GB DDR3-1600 REG Memory Kit 8x 4GB Registered ECC DIMM, DDR3L-1600(PC3L-12800) HDD: 450GB 10K Hot Plug 2.5-inch SAS HDD * 8 nos 1 x 450 GB SAS HDD, 2.5-inch, 6Gb/s Disk Speed: 10,000 RPM RAID Controller (512MB, RAID 0/1) My guess is that you're maxing out your IO subsystem long before you're maxing out CPU. What does iostat -xd 10 have to say about it? Right, that's my guess too. The problem is most likely the sync at the end of the transaction. Rahila, if you want to saturate the CPU and don't care about the particular benchmark, try to use read-only transactions. Either just add -S at the pgbench command line, or write something SELECT-only on your own. Anyway, use '-j' in such cases. Tomas
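A concrete invocation of Tomas's suggestion (the scale factor, client count, thread count, duration, and database name below are illustrative, not from the thread):

```shell
# One-time initialization at scale factor 100.
pgbench -i -s 100 benchdb

# SELECT-only workload (-S) for 5 minutes: 32 client connections
# driven by 16 worker threads (-j) so the pgbench client itself
# does not become the bottleneck on a 16-core box.
pgbench -S -c 32 -j 16 -T 300 benchdb
```

Because -S skips all writes, there is no commit-time fsync, which is what lets the CPUs rather than the disks become the limiting factor.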
[GENERAL] Increasing CPU usage of PostgreSQL
iostat -xd 10 have to say about it? Yes, I was indeed maxing out my IO subsystem; % CPU utilization for IO is around 90 percent. I could not notice this earlier because I was firing CPU load queries from a remote machine, so network overhead was coming into the picture. Thanks a lot for your suggestion. -Rahila On Fri, Oct 25, 2013 at 10:34 PM, Scott Marlowe scott.marl...@gmail.com wrote: On Fri, Oct 25, 2013 at 8:29 AM, Rahila Syed rahilasye...@gmail.com wrote: Hello All, I am using jdbcrunner-1.2 to run PostgreSQL performance tests. For certain tests, I need to increase the CPU usage of the servers, especially at user level. I tried using both tpcc and tpcb load with a scale factor of 100. Even after setting the number of client connections as high as 420, I am unable to achieve high CPU usage. It is hardly 3 percent. I think this is because most of the client connections are idle. The tpcc and tpcb transactions seem to consume very little CPU. In the postgresql.conf file, I have lowered checkpoint_segments to 1 in order to overwhelm the server with checkpoints, but there is no significant increase in iowait of the CPU. Can somebody suggest a better idea to load PostgreSQL servers to increase CPU usage? Configuration of my machine: Processors: Xeon E5-2650 Processor Kit Intel® Xeon® Processor E5-2650 (2 GHz, 8C/16T, 20 MB) * 2 nos RAM: 32GB DDR3-1600 REG Memory Kit 8x 4GB Registered ECC DIMM, DDR3L-1600(PC3L-12800) HDD: 450GB 10K Hot Plug 2.5-inch SAS HDD * 8 nos 1 x 450 GB SAS HDD, 2.5-inch, 6Gb/s Disk Speed: 10,000 RPM RAID Controller (512MB, RAID 0/1) My guess is that you're maxing out your IO subsystem long before you're maxing out CPU. What does iostat -xd 10 have to say about it? To understand recursion, one must first understand recursion.
Re: [GENERAL] json datatype and table bloat?
On Tue, Oct 29, 2013 at 5:38 AM, Chris Travers chris.trav...@gmail.com wrote: On Mon, Oct 28, 2013 at 4:17 PM, Gregory Haase haa...@onefreevoice.com wrote: I have a table that is triggering my nagios database bloat alert regularly. Usually, I have to give it the vacuum full more than once to get it under the threshold. Today I tried repeatedly and cannot get the alert to resolve. I had a discussion with one of the primary developers about how the table is utilized, and it turns out they are basically only ever inserting into it. This sort of flies in the face of conventional wisdom about bloat being caused by frequent updates and deletes. As I understand it, vacuuming only removes the tuples removed by delete and update operations. Well, or by rolled-back transactions. We have to wonder if the OP has a lot of queries trying to insert and failing. Maybe check the log? merlin
Re: [GENERAL] INSERT/UPDATE statements sometimes choosing not to use PK index?
Since this is an upsert, the staging table simply accepts a copy of pipe-delimited data via a COPY FROM STDIN, and a subsequent SQL script performs the UPDATE/INSERT. The staging table is then truncated (or deleted without a WHERE) for the next run: truncate staging, COPY into staging, update then insert destination from staging, repeat on the next 5 minute clock cycle. Since the staging table is fully loaded into the destination table, a full scan of staging was not a concern; the lack of index usage on the destination was. For example, this is normally supported by the PK index, and performs well: INSERT INTO destination (…) SELECT (…) FROM staging WHERE NOT EXISTS (SELECT * FROM destination WHERE destination.id = staging.id) The above runs in expected time when the explain plan shows an index scan on the destination primary key index, and a seq scan on the staging table. This will continue for many runs, until something causes the engine to stop using the destination PK index, and begin scanning both the destination and staging. What triggers this is the interesting question. The only system event I can (loosely) correlate with the problem is the start of a nightly pg_dump, but as a read-only process, this would not be changing any index stats, correct? An ANALYZE on staging is possible after each load cycle, but time does not permit one on the destination table. I have been focusing on the destination because it has the index that is not being used. Will stats on the staging table affect index selection on the destination in a case like this? In the process of attempting to diagnose this, both tables involved have been vacuumed (full) and analyzed. I have also moved staging to an SSD volume, and created an equivalent index on staging - which is not used in any plan, nor do I expect it to be, as there is no filter criteria on staging, and the index maintenance on staging would seem to be unneeded overhead. But in this case, is there any advantage to an index on staging?
For reference, it is possible (not enough test cycles to verify) that the left anti-join makes this situation worse, even though the explain plans appear identical: INSERT INTO destination (…) SELECT (…) FROM staging LEFT JOIN destination ON destination.id = staging.id WHERE destination.id IS NULL On 29 Oct 2013, at 9:45, Tom Lane wrote: Matt bsg...@gmail.com writes: In most cases, EXPLAIN and runtime tell me the index is utilized. However, sometimes on back-to-back runs (5 minute intervals) the runtime explodes and EXPLAIN shows me that the PK index is not used, as both the staging table and destination table are sequentially scanned. You haven't explained what you do with the staging table, but maybe you need to ANALYZE it after you've loaded fresh data into it. The stats for the main table are presumably somewhat stable, but the stats for the staging table maybe not. The apparent correlation to consecutive runs lends some support to this theory, as that would reduce the time window for auto-ANALYZE to fix it for you. regards, tom lane
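Written out in full, the anti-join variant of the insert looks like this (the `id`, `col1`, `col2` names are placeholders standing in for the elided column lists):

```sql
-- Anti-join form: semantically equivalent to the NOT EXISTS version,
-- inserting only staging rows with no matching destination row.
INSERT INTO destination (id, col1, col2)
SELECT s.id, s.col1, s.col2
FROM   staging s
LEFT JOIN destination d ON d.id = s.id
WHERE  d.id IS NULL;
```

The planner costs the two forms similarly, which is consistent with the observation that the EXPLAIN plans appear identical; any difference in behavior would come from the row-count estimates feeding that costing, not from the join style itself.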
Re: [GENERAL] Replication and fsync
Tomas Vondra t...@fuzzy.cz wrote: I think it's safe as long as you don't try to reuse the cluster after a crash (be it due to OS error, power outage, ...). If the primary crashes for any reasons, you have to start from scratch, otherwise there might be silent corruption as you've described. I agree. It seems to me that as long as you never try to start the old master after a crash of the OS (directly or because of hardware or VM failure), the standby should be usable without risk of corruption. As soon as you start the old master after such a crash though, you could be replicating corrupted blocks; you would need to be very hard-line about never bringing the old master back up. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [GENERAL] Cursor Example Needed
On Oct 28, 2013, at 6:50 PM, Adrian Klaver adrian.kla...@gmail.com wrote: On 10/28/2013 04:36 PM, Perry Smith wrote: On Oct 28, 2013, at 6:13 PM, John R Pierce pie...@hogranch.com wrote: On 10/28/2013 3:58 PM, Adrian Klaver wrote: The docs do a good job of illustrating: http://www.postgresql.org/docs/9.3/interactive/plpgsql-cursors.html that's for cursors created within a plpgsql function. I think what the OP wants is a top level cursor, which is a different thing... see http://www.postgresql.org/docs/current/static/sql-declare.html http://www.postgresql.org/docs/current/static/sql-fetch.html http://www.postgresql.org/docs/current/static/sql-close.html the fetch page shows an example of the complete usage in the context of a database transaction. Thank you to Merlin. I now understand better where my confusion was. John: Those examples are great except there is no way that I know of to loop on the top level, as you call it. I'm trying to do something that I can give to psql which will loop through the entire set that is produced. The FETCH example shows you how. You do not have FOR, but you do have FORWARD and, if you DECLARE SCROLL, BACKWARD. If you need to perform actions on each row as it is fetched then you will probably need to do it in a function. Your original post though was concerned with dealing with an out of memory error caused by returning too large a result set at one time, and that can be handled in psql as illustrated. Yes. I finally understand your (or someone's) original reply. Thank you to all who helped me out. Perry
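Pulling the linked DECLARE/FETCH/CLOSE pages together, a top-level cursor session in psql can be sketched like this (the table and cursor names are placeholders):

```sql
-- Cursors only live inside a transaction (unless declared WITH HOLD).
BEGIN;

DECLARE big_cur CURSOR FOR
    SELECT id, payload FROM big_table ORDER BY id;

-- Pull the result set in batches instead of all at once,
-- avoiding the out-of-memory problem on the client side.
FETCH FORWARD 1000 FROM big_cur;
FETCH FORWARD 1000 FROM big_cur;
-- ...repeat until a FETCH returns zero rows...

CLOSE big_cur;
COMMIT;
```

The "loop" here is driven by the client issuing repeated FETCH commands; psql itself has no loop construct, so looping with per-row logic belongs in a plpgsql function, as Adrian notes.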
Re: [GENERAL] json datatype and table bloat?
Following up. I don't see any rolled-back transactions in the logs. The part that troubles me really is that VACUUM FULL doesn't actually fix the problem. If there were bad data that had been corrected via mass updates, I'd expect the bloat issue to be fixed by a VACUUM FULL. When I run the vacuum back to back, this is what I get:
db_name=# VACUUM FULL VERBOSE table_schema.table_name;
INFO: vacuuming "table_schema.table_name"
INFO: "table_name": found 2 removable, 29663 nonremovable row versions in 1754 pages
DETAIL: 0 dead row versions cannot be removed yet.
CPU 0.07s/0.10u sec elapsed 0.30 sec.
VACUUM
db_name=# VACUUM FULL VERBOSE table_schema.table_name;
INFO: vacuuming "table_schema.table_name"
INFO: "table_name": found 0 removable, 29663 nonremovable row versions in 1754 pages
DETAIL: 0 dead row versions cannot be removed yet.
CPU 0.09s/0.09u sec elapsed 0.32 sec.
VACUUM
I think the question to address may be: why does the check_postgres query think there should only be 334 pages instead of 1754? The tbloat and wastedbytes calculations provided in the query from this page: http://wiki.postgresql.org/wiki/Show_database_bloat seem to correlate with the wasted bytes reported by nagios, though. Greg Haase On Tue, Oct 29, 2013 at 7:06 AM, Merlin Moncure mmonc...@gmail.com wrote: On Tue, Oct 29, 2013 at 5:38 AM, Chris Travers chris.trav...@gmail.com wrote: On Mon, Oct 28, 2013 at 4:17 PM, Gregory Haase haa...@onefreevoice.com wrote: I have a table that is triggering my nagios database bloat alert regularly. Usually, I have to give it the vacuum full more than once to get it under the threshold. Today I tried repeatedly and cannot get the alert to resolve. I had a discussion with one of the primary developers about how the table is utilized, and it turns out they are basically only ever inserting into it. This sort of flies in the face of conventional wisdom about bloat being caused by frequent updates and deletes.
As I understand it, vacuuming only removes the tuples removed by delete and update operations. Well, or by rolled-back transactions. We have to wonder if the OP has a lot of queries trying to insert and failing. Maybe check the log? merlin
Re: [GENERAL] json datatype and table bloat?
On 10/29/2013 12:41 PM, Gregory Haase wrote: db_name=# VACUUM FULL VERBOSE table_schema.table_name; INFO: vacuuming table_schema.table_name INFO: table_name: found 2 removable, 29663 nonremovable row versions in 1754 pages DETAIL: 0 dead row versions cannot be removed yet. CPU 0.07s/0.10u sec elapsed 0.30 sec. is there an old transaction pending? that 'masks' vacuum from touching any tuples newer than the start of that transaction. -- john r pierce 37N 122W somewhere on the middle of the left coast
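One way to check for such a pending transaction (the column names below are for PostgreSQL 9.1 and earlier; 9.2+ renamed procpid to pid and current_query to query, and added a state column):

```sql
-- List the oldest open transactions, which can prevent VACUUM
-- from reclaiming any tuple versions newer than their snapshots.
SELECT procpid, xact_start, current_query
FROM   pg_stat_activity
WHERE  xact_start IS NOT NULL
ORDER  BY xact_start
LIMIT  5;
```

A transaction sitting "idle in transaction" for hours near the top of this list would be the usual culprit.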
Re: [GENERAL] json datatype and table bloat?
So, between yesterday and today we actually failed over to our hot-standby instance and the issue hasn't changed. I don't think you can have a pending transaction across streaming replication. On Tue, Oct 29, 2013 at 12:49 PM, John R Pierce pie...@hogranch.com wrote: On 10/29/2013 12:41 PM, Gregory Haase wrote: db_name=# VACUUM FULL VERBOSE table_schema.table_name; INFO: vacuuming table_schema.table_name INFO: table_name: found 2 removable, 29663 nonremovable row versions in 1754 pages DETAIL: 0 dead row versions cannot be removed yet. CPU 0.07s/0.10u sec elapsed 0.30 sec. is there an old transaction pending? that 'masks' vacuum from touching any tuples newer than the start of that transaction. -- john r pierce 37N 122W somewhere on the middle of the left coast
Re: [GENERAL] json datatype and table bloat?
One more thing I just tried: create table table_schema.table_name_new (like table_schema.table_name); insert into table_schema.table_name_new select * from table_schema.table_name; The new table shows the same amount of wasted bytes and pages as the old. So I think based on that I'm going to throw out any notion of updates or deletes as the cause for bloat on this particular table. -G On Tue, Oct 29, 2013 at 12:53 PM, Gregory Haase haa...@onefreevoice.com wrote: So, between yesterday and today we actually failed over to our hot-standby instance and the issue hasn't changed. I don't think you can have a pending transaction across streaming replication. On Tue, Oct 29, 2013 at 12:49 PM, John R Pierce pie...@hogranch.com wrote: On 10/29/2013 12:41 PM, Gregory Haase wrote: db_name=# VACUUM FULL VERBOSE table_schema.table_name; INFO: vacuuming table_schema.table_name INFO: table_name: found 2 removable, 29663 nonremovable row versions in 1754 pages DETAIL: 0 dead row versions cannot be removed yet. CPU 0.07s/0.10u sec elapsed 0.30 sec. is there an old transaction pending? that 'masks' vacuum from touching any tuples newer than the start of that transaction. -- john r pierce 37N 122W somewhere on the middle of the left coast
Re: [GENERAL] json datatype and table bloat?
John R Pierce pie...@hogranch.com writes: On 10/29/2013 12:41 PM, Gregory Haase wrote: db_name=# VACUUM FULL VERBOSE table_schema.table_name; INFO: vacuuming table_schema.table_name INFO: table_name: found 2 removable, 29663 nonremovable row versions in 1754 pages DETAIL: 0 dead row versions cannot be removed yet. CPU 0.07s/0.10u sec elapsed 0.30 sec. is there an old transaction pending? that 'masks' vacuum from touching any tuples newer than the start of that transaction. If old transactions were the problem, vacuum would be reporting that some-large-number of dead row versions couldn't be removed yet. There doesn't seem to be anything obviously wrong here. regards, tom lane
Re: [GENERAL] json datatype and table bloat?
I've isolated the problem to the json field not showing up in pg_stats, which affects the calculation of the avg row size in the bloat query. I'm not sure if this is a json issue or some other kind of issue.
db_name=# select c.column_name, c.data_type from information_schema.columns c where table_name = 'table_name' and not exists (select 1 from pg_stats s where c.table_name = s.tablename and c.column_name = s.attname);
 column_name | data_type
-------------+-----------
 criteria    | json
(1 row)
-G On Tue, Oct 29, 2013 at 1:51 PM, Tom Lane t...@sss.pgh.pa.us wrote: John R Pierce pie...@hogranch.com writes: On 10/29/2013 12:41 PM, Gregory Haase wrote: db_name=# VACUUM FULL VERBOSE table_schema.table_name; INFO: vacuuming table_schema.table_name INFO: table_name: found 2 removable, 29663 nonremovable row versions in 1754 pages DETAIL: 0 dead row versions cannot be removed yet. CPU 0.07s/0.10u sec elapsed 0.30 sec. is there an old transaction pending? that 'masks' vacuum from touching any tuples newer than the start of that transaction. If old transactions were the problem, vacuum would be reporting that some-large-number of dead row versions couldn't be removed yet. There doesn't seem to be anything obviously wrong here. regards, tom lane
Re: [GENERAL] json datatype and table bloat?
Gregory Haase haa...@onefreevoice.com writes: I've isolated the problem to the json field not showing up in pg_stats, which affects the calculation of the avg row size in the bloat query. I'm not sure if this is a json issue or some other kind of issue. Possibly your bloat query is failing to consider the toast table associated with this table? If the json values are large they'd mostly be in the toast table not the main table. (It's unfortunate that VACUUM FULL doesn't tell you about what's in the toast table. I'd try just VACUUM VERBOSE here, without the FULL, to get more info.) regards, tom lane
Re: [GENERAL] json datatype and table bloat?
Tom is correct: VACUUM VERBOSE shows that there is an associated toast table. Neither the check_postgres.pl script nor the query on http://wiki.postgresql.org/wiki/Show_database_bloat appears to take this into consideration. Both rely on null_frac and avg_width from pg_stats to estimate how big the table should be. I'm not sure how you would factor the toast table into that estimate. -G On Tue, Oct 29, 2013 at 2:05 PM, Tom Lane t...@sss.pgh.pa.us wrote: Gregory Haase haa...@onefreevoice.com writes: I've isolated the problem to the json field not showing up in pg_stats, which affects the calculation of the avg row size in the bloat query. I'm not sure if this is a json issue or some other kind of issue. Possibly your bloat query is failing to consider the toast table associated with this table? If the json values are large they'd mostly be in the toast table not the main table. (It's unfortunate that VACUUM FULL doesn't tell you about what's in the toast table. I'd try just VACUUM VERBOSE here, without the FULL, to get more info.) regards, tom lane
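To see how much of the table's footprint actually lives in TOAST (and is therefore invisible to a pg_stats-based size estimate), the built-in size functions can be compared directly (reusing the schema-qualified name from the thread):

```sql
-- Heap pages only, vs. everything attached to the relation.
SELECT pg_size_pretty(pg_relation_size('table_schema.table_name'))        AS heap_size,
       pg_size_pretty(pg_total_relation_size('table_schema.table_name')
                      - pg_relation_size('table_schema.table_name'))      AS toast_plus_indexes,
       pg_size_pretty(pg_total_relation_size('table_schema.table_name')) AS total_size;
```

If toast_plus_indexes dwarfs heap_size, the bloat check's "expected pages" figure is being computed against only a fraction of the real storage, which would explain the persistent false alarm.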