[GENERAL] INSERT/UPDATE statements sometimes choosing not to use PK index?

2013-10-29 Thread Matt
I have a relatively simple data load script that upserts (UPDATE existing
rows, INSERT new rows); the upsert should be supported by the primary key
index, the only index on this table:

UPDATE destination SET ... FROM staging WHERE staging.pk =
destination.pk;
INSERT INTO destination SELECT ... FROM staging WHERE NOT EXISTS (SELECT
* FROM destination WHERE pk = staging.pk);

In most cases, EXPLAIN and runtime tell me the index is utilized. However,
sometimes on back-to-back runs (5-minute intervals) the runtime explodes and
EXPLAIN shows me that the PK index is not used, as both the staging table
and destination table are sequentially scanned. The source data maintains the
same pattern, and this load job is the only write activity in this
particular database.

Rewriting the above pseudo-query as a left anti-join produces the same
EXPLAIN output, and seems to make the problem occur more frequently.

What could cause PostgreSQL (9.1) to stop using an index in a case like
this, when it normally and expectedly uses it?


[GENERAL] copy 5.3000000000e+01 to bigint column error while Insert ok

2013-10-29 Thread Jov
When using the INSERT command, we can insert 5.30e+01 into the table, but when
using the COPY command, we get this:
ERROR:  invalid input syntax for integer: '5.30e+01'
CONTEXT:  COPY tdsort_metrics, line 1, column monitor_value:
'5.30e+01'

any idea?

Jov
blog: http://amutu.com/blog


Re: [GENERAL] copy 5.3000000000e+01 to bigint column error while Insert ok

2013-10-29 Thread John R Pierce

On 10/28/2013 11:34 PM, Jov wrote:
When using the INSERT command, we can insert 5.30e+01 into the table, but
when using the COPY command, we get this:

ERROR:  invalid input syntax for integer: '5.30e+01'
CONTEXT:  COPY tdsort_metrics, line 1, column monitor_value: 
'5.30e+01'


any idea?


that number is clearly a floating point number.   INSERT takes
expressions, and will cast the float to bigint; COPY doesn't, and won't.
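
if you need COPY to accept the file as-is, one workaround is to copy into a
float column first and cast on the way in.  a minimal sketch, with a
hypothetical table standing in for tdsort_metrics:

-- hypothetical target with a bigint column
CREATE TABLE metrics (monitor_value bigint);

-- staging table whose column can accept the float literal
CREATE TEMP TABLE metrics_stage (monitor_value double precision);
COPY metrics_stage FROM STDIN;
5.3000000000e+01
\.

-- cast explicitly when moving rows into the real table, as INSERT would
INSERT INTO metrics SELECT monitor_value::bigint FROM metrics_stage;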



--
john r pierce  37N 122W
somewhere on the middle of the left coast



Re: [GENERAL] copy 5.3000000000e+01 to bigint column error while Insert ok

2013-10-29 Thread Jov
Thanks, that makes sense. Changing the column type from bigint to double
precision solves the problem.
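
For reference, the change amounts to a single statement of this form (assuming
the column named in the error message):

ALTER TABLE tdsort_metrics ALTER COLUMN monitor_value TYPE double precision;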

Jov
blog: http://amutu.com/blog


2013/10/29 John R Pierce pie...@hogranch.com

  On 10/28/2013 11:34 PM, Jov wrote:

 When using the INSERT command, we can insert 5.30e+01 into the table, but when
 using the COPY command, we get this:
  ERROR:  invalid input syntax for integer: '5.30e+01'
 CONTEXT:  COPY tdsort_metrics, line 1, column monitor_value:
 '5.30e+01'

  any idea?


 that number is clearly a floating point number.   INSERT takes
 expressions, and will cast the float to bigint; COPY doesn't, and won't.


 --
 john r pierce  37N 122W
 somewhere on the middle of the left coast




Re: [GENERAL] json datatype and table bloat?

2013-10-29 Thread Chris Travers
On Mon, Oct 28, 2013 at 4:17 PM, Gregory Haase haa...@onefreevoice.com wrote:

 I have a table that is triggering my nagios database bloat alert
 regularly. Usually, I have to give it the vacuum full more than once to get
 it under the threshold. Today I tried repeatedly and cannot get the alert
 to resolve.

 I had a discussion with one of the primary developers about how the table
 is utilized, and it turns out they are basically only ever inserting into
 it. This sort of flies in the face of conventional wisdom about bloat being
 caused by frequent updates and deletes.


As I understand it, vacuuming only removes the dead tuples left behind by
delete and update operations.

The question is how the data is being inserted and whether anything ever
updates the rows in any way.  Maybe this is a manual process?  But for 5x
bloat, it would have to be a repeated process.

Maybe there was bad data that had to be corrected?
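
One quick check, if the stats collector is on, is the per-table counters (a
sketch; substitute the real table name):

SELECT n_tup_ins, n_tup_upd, n_tup_del, n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 'table_name';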


 We were looking at it, and one of the things that struck me is that this
 table has a column with a json datatype. I looked through
 information_schema.columns and there is only one other table with a json
 datatype, and I recall having bloat issues with this table in the past as
 well.

 I'm wondering if the json datatype is just naturally more bloated than
 other types, or if the query in the check_postgres.pl nagios script is
 not accurate, or if maybe my thresholds are simply too low?


Next time this happens it would be worth seeing what VACUUM FULL VERBOSE
output is for that table.

-- 
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor
lock-in.
http://www.efficito.com/learn_more.shtml


Re: [GENERAL] INSERT/UPDATE statements sometimes choosing not to use PK index?

2013-10-29 Thread Tom Lane
Matt bsg...@gmail.com writes:
 In most cases, EXPLAIN and runtime tell me the index is utilized. However,
 sometimes on back-to-back runs (5-minute intervals) the runtime explodes and
 EXPLAIN shows me that the PK index is not used, as both the staging table
 and destination table are sequentially scanned.

You haven't explained what you do with the staging table, but maybe you
need to ANALYZE it after you've loaded fresh data into it.  The stats for
the main table are presumably somewhat stable, but the stats for the
staging table maybe not.  The apparent correlation to consecutive runs
lends some support to this theory, as that would reduce the time window
for auto-ANALYZE to fix it for you.

regards, tom lane




[GENERAL] Increasing CPU usage of PostgreSQL

2013-10-29 Thread Rahila Syed
Rahila, if you want to saturate the CPU and don't care about the
 particular benchmark, try to use read-only transactions. Either just add
 -S at the pgbench command line, or write something SELECT-only on your
 own. Anyway, use '-j' in such cases.


Thanks a lot for your suggestion. Using select-only queries helped me
lower the % CPU time spent on I/O. Also, increasing the number of threads
helped load the CPU to around 87-90%.
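
For reference, the kind of invocation being discussed looks roughly like this
(client/thread counts, duration and database name are arbitrary placeholders):

pgbench -S -c 64 -j 16 -T 300 test_db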

-Rahila


On Sat, Oct 26, 2013 at 7:53 PM, Tomas Vondra t...@fuzzy.cz wrote:
 On 25.10.2013 19:04, Scott Marlowe wrote:
 On Fri, Oct 25, 2013 at 8:29 AM, Rahila Syed rahilasye...@gmail.com wrote:

 Configurations of my machine is:

 Processors:   Xeon E5-2650 Processor Kit
   Intel® Xeon ® Processor E5-2650 (2 GHz, 8C/16T,
 20 MB) * 2 nos


 RAM :  32GB DDR3-1600 REG Memory Kit
 8x 4GB Registered ECC DIMM, DDR3L-1600(PC3L-12800)

 HDD:  450GB 10K Hot Plug 2.5-inch SAS HDD * 8 nos
  1 x 450 GB SAS HDD, 2.5-inch, 6Gb/s

 Disk Speed  : 10,000 RPM

 RAID Controller (512MB, RAID 0/1)

 My guess is that you're maxing out your IO subsystem long before
 you're maxing out CPU. What does

 iostat -xd 10

 have to say about it?

 Right, that's my guess too. The problem is most likely the sync at the
 end of the transaction.

 Rahila, if you want to saturate the CPU and don't care about the
 particular benchmark, try to use read-only transactions. Either just add
 -S at the pgbench command line, or write something SELECT-only on your
 own. Anyway, use '-j' in such cases.

 Tomas






[GENERAL] Increasing CPU usage of PostgreSQL

2013-10-29 Thread Rahila Syed
 iostat -xd 10

 have to say about it?

Yes, I was indeed maxing out my I/O subsystem: the % CPU utilization for I/O
is around 90 percent. I could not notice this earlier because I was firing
the CPU-load queries from a remote machine, so network overhead was coming
into the picture. Thanks a lot for your suggestion.

-Rahila

On Fri, Oct 25, 2013 at 10:34 PM, Scott Marlowe scott.marl...@gmail.com wrote:
 On Fri, Oct 25, 2013 at 8:29 AM, Rahila Syed rahilasye...@gmail.com wrote:
 Hello All,

 I am using jdbcrunner-1.2 to run PostgreSQL performance tests. For
 certain tests, I need to increase the CPU usage of the servers,
 especially at the user level.
 I tried using both tpcc and tpcb loads with a scale factor of 100. Even
 after setting the number of client connections as high as 420, I am
 unable to achieve high CPU usage. It is hardly 3 percent. I think this
 is because most of the client connections are idle.
 The tpcc and tpcb transactions seem to consume very little CPU.

 In the postgresql.conf file, I have lowered checkpoint_segments to 1 in
 order to overwhelm the server with checkpoints, but there is no significant
 increase in the iowait of the CPU.

 Can somebody suggest a better way to load the PostgreSQL server to
 increase CPU usage?

 Configurations of my machine is:

 Processors:   Xeon E5-2650 Processor Kit
   Intel® Xeon ® Processor E5-2650 (2 GHz, 8C/16T,
 20 MB) * 2 nos


 RAM :  32GB DDR3-1600 REG Memory Kit
 8x 4GB Registered ECC DIMM, DDR3L-1600(PC3L-12800)

 HDD:  450GB 10K Hot Plug 2.5-inch SAS HDD * 8 nos
  1 x 450 GB SAS HDD, 2.5-inch, 6Gb/s

 Disk Speed  : 10,000 RPM

 RAID Controller (512MB, RAID 0/1)

 My guess is that you're maxing out your IO subsystem long before
 you're maxing out CPU. What does

 iostat -xd 10

 have to say about it?

 To understand recursion, one must first understand recursion.




Re: [GENERAL] json datatype and table bloat?

2013-10-29 Thread Merlin Moncure
On Tue, Oct 29, 2013 at 5:38 AM, Chris Travers chris.trav...@gmail.com wrote:



 On Mon, Oct 28, 2013 at 4:17 PM, Gregory Haase haa...@onefreevoice.com
 wrote:

 I have a table that is triggering my nagios database bloat alert
 regularly. Usually, I have to give it the vacuum full more than once to get
 it under the threshold. Today I tried repeatedly and cannot get the alert to
 resolve.

 I had a discussion with one of the primary developers about how the table
 is utilized, and it turns out they are basically only ever inserting into
 it. This sort of flies in the face of conventional wisdom about bloat being
 caused by frequent updates and deletes.


 As I understand it, vacuuming only removes the tuples removed by delete and
 update operations.

well, or by rolled back transactions.  we have to wonder if the OP has
a lot of queries trying to insert and failing.  maybe check the log?
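
or check the rollback counter; a quick sketch:

select datname, xact_commit, xact_rollback
from pg_stat_database
where datname = current_database();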

merlin




Re: [GENERAL] INSERT/UPDATE statements sometimes choosing not to use PK index?

2013-10-29 Thread Matt
Since this is an upsert, the staging table simply accepts a copy of
pipe-delimited data via a COPY FROM STDIN, and a subsequent SQL script
performs the UPDATE/INSERT. The staging table is then truncated (or deleted
without a WHERE clause) for the next run: truncate staging, COPY into staging,
update then insert destination from staging, repeat on the next 5-minute clock
cycle.
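
In SQL terms the cycle is roughly the following (a sketch; the column lists
are elided as in my earlier pseudo-queries, and the ANALYZE is where Tom's
suggestion would slot in):

TRUNCATE staging;
COPY staging FROM STDIN WITH (DELIMITER '|');  -- pipe-delimited batch
-- ANALYZE staging;                            -- suggested addition
UPDATE destination SET ... FROM staging WHERE staging.pk = destination.pk;
INSERT INTO destination SELECT ... FROM staging
WHERE NOT EXISTS (SELECT 1 FROM destination WHERE destination.pk = staging.pk);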

Since the staging table is fully loaded into the destination table, a full
scan of staging was not a concern; the lack of index usage on the destination
was. For example, this is normally supported by the PK index and performs well:

INSERT INTO destination (…)
SELECT (…) FROM staging
WHERE NOT EXISTS (SELECT * FROM destination WHERE destination.id = staging.id)

The above runs in the expected time when the explain plan shows an index scan
on the destination primary key index and a seq scan on the staging table.
This will continue for many runs, until something causes the engine to stop
using the destination PK index and begin sequentially scanning both the
destination and staging tables. What triggers this is the interesting
question. The only system event I can (loosely) correlate with the problem is
the start of a nightly pg_dump, but as a read-only process, this would not be
changing any index stats, correct?

An ANALYZE on staging is possible after each load cycle, but time does not
permit on the destination table. I have been focusing on the destination
because it has the index that is not being used. Will stats on the staging
table affect index selection on the destination in a case like this?

In the process of attempting to diagnose this, both tables involved have
been vacuumed (FULL) and analyzed. I have also moved staging to an SSD volume,
and created an equivalent index on staging - which is not used in any plan,
nor do I expect it to be, as there are no filter criteria on staging, and the
index maintenance on staging would seem to be unneeded overhead. But in this
case, is there any advantage to an index on staging?

For reference, it is possible (not enough test cycles to verify) that the
left anti-join form makes this situation worse, even though the explain plans
appear identical:

INSERT INTO destination (…)
SELECT (…) FROM staging
LEFT JOIN destination ON destination.id = staging.id
WHERE destination.id IS NULL


On 29 Oct 2013, at 9:45, Tom Lane wrote:

 Matt bsg...@gmail.com writes:
 In most cases, EXPLAIN and runtime tell me the index is utilized. However,
 sometimes on back-to-back runs (5-minute intervals) the runtime explodes and
 EXPLAIN shows me that the PK index is not used, as both the staging table
 and destination table are sequentially scanned.

 You haven't explained what you do with the staging table, but maybe you
 need to ANALYZE it after you've loaded fresh data into it.  The stats for
 the main table are presumably somewhat stable, but the stats for the
 staging table maybe not.  The apparent correlation to consecutive runs
 lends some support to this theory, as that would reduce the time window
 for auto-ANALYZE to fix it for you.

 regards, tom lane


Re: [GENERAL] Replication and fsync

2013-10-29 Thread Kevin Grittner
Tomas Vondra t...@fuzzy.cz wrote:

 I think it's safe as long as you don't try to reuse the cluster
 after a crash (be it due to OS error, power outage, ...). If the
 primary crashes for any reasons, you have to start from scratch,
 otherwise there might be silent corruption as you've described.

I agree.  It seems to me that as long as you never try to start the
old master after a crash of the OS (directly or because of hardware
or VM failure), the standby should be usable without risk of
corruption.  As soon as you start the old master after such a crash
though, you could be replicating corrupted blocks; you would need
to be very hard-line about never bringing the old master back up.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [GENERAL] Cursor Example Needed

2013-10-29 Thread Perry Smith

On Oct 28, 2013, at 6:50 PM, Adrian Klaver adrian.kla...@gmail.com wrote:

 On 10/28/2013 04:36 PM, Perry Smith wrote:
 
 On Oct 28, 2013, at 6:13 PM, John R Pierce pie...@hogranch.com wrote:
 
 On 10/28/2013 3:58 PM, Adrian Klaver wrote:
 The docs do a good job of illustrating:
 
 http://www.postgresql.org/docs/9.3/interactive/plpgsql-cursors.html
 
 that's for cursors created within a plpgsql function.
 
 I think what the OP wants is a top level cursor, which is a different 
 thing...
 
 see
 http://www.postgresql.org/docs/current/static/sql-declare.html
 http://www.postgresql.org/docs/current/static/sql-fetch.html
 http://www.postgresql.org/docs/current/static/sql-close.html
 
 the fetch page shows an example of the complete usage in the context of a 
 database transaction.
 
 Thank you to Merlin.  I now understand better where my confusion was.
 
 John:
 
 Those examples are great except there is no way that I know of to loop on
 the top level as you call it.  I'm trying to do something that I can give 
 to
 psql which will loop through the entire set that is produced.
 
 The FETCH example shows you how. You do not have FOR but you do have FORWARD 
 and if you DECLARE SCROLL, BACKWARD. If you need to do actions on each row as 
 it is fetched then you will probably need to do it in a function. Your 
 original post though was concerned with dealing with an out of memory error 
 caused by returning too large a result set at one time, and that can be handled 
 in psql as illustrated.

Yes.  I finally understand your (or someone's) original reply.

Thank you to all who helped me out.

Perry
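
P.S. For the archives, the top-level pattern described above boils down to
something like this (a sketch; table and cursor names are placeholders):

BEGIN;
DECLARE big_cur CURSOR FOR SELECT * FROM big_table;
FETCH FORWARD 1000 FROM big_cur;   -- repeat until FETCH returns no rows
CLOSE big_cur;
COMMIT;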






Re: [GENERAL] json datatype and table bloat?

2013-10-29 Thread Gregory Haase
Following up.

I don't see any rolled back transactions in the logs.

The part that really troubles me is that vacuum full doesn't actually fix
the problem. If there were bad data that had been corrected via mass
updates, I'd expect the bloat issue to be fixed by a vacuum full.

When I run the vacuum back to back, this is what I get:

db_name=# VACUUM FULL VERBOSE table_schema.table_name;
INFO:  vacuuming table_schema.table_name
INFO:  table_name: found 2 removable, 29663 nonremovable row versions in
1754 pages
DETAIL:  0 dead row versions cannot be removed yet.
CPU 0.07s/0.10u sec elapsed 0.30 sec.
VACUUM
db_name=# VACUUM FULL VERBOSE table_schema.table_name;
INFO:  vacuuming table_schema.table_name
INFO:  table_name: found 0 removable, 29663 nonremovable row versions in
1754 pages
DETAIL:  0 dead row versions cannot be removed yet.
CPU 0.09s/0.09u sec elapsed 0.32 sec.
VACUUM

I think the question to address may be: Why does the check_postgres query
think there should only be 334 pages instead of 1754?

The tbloat and wastedbytes calculations provided in the query from this
page: http://wiki.postgresql.org/wiki/Show_database_bloat do seem to
correlate with the wasted bytes reported by nagios, though.

Greg Haase




On Tue, Oct 29, 2013 at 7:06 AM, Merlin Moncure mmonc...@gmail.com wrote:

 On Tue, Oct 29, 2013 at 5:38 AM, Chris Travers chris.trav...@gmail.com
 wrote:
 
 
 
  On Mon, Oct 28, 2013 at 4:17 PM, Gregory Haase haa...@onefreevoice.com
  wrote:
 
  I have a table that is triggering my nagios database bloat alert
  regularly. Usually, I have to give it the vacuum full more than once to
 get
  it under the threshold. Today I tried repeatedly and cannot get the
 alert to
  resolve.
 
  I had a discussion with one of the primary developers about how the
 table
  is utilized, and it turns out they are basically only ever inserting
 into
  it. This sort of flies in the face of conventional wisdom about bloat
 being
  caused by frequent updates and deletes.
 
 
  As I understand it, vacuuming only removes the tuples removed by delete
 and
  update operations.

 well, or by rolled back transactions.  we have to wonder if the OP has
 a lot of queries trying to insert and failing.  maybe check the log?

 merlin



Re: [GENERAL] json datatype and table bloat?

2013-10-29 Thread John R Pierce

On 10/29/2013 12:41 PM, Gregory Haase wrote:

db_name=# VACUUM FULL VERBOSE table_schema.table_name;
INFO:  vacuuming table_schema.table_name
INFO:  table_name: found 2 removable, 29663 nonremovable row 
versions in 1754 pages

DETAIL:  0 dead row versions cannot be removed yet.
CPU 0.07s/0.10u sec elapsed 0.30 sec.


is there an old transaction pending?   that 'masks' vacuum from touching 
any tuples newer than the start of that transaction.
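
something like this will show any long-open transactions (a sketch; it assumes
9.2 or later, where the columns are pid and state rather than procpid and
current_query):

select pid, state, xact_start, now() - xact_start as xact_age
from pg_stat_activity
where xact_start is not null
order by xact_start;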




--
john r pierce  37N 122W
somewhere on the middle of the left coast



Re: [GENERAL] json datatype and table bloat?

2013-10-29 Thread Gregory Haase
So, between yesterday and today we actually failed over to our hot-standby
instance and the issue hasn't changed. I don't think you can have a pending
transaction across streaming replication.


On Tue, Oct 29, 2013 at 12:49 PM, John R Pierce pie...@hogranch.com wrote:

  On 10/29/2013 12:41 PM, Gregory Haase wrote:

 db_name=# VACUUM FULL VERBOSE table_schema.table_name;
 INFO:  vacuuming table_schema.table_name
 INFO:  table_name: found 2 removable, 29663 nonremovable row versions
 in 1754 pages
 DETAIL:  0 dead row versions cannot be removed yet.
 CPU 0.07s/0.10u sec elapsed 0.30 sec.


 is there an old transaction pending?   that 'masks' vacuum from touching
 any tuples newer than the start of that transaction.



 --
 john r pierce  37N 122W
 somewhere on the middle of the left coast




Re: [GENERAL] json datatype and table bloat?

2013-10-29 Thread Gregory Haase
One more thing I just tried:

create table table_schema.table_name_new (like table_schema.table_name);
insert into table_schema.table_name_new select * from table_schema.table_name;

The new table shows the same amount of wasted bytes and pages as the old.

So based on that, I think I'm going to throw out any notion of updates or
deletes as the cause of bloat on this particular table.

-G


On Tue, Oct 29, 2013 at 12:53 PM, Gregory Haase haa...@onefreevoice.com wrote:

 So, between yesterday and today we actually failed over to our hot-standby
 instance and the issue hasn't changed. I don't think you can have a pending
 transaction across streaming replication.


 On Tue, Oct 29, 2013 at 12:49 PM, John R Pierce pie...@hogranch.com wrote:

  On 10/29/2013 12:41 PM, Gregory Haase wrote:

 db_name=# VACUUM FULL VERBOSE table_schema.table_name;
 INFO:  vacuuming table_schema.table_name
 INFO:  table_name: found 2 removable, 29663 nonremovable row versions
 in 1754 pages
 DETAIL:  0 dead row versions cannot be removed yet.
 CPU 0.07s/0.10u sec elapsed 0.30 sec.


 is there an old transaction pending?   that 'masks' vacuum from touching
 any tuples newer than the start of that transaction.



 --
 john r pierce  37N 122W
 somewhere on the middle of the left coast





Re: [GENERAL] json datatype and table bloat?

2013-10-29 Thread Tom Lane
John R Pierce pie...@hogranch.com writes:
 On 10/29/2013 12:41 PM, Gregory Haase wrote:
 db_name=# VACUUM FULL VERBOSE table_schema.table_name;
 INFO:  vacuuming table_schema.table_name
 INFO:  table_name: found 2 removable, 29663 nonremovable row 
 versions in 1754 pages
 DETAIL:  0 dead row versions cannot be removed yet.
 CPU 0.07s/0.10u sec elapsed 0.30 sec.

 is there an old transaction pending?   that 'masks' vacuum from touching 
 any tuples newer than the start of that transaction.

If old transactions were the problem, vacuum would be reporting that
some-large-number of dead row versions couldn't be removed yet.

There doesn't seem to be anything obviously wrong here.

regards, tom lane




Re: [GENERAL] json datatype and table bloat?

2013-10-29 Thread Gregory Haase
I've isolated the problem to the json field not showing up in pg_stats,
which affects the calculation of the avg row size in the bloat query.

I'm not sure if this is a json issue or some other kind of issue.

db_name=# select c.column_name, c.data_type from information_schema.columns
c where table_name = 'table_name' and not exists (select 1 from pg_stats s
where c.table_name = s.tablename and c.column_name = s.attname);
 column_name | data_type
-------------+-----------
 criteria    | json
(1 row)

-G


On Tue, Oct 29, 2013 at 1:51 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 John R Pierce pie...@hogranch.com writes:
  On 10/29/2013 12:41 PM, Gregory Haase wrote:
  db_name=# VACUUM FULL VERBOSE table_schema.table_name;
  INFO:  vacuuming table_schema.table_name
  INFO:  table_name: found 2 removable, 29663 nonremovable row
  versions in 1754 pages
  DETAIL:  0 dead row versions cannot be removed yet.
  CPU 0.07s/0.10u sec elapsed 0.30 sec.

  is there an old transaction pending?   that 'masks' vacuum from touching
  any tuples newer than the start of that transaction.

 If old transactions were the problem, vacuum would be reporting that
 some-large-number of dead row versions couldn't be removed yet.

 There doesn't seem to be anything obviously wrong here.

 regards, tom lane





Re: [GENERAL] json datatype and table bloat?

2013-10-29 Thread Tom Lane
Gregory Haase haa...@onefreevoice.com writes:
 I've isolated the problem to the json field not showing up in pg_stats,
 which affects the calculation of the avg row size in the bloat query.

 I'm not sure if this is a json issue or some other kind of issue.

Possibly your bloat query is failing to consider the toast table
associated with this table?  If the json values are large they'd
mostly be in the toast table not the main table.

(It's unfortunate that VACUUM FULL doesn't tell you about what's
in the toast table.  I'd try just VACUUM VERBOSE here, without the
FULL, to get more info.)
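
To see how much of the table actually lives in the toast relation, something
like this works (substitute the real table name):

-- pg_table_size includes the toast table (and the free space / visibility maps)
SELECT pg_size_pretty(pg_relation_size('table_schema.table_name')) AS heap_only,
       pg_size_pretty(pg_table_size('table_schema.table_name'))    AS heap_plus_toast;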

regards, tom lane




Re: [GENERAL] json datatype and table bloat?

2013-10-29 Thread Gregory Haase
Tom is correct: VACUUM VERBOSE shows that there is an associated toast
table. Neither the check_postgres.pl script nor the query on
http://wiki.postgresql.org/wiki/Show_database_bloat appears to take this
into consideration. Both rely on null_frac and avg_width from pg_stats to
estimate how big the table should be. I'm not sure how you would factor the
toast table into that estimate.

-G


On Tue, Oct 29, 2013 at 2:05 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Gregory Haase haa...@onefreevoice.com writes:
  I've isolated the problem to the json field not showing up in pg_stats,
  which affects the calculation of the avg row size in the bloat query.

  I'm not sure if this is a json issue or some other kind of issue.

 Possibly your bloat query is failing to consider the toast table
 associated with this table?  If the json values are large they'd
 mostly be in the toast table not the main table.

 (It's unfortunate that VACUUM FULL doesn't tell you about what's
 in the toast table.  I'd try just VACUUM VERBOSE here, without the
 FULL, to get more info.)

 regards, tom lane