[PERFORM] WAL partition filling up after high WAL activity

2011-11-07 Thread Richard Yen
Hi Everyone,

I recently saw a crash on one of our databases, and I was wondering if this
might be an indication that WAL is unexpectedly creating more files than it
needs to?

Nov  5 16:18:27 localhost postgres[25092]: [111-1] 2011-11-05 16:18:27.524
PDT [user=slony,db=uk dbhub.sac.iparadigms.com(35180)
PID:25092 XID:2142895751]PANIC:  could not write to file
pg_xlog/xlogtemp.25092: No space left on device
Nov  5 16:18:27 localhost postgres[25092]: [111-2] 2011-11-05 16:18:27.524
PDT [user=slony,db=uk dbhub.sac.iparadigms.com(35180)
PID:25092 XID:2142895751]STATEMENT:  select _sac_uk.forwardConfirm(2, 4, '
5003717188', '2011-11-05 16:18:26.977112');
Nov  5 16:18:27 localhost postgres[32121]: [7-1] 2011-11-05 16:18:27.531
PDT [user=,db=  PID:32121 XID:0]LOG:  server process (PID 25092) was
terminated by signal 6: Aborted
Nov  5 16:18:27 localhost postgres[32121]: [8-1] 2011-11-05 16:18:27.531
PDT [user=,db=  PID:32121 XID:0]LOG:  terminating any other active server
processes

If you look at this graph (http://cl.ly/2y0W27330t3o2J281H3K), the
partition actually fills up, and the logs show that postgres crashed.
 After postgres crashed, it automatically restarted, cleared out its WAL
files, and began processing things again at 4:30PM.

From the graph, it looks like a vacuum on table m_dg_read finished at
4:08PM, which might explain why the downward slope levels off for a few
minutes:

 Nov  5 16:08:03 localhost postgres[18741]: [20-1] 2011-11-05 16:08:03.400
 PDT [user=,db=  PID:18741 XID:0]LOG:  automatic vacuum of table
 uk.public.m_dg_read: index scans: 1
 Nov  5 16:08:03 localhost postgres[18741]: [20-2]   pages: 0 removed,
 65356 remain
 Nov  5 16:08:03 localhost postgres[18741]: [20-3]   tuples: 31770
 removed, 1394263 remain
 Nov  5 16:08:03 localhost postgres[18741]: [20-4]   system usage: CPU
 2.08s/5.35u sec elapsed 619.39 sec


Looks like right afterwards, autovacuum got started on table m_object, which
finished at 4:18PM:

 Nov  5 16:18:19 localhost postgres[18686]: [9-1] 2011-11-05 16:18:19.448
 PDT [user=,db=  PID:18686 XID:0]LOG:  automatic vacuum of table
 uk.public.m_object: index scans: 1
 Nov  5 16:18:19 localhost postgres[18686]: [9-2]pages: 0 removed,
 152862 remain
 Nov  5 16:18:19 localhost postgres[18686]: [9-3]tuples: 17084
 removed, 12455761 remain
 Nov  5 16:18:19 localhost postgres[18686]: [9-4]system usage: CPU
 4.55s/15.09u sec elapsed 1319.98 sec


It could very well be the case that upon the finish of m_object's vacuum,
another vacuum was beginning, and postgres eventually crashed because there
was no room left for that vacuum to finish.

We encountered a situation like this last summer on 7/4/2010 for a
different database cluster -- a big vacuum-for-wraparound on a 15GB table
filled the pg_xlog partition--this is how we started monitoring the pg_xlog
file size and wraparound countdown.  Seems like there was some sort of
vacuum-for-wraparound process happening during the time of this crash, as
we also track the XID to see when we should expect a VACUUM FREEZE (
http://cl.ly/3s1S373I0l0v3E171Z0V).
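
The tracking itself boils down to a catalog query along these lines (a generic
example, not necessarily our exact monitoring query):

    SELECT datname, age(datfrozenxid) AS xid_age
      FROM pg_database
     ORDER BY age(datfrozenxid) DESC;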

Some configs:
checkpoint_segments=16
wal_buffers=8MB
#archive_mode=off
checkpoint_completion_target=0.9

Postgres version is 8.4.5

Note also that the pg_xlog partition is 9.7GB.  No other apps run on the
machine besides pgbouncer, so it's highly unlikely that files are written
to this partition by another process.  Also, our five largest tables are
the following:
gm3_load_times  | 2231 MB
m_object_paper  | 1692 MB
m_object  | 1192 MB
m_report_stats  | 911 MB
gm3_mark  | 891 MB

My biggest question is: we know from the docs that there should be no more
than (2 + checkpoint_completion_target) * checkpoint_segments + 1 files.
 For us, that would mean no more than 48 files, which equates to 768MB--far
lower than the 9.7GB partition size.  **Why would WAL use up so much disk
space?**
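
As a quick sanity check of that arithmetic (nothing from the docs, just the
settings quoted above, assuming the default 16MB segment size):

    -- (2 + checkpoint_completion_target) * checkpoint_segments + 1 segments, 16MB each
    SELECT (2 + 0.9) * 16 + 1            AS max_wal_segments,  -- ~47.4, i.e. 48 files
           ceil((2 + 0.9) * 16 + 1) * 16 AS max_wal_mb;        -- 48 * 16MB = 768MB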

Thanks for reading, and thanks in advance for any help you may provide.
--Richard


Re: [Slony1-general] [PERFORM] WAL partition overloaded--by autovacuum?

2010-07-08 Thread Richard Yen

On Jul 8, 2010, at 12:04 PM, Kevin Grittner wrote:

 Robert Haas robertmh...@gmail.com wrote:
 
 I don't understand how you managed to fill up 37GB of disk with
 WAL files.  Every time you fill up checkpoint_segments * 16MB of
 WAL files, you ought to get a checkpoint.  When it's complete, WAL
 segments completely written before the start of the checkpoint
 should be recyclable.  Unless I'm confused, which apparently I am.
 
 You're not alone.  At first I was assuming that it was because of
 archiving, but the OP says that's turned off.  Unless it had been on
 and there wasn't a *restart* after changing the configuration, I
 can't see how this could happen, and was hoping someone else could
 cast some light on the issue.

I'm fairly confused myself.  I'm beginning to think that because data/base and 
data/pg_xlog were on the same partition (/db), when the /db partition filled up, 
the WAL files couldn't get flushed to data/base, thereby preventing 
data/pg_xlog from being emptied out, as per the documentation.

My concern is that--as in the original post--there were moments where 129 WAL 
files were generated in one minute.  Is it plausible that this autovacuum could 
be responsible for this?
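
For scale, that burst works out to roughly the following (simple arithmetic,
assuming the default 16MB per segment):

    SELECT 129 * 16 AS wal_mb_per_minute;  -- a bit over 2GB of WAL in one minute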

--Richard


Re: [Slony1-general] [PERFORM] WAL partition overloaded--by autovacuum?

2010-07-08 Thread Richard Yen

On Jul 8, 2010, at 12:27 PM, Tom Lane wrote:
 
 (Hmm ... but those complaints are logged at level WARNING, which as
 discussed elsewhere is really lower than LOG.  Should we change them?)

Hmm, I did a grep on WARNING in my log, and the only thing that turns up are 
the "WARNING:  terminating connection because of crash of another server 
process" entries from when postgres died and shut down after the disks filled up.

Would this be conclusive evidence that archive_mode=off?

--Richard


Re: [Slony1-general] [PERFORM] WAL partition overloaded--by autovacuum?

2010-07-08 Thread Richard Yen

On Jul 8, 2010, at 12:50 PM, Tom Lane wrote:

 Richard Yen rich...@iparadigms.com writes:
 My concern is that--as in the original post--there were moments where 129 
 WAL files were generated in one minute.  Is it plausible that this 
 autovacuum could be responsible for this?
 
 That's not a particularly surprising WAL creation rate for a busy
 database.  I wouldn't expect autovacuum to cause it by itself, but
 that's true only because autovacuum processing is typically throttled
 by autovacuum_vacuum_cost_delay.  Perhaps you had that set to zero?
 

Ah, yes, autovacuum_vacuum_cost_delay = 0 in my config.  That explains 
it--guess I'm playing with knives if I set things that way...
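
If I want throttling back without slowing autovacuum down everywhere, I could
presumably restore the global 20ms default, or (since this is 8.4) throttle
just the biggest tables with per-table storage parameters -- roughly this
sketch, where the table name is only an example:

    -- hypothetical: re-enable cost-based throttling only on the largest tables
    ALTER TABLE m_object_paper SET (autovacuum_vacuum_cost_delay = 20);
    ALTER TABLE m_object_paper SET (autovacuum_vacuum_cost_limit = 200);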

--Richard




Re: [Slony1-general] [PERFORM] WAL partition overloaded--by autovacuum?

2010-07-08 Thread Richard Yen

On Jul 8, 2010, at 12:50 PM, Kevin Grittner wrote:

 Richard Yen rich...@iparadigms.com wrote:
 
 there were moments where 129 WAL files were generated in one
 minute.  Is it plausible that this autovacuum could be responsible
 for this?
 
 I don't remember seeing your autovacuum and vacuum config settings,
 or an answer to my question about whether there was a bulk load of a
 significant portion of current data.  With aggressive autovacuum
 settings, hitting the freeze threshold for bulk-loaded data could do
 that.
 

Yeah, autovacuum is pretty aggressive, as I recall:
autovacuum = on
log_autovacuum_min_duration = 0
autovacuum_max_workers = 8
autovacuum_naptime = 1min
autovacuum_vacuum_threshold = 400
autovacuum_analyze_threshold = 200
autovacuum_vacuum_scale_factor = 0.2
autovacuum_analyze_scale_factor = 0.1
autovacuum_freeze_max_age = 2
autovacuum_vacuum_cost_delay = 0
autovacuum_vacuum_cost_limit = -1

vacuum_cost_delay = 0
vacuum_cost_limit = 200

When you say bulk-loaded, I suppose that also includes loading the data via 
slony ADD NODE as well--correct?  I created this node maybe 6 months ago via 
slony ADD NODE.

Thanks much for your time,
--Richard


Re: [PERFORM] [Slony1-general] WAL partition overloaded--by autovacuum?

2010-07-07 Thread Richard Yen
Sorry, I forgot to mention that archive_mode is off and commented out, and 
archive_command is '' and commented out.

Thanks for following up!
-- Richard

On Jul 7, 2010, at 1:58, Mark Kirkwood mark.kirkw...@catalyst.net.nz wrote:

 On 07/07/10 13:10, Richard Yen wrote:
 
 This leads me to believe that there was a sudden flurry of write activity 
 that occurred, and the process that would flush WAL files to /db/data/ 
 couldn't keep up, thereby filling up the disk.  I'm wondering if anyone else 
 out there might be able to give me some insight or comments to my 
 assessment--is it accurate?  Any input would be helpful, and I'll try to 
 make necessary architectural changes to keep this from happening again.
 
 
 Do you have wal archiving enabled? (if so let's see your archive_command).
 
 Cheers
 
 Mark
 



Re: [Slony1-general] [PERFORM] WAL partition overloaded--by autovacuum?

2010-07-07 Thread Richard Yen
On Jul 6, 2010, at 8:25 PM, Scott Marlowe wrote:

 Tell us what you can about your hardware setup.

Sorry, I made the bad assumption that the hardware setup would be 
irrelevant--dunno why I thought that.

My hardware setup is 2 FusionIO 160GB drives in a RAID-1 configuration, running 
on an HP DL360 G5.

I think I figured out the problem:

-- I figured that pg_xlog and data/base could both be on the FusionIO drive, 
since there would be no latency when there are no spindles.
-- However, I didn't take into account the fact that pg_xlog might grow in size 
when autovacuum does its work when vacuuming to prevent XID wraparound.  I 
*just* discovered this when one of my other replication nodes decided to die on 
me and fill up its disk.
-- Unfortunately, my db is 114GB (including indexes) or 60GB (without indexes), 
leaving ~37GB for pg_xlog (since they are sharing a partition).  So I'm 
guessing what happened was that when autovacuum ran to prevent XID wraparound, 
it froze the old XIDs on each table's tuples, and all of those changes were 
recorded in WAL, causing WAL to bloat.  Is this the correct understanding?
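
If so, the tables due for (or just finished with) a wraparound vacuum should
show a high relfrozenxid age -- a rough way to check, using only the system
catalogs:

    SELECT relname, age(relfrozenxid) AS xid_age
      FROM pg_class
     WHERE relkind = 'r'
     ORDER BY age(relfrozenxid) DESC
     LIMIT 10;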

Question for now is, documentation says:
 There will always be at least one WAL segment file, and will normally not be 
 more than (2 + checkpoint_completion_target) * checkpoint_segments + 1 files. 
 Each segment file is normally 16 MB (though this size can be altered when 
 building the server). You can use this to estimate space requirements for 
 WAL. Ordinarily, when old log segment files are no longer needed, they are 
 recycled (renamed to become the next segments in the numbered sequence). If, 
 due to a short-term peak of log output rate, there are more than 3 * 
 checkpoint_segments + 1 segment files, the unneeded segment files will be 
 deleted instead of recycled until the system gets back under this limit.

This means my pg_xlog partition should be (2 + checkpoint_completion_target) * 
checkpoint_segments + 1 = 41 files, or 656MB.  Then, if there are more than 49 
files, unneeded segment files will be deleted, but in this case all segment 
files are needed, so they never got deleted.  Perhaps we should add in the docs 
that pg_xlog should be the size of the DB or larger?

--Richard






[PERFORM] WAL partition overloaded--by autovacuum?

2010-07-06 Thread Richard Yen
Hi everyone,

I'm running 8.4.2 on a CentOS machine, and postgres recently died with signal 6 
because the pg_xlog partition filled up (33GB) on 7/4/10 10:34:23 (perfect 
timing, as I was hiking in the mountains in the remotest parts of our country). 
 I did some digging and found the following:

-- The current state of /db/data/pg_xlog (after postgres died on 7/4/10) 
indicates there are 2056 files, all created between 7/4/10 10:01 and 7/4/10 
10:34, and a quick bash-kung-fu check shows that these files were created 
anywhere from 3 per minute (10:22) to 106 per minute (10:12)
-- wal_buffers=8MB; commit_siblings=200; commit_delay=0; checkpoint_segments=16
-- All other WAL-related parameters are either defaults or commented out

-- syslog shows that on 7/2/10 from 10:16:11 to 10:21:51, messages similar to 
this occurred 29 times:
 Jul  2 10:16:11 db4.sac postgres[20526]: [56-1] 2010-07-02 10:16:11.117 PDT 
 [user=,db=  PID:20526 XID:0]LOG:  checkpoints are occurring too frequently (4 
 seconds apart)
 Jul  2 10:16:11 db4.sac postgres[20526]: [56-2] 2010-07-02 10:16:11.117 PDT 
 [user=,db=  PID:20526 XID:0]HINT:  Consider increasing the configuration 
 parameter checkpoint_segments.

-- On 7/4/10 from 09:09:02 to 09:10:08, the same type of messages occurred 21 
times in syslog
-- This message did not occur on 7/3/10 at all

-- Preceding the "checkpoints are occurring too frequently" syslog entries are 
autovacuum entries:
 Jul  2 10:16:04 db4.sac postgres[11357]: [7-1] 2010-07-02 10:16:04.576 PDT 
 [user=,db=  PID:11357 XID:0]LOG:  automatic vacuum of table 
 tii._sac_global.sl_event: index scans: 1
 Jul  2 10:16:04 db4.sac postgres[11357]: [7-2] pages: 0 removed, 25 remain
 Jul  2 10:16:04 db4.sac postgres[11357]: [7-3] tuples: 676 removed, 955 
 remain
 Jul  2 10:16:04 db4.sac postgres[11357]: [7-4] system usage: CPU 
 0.01s/0.00u sec elapsed 0.02 sec
 Jul  2 10:16:04 db4.sac postgres[11357]: [8-1] 2010-07-02 10:16:04.580 PDT 
 [user=,db=  PID:11357 XID:197431667]LOG:  automatic analyze of table 
 tii._sac_global.sl_event system usage: CPU 0.00s/0.00u sec elapsed 0.00 sec
 Jul  2 10:16:04 db4.sac postgres[11357]: [9-1] 2010-07-02 10:16:04.965 PDT 
 [user=,db=  PID:11357 XID:0]LOG:  automatic vacuum of table 
 tii._sac_global.sl_confirm: index scans: 1
 Jul  2 10:16:04 db4.sac postgres[11357]: [9-2] pages: 0 removed, 154 
 remain
 Jul  2 10:16:04 db4.sac postgres[11357]: [9-3] tuples: 1834 removed, 8385 
 remain
 Jul  2 10:16:04 db4.sac postgres[11357]: [9-4] system usage: CPU 
 0.32s/0.04u sec elapsed 0.37 sec

 Jul  4 09:08:56 db4.sac postgres[21798]: [13-1] 2010-07-04 09:08:56.107 PDT 
 [user=,db=  PID:21798 XID:0]LOG:  automatic vacuum of table 
 tii.public.city: index scans: 0
 Jul  4 09:08:56 db4.sac postgres[21798]: [13-2]pages: 0 removed, 151 
 remain
 Jul  4 09:08:56 db4.sac postgres[21798]: [13-3]tuples: 0 removed, 20223 
 remain
 Jul  4 09:08:56 db4.sac postgres[21798]: [13-4]system usage: CPU 
 0.00s/0.00u sec elapsed 0.01 sec
 Jul  4 09:08:56 db4.sac postgres[21798]: [14-1] 2010-07-04 09:08:56.118 PDT 
 [user=,db=  PID:21798 XID:0]LOG:  automatic vacuum of table 
 tii.public.country: index scans: 1
 Jul  4 09:08:56 db4.sac postgres[21798]: [14-2]pages: 0 removed, 2 remain
 Jul  4 09:08:56 db4.sac postgres[21798]: [14-3]tuples: 77 removed, 185 
 remain
 Jul  4 09:08:56 db4.sac postgres[21798]: [14-4]system usage: CPU 
 0.00s/0.00u sec elapsed 0.00 sec
 Jul  4 09:08:56 db4.sac postgres[21798]: [15-1] 2010-07-04 09:08:56.335 PDT 
 [user=,db=  PID:21798 XID:0]LOG:  automatic vacuum of table 
 tii.public.gm3_clipboard: index scans: 1
 Jul  4 09:08:56 db4.sac postgres[21798]: [15-2]pages: 0 removed, 897 
 remain
 Jul  4 09:08:56 db4.sac postgres[21798]: [15-3]tuples: 2 removed, 121594 
 remain
 Jul  4 09:08:56 db4.sac postgres[21798]: [15-4]system usage: CPU 
 0.07s/0.08u sec elapsed 0.20 sec
...snip...
 Jul  4 09:10:25 db4.sac postgres[22066]: [23-1] 2010-07-04 09:10:25.921 PDT 
 [user=,db=  PID:22066 XID:0]LOG:  automatic vacuum of table 
 tii.public.pm_assignment: index scans: 1
 Jul  4 09:10:25 db4.sac postgres[22066]: [23-2]pages: 0 removed, 995 
 remain
 Jul  4 09:10:25 db4.sac postgres[22066]: [23-3]tuples: 323 removed, 83964 
 remain
 Jul  4 09:10:25 db4.sac postgres[22066]: [23-4]system usage: CPU 
 0.01s/0.09u sec elapsed 0.52 sec
 Jul  4 09:10:25 db4.sac postgres[22073]: [7-1] 2010-07-04 09:10:25.978 PDT 
 [user=,db=  PID:22073 XID:0]LOG:  automatic vacuum of table 
 tii.public.pm_question_type: index scans: 0
 Jul  4 09:10:25 db4.sac postgres[22073]: [7-2] pages: 0 removed, 1 remain
 Jul  4 09:10:25 db4.sac postgres[22073]: [7-3] tuples: 0 removed, 2 remain
 Jul  4 09:10:25 db4.sac postgres[22073]: [7-4] system usage: CPU 
 0.00s/0.00u sec elapsed 0.00 sec
 Jul  4 09:10:26 db4.sac postgres[22066]: [24-1] 2010-07-04 09:10:26.301 PDT 
 [user=,db=  PID:22066 XID:0]LOG:  automatic vacuum of table 
 

[PERFORM] prepared query performs much worse than regular query

2010-05-21 Thread Richard Yen
Hi everyone,

I use DBD::Pg to interface with our 8.4.2 database, but for a particular query, 
performance is horrible.  I'm assuming that the behavior of $dbh->prepare is as 
if I did PREPARE foo AS (query), so I did an explain analyze on the command line:
 db_alpha=# prepare foo6 as (SELECT me.id, me.assignment, me.title, 
 me.x_firstname, me.x_lastname, me.owner, me.node, me.grade, me.folder, 
 me.word_count, me.char_length, me.char_count, me.page_count FROM submissions 
 me WHERE ( ( owner = $1 AND me.assignment = $2 ) ));
 PREPARE
 db_alpha=# explain analyze execute foo6('-1', '8996557');
                                                  QUERY PLAN
 ---------------------------------------------------------------------------------------------------------
  Bitmap Heap Scan on submissions me  (cost=38.84..42.85 rows=1 width=70) 
 (actual time=346567.665..346567.665 rows=0 loops=1)
Recheck Cond: ((assignment = $2) AND (owner = $1))
    ->  BitmapAnd  (cost=38.84..38.84 rows=1 width=0) (actual 
 time=346567.642..346567.642 rows=0 loops=1)
      ->  Bitmap Index Scan on submissions_assignment_idx  
 (cost=0.00..19.27 rows=177 width=0) (actual time=0.038..0.038 rows=2 loops=1)
Index Cond: (assignment = $2)
      ->  Bitmap Index Scan on submissions_owner_idx  (cost=0.00..19.32 
 rows=184 width=0) (actual time=346566.501..346566.501 rows=28977245 loops=1)
Index Cond: (owner = $1)
  Total runtime: 346567.757 ms
 (8 rows)


Now, if I run it without preparing it--just run it directly in the 
commandline--I get this plan:
 db_alpha=# explain analyze SELECT me.id, me.assignment, me.title, 
 me.x_firstname, me.x_lastname, me.owner, me.node, me.grade, me.folder, 
 me.word_count, me.char_length, me.char_count, me.page_count FROM submissions 
 me WHERE ( ( owner = -1 AND me.assignment = 8996557 ) )
 db_alpha-# ;
                                               QUERY PLAN
 ---------------------------------------------------------------------------------------------------------
  Index Scan using submissions_assignment_idx on submissions me  
 (cost=0.00..549.15 rows=36 width=70) (actual time=0.021..0.021 rows=0 loops=1)
Index Cond: (assignment = 8996557)
Filter: (owner = (-1))
  Total runtime: 0.042 ms
 (4 rows)

submissions has ~124 million rows, and owner -1 is a placeholder in my 
database, to fulfill a foreign key requirement.  I tried REINDEXing 
submissions_owner_idx and performing a VACUUM ANALYZE on the submissions table, 
but nothing seems to make a difference for this query.  One other thing to note 
is that if I use any other value for the owner column, it comes back really 
fast (< 0.04 ms).

Any ideas why the query planner chooses a different query plan when using 
prepared statements?
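
One workaround I'm considering (just a sketch, not tested): keep the placeholder
rows out of the owner index with a partial index, so that even a generic
prepared plan can't decide to bitmap-scan the millions of owner = -1 rows.
Another option is DBD::Pg's pg_server_prepare attribute, which can be set to 0
for this statement so the planner sees the real constants.

    -- hypothetical partial index: excludes the placeholder owner entirely
    CREATE INDEX submissions_owner_real_idx ON submissions (owner)
     WHERE owner <> -1;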

--Richard


Re: [PERFORM] PostgreSQL with Zabbix - problem of newbe

2010-04-08 Thread Richard Yen
Kind of off-topic, but I've found that putting the history table on a separate 
spindle (using a separate tablespace) also helps improve performance.
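
For anyone who hasn't done it, the move is just a tablespace plus an ALTER
TABLE, roughly like this sketch (path and names are made up); note that SET
TABLESPACE rewrites the table, so it holds an exclusive lock while it runs:

    CREATE TABLESPACE history_ts LOCATION '/mnt/disk2/pg_history';
    ALTER TABLE history SET TABLESPACE history_ts;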

--Richard



On Apr 8, 2010, at 12:44 PM, Robert Haas wrote:

 2010/4/8 Merlin Moncure mmonc...@gmail.com:
 previous to 8.2, to get good performance on zabbix you need to
 aggressively vacuum the heavily updated tables yourself.
 
 Generally if you DON'T vacuum aggressively enough, then vacuums will
 take a really long and painful amount of time, perhaps accounting for
 the hang the OP observed.  There's really no help for it but to
 sweat it out once, and then do it frequently enough afterward that it
 doesn't become a problem.
 
 ...Robert
 




Re: [PERFORM] why does swap not recover?

2010-03-28 Thread Richard Yen

On Mar 26, 2010, at 5:25 PM, Scott Carey wrote:
 Linux until recently does not account for shared memory properly in its swap 
 'aggressiveness' decisions.
 Setting shared_buffers larger than 35% is asking for trouble.
 
 You could try adjusting the 'swappiness' setting on the fly and seeing how it 
 reacts, but one consequence of that is trading off disk swapping for kswapd 
 using up tons of CPU causing other trouble.
Thanks for the tip.  I believe we've tried tuning the 'swappiness' setting on 
the fly, but it had no effect.  We're hypothesizing that perhaps 'swappiness' 
only comes into effect at the beginning of a process, so we would have to 
restart the daemon to actually make it go into effect--would you know about 
this?

 Either use one of the last few kernel versions (I forget which addressed the 
 memory accounting issues, and haven't tried it myself), or turn 
 shared_buffers down.  I recommend trying 10GB or so to start.

We're currently using CentOS 2.6.18-164.6.1.el5 with all the default settings.  
If this is after the one that dealt with memory accounting issues, I agree that 
I'll likely have to lower my shared_buffers.

My sysctl.conf shows the following:
 kernel.msgmnb = 65536
 kernel.msgmax = 65536
 kernel.shmmax = 68719476736
 kernel.shmall = 4294967296

BTW, I forgot to mention that I'm using FusionIO drives for my data storage, 
but I'm pretty sure this is not relevant to the issue I'm having.

Thanks for the help!
--Richard


[PERFORM] why does swap not recover?

2010-03-26 Thread Richard Yen
Hi everyone,

We've recently encountered some swapping issues on our CentOS 64GB Nehalem 
machine, running postgres 8.4.2.  Unfortunately, I was foolish enough to set 
shared_buffers to 40GB.  I was wondering if anyone would have any insight into 
why the swapping suddenly starts, but never recovers?

http://richyen.com/i/swap.png

Note, the machine has been up and running since mid-December 2009.  It was only 
on March 8 that this swapping began, and it's never recovered.

If we look at dstat, we find the following:

http://richyen.com/i/dstat.png

Note that it is constantly paging in, but never paging out.  This would 
indicate that it's constantly reading from swap, but never writing out to it.  
Why would postgres do this? (postgres is pretty much the only thing running on 
this machine).

I'm planning on lowering the shared_buffers to a more sane value, like 25GB 
(pgtune recommends this for a Mixed-purpose machine) or less (pgtune recommends 
14GB for an OLTP machine).  However, before I do this (and possibly resolve the 
issue), I was hoping to see if anyone would have an explanation for the 
constant reading from swap, but never writing back.

--Richard


[PERFORM] good effective_io_concurrency for FusionIO drives?

2010-03-25 Thread Richard Yen
Hello,

Wondering what's a good value for effective_io_concurrency when dealing with 
FusionIO drives...anyone have any experience with this?

I know that SSDs vary from 10 channels to 30, and that 1 SSD is about as fast as 
a 4-drive RAID, but I can't seem to settle on a good value to use for 
effective_io_concurrency.  Has anyone done any performance testing/metrics with 
the value?  Any recommendations or thoughts?
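
In the meantime, since effective_io_concurrency can be changed per session, my
plan is to A/B test it against a bitmap-heap-scan-heavy query -- something
along these lines, where the table and values are just placeholders:

    SET effective_io_concurrency = 0;    -- baseline: no prefetching
    EXPLAIN ANALYZE SELECT count(*) FROM some_big_table WHERE indexed_col BETWEEN 1 AND 100000;
    SET effective_io_concurrency = 16;   -- try a value in the 10-30 range mentioned above
    EXPLAIN ANALYZE SELECT count(*) FROM some_big_table WHERE indexed_col BETWEEN 1 AND 100000;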

--Richard


[PERFORM] CHECK constraint fails when it's not supposed to

2009-08-04 Thread Richard Yen

Hi All,

I encountered an odd issue regarding check constraints complaining  
when they're not really violated.


For this particular machine, I am running 8.3.7, but on a machine  
running 8.3.5, it seems to have succeeded.  I also upgraded a third  
machine from 8.3.5 to 8.3.7, and the query succeeded (so I'm thinking  
it's not related to different postgres versions)


I have a table called m_class and the definition is something like  
this:



tii=# \d m_class
                                    Table "public.m_class"
         Column          |           Type           |                   Modifiers
-------------------------+--------------------------+-----------------------------------------------
 id                      | integer                  | not null default nextval(('m_class_id_seq'::text)::regclass)
 class_type              | smallint                 | not null
 title                   | character varying(100)   | not null
...snip...
 date_setup              | timestamp with time zone | not null default ('now'::text)::date
 date_start              | timestamp with time zone | not null
 date_end                | timestamp with time zone | not null
 term_length             | interval                 | not null default '5 years'::interval
...snip...
 max_portfolio_file_size | integer                  |
Indexes:
    "m_class_pkey" PRIMARY KEY, btree (id)
    "m_class_account_idx" btree (account)
    "m_class_instructor_idx" btree (instructor)
Check constraints:
    "end_after_start_check" CHECK (date_end >= date_start)
    "end_within_term_length" CHECK (date_end <= (date_start + term_length))
    "min_password_length_check" CHECK (length(enrollment_password::text) >= 4)
    "positive_term_length" CHECK (term_length > '00:00:00'::interval)
    "start_after_setup_check" CHECK (date_start >= date_setup)
...snip...


When I run my update, it fails:
tii=# begin; update only public.m_class set date_end='2009-09-03  
05:38:24.030331-07',term_length='177 days 17:59:09.868431' where  
id='2652020';

BEGIN
ERROR:  new row for relation m_class violates check constraint  
end_within_term_length

tii=# rollback;
ROLLBACK


The data reads:
tii=# select date_start, date_end, term_length, '2009-09-03  
05:38:24.030331-07'::timestamptz - date_start AS new_term_length  
from m_class where id = 2652020;
          date_start           |          date_end           | term_length |     new_term_length
-------------------------------+-----------------------------+-------------+--------------------------
 2009-03-09 11:39:14.1619-07   | 2009-04-08 11:39:14.1619-07 | 30 days     | 177 days 17:59:09.868431



Based on new_term_length, the update should succeed.  However, it  
doesn't.  Would anyone have an explanation?
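
One way to double-check is to evaluate the constraint expression by hand
against the new values from the failing UPDATE:

    SELECT '2009-09-03 05:38:24.030331-07'::timestamptz
             <= (date_start + '177 days 17:59:09.868431'::interval) AS satisfies_end_within_term_length
      FROM m_class
     WHERE id = 2652020;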


Thanks for your help!
--Richard



[PERFORM] PQisBusy behaving strangely

2009-05-21 Thread Richard Yen

Hi All,

Not sure if this is the right pgsql-* channel to post to, but I was  
hoping maybe someone could answer a question from one of my fellow  
developers.  Please read below:


So, following the documentation, we wrote a little ASYNC version of  
exec.  Here is the code:


PGresult *PGClient::exec(char *query) {
  PGresult *result;

  if (conn == NULL) {
    ERROR("PGClient calling exec when a connection hasn't been established yet");
    return NULL;
  }
  if (PQsendQuery(conn, query) == 0) {
    ERROR("PGClient wasn't able to sendQuery");
    return NULL;
  }
  int socketFD = PQsocket(conn);
  pollfd pfd;
  pfd.fd = socketFD;
  pfd.events = POLLIN|POLLRDHUP;
consumeSome:
  if (poll(&pfd, 1, PG_EXEC_TIMEOUT*1000) == 0) {
    ERROR("PGClient is timing out");
    return NULL;
  }
  if (PQconsumeInput(conn) == 0) {
    ERROR("PGClient detected trouble when trying to consumeInput");
    return NULL;
  }
  if (PQisBusy(conn))
    goto consumeSome;
  result = PQgetResult(conn);
  if (PQisBusy(conn)) {
    // something is wrong, because this is telling us that there is
    // more data on the way, but there shouldn't be.
    ERROR("PGClient had a weird case where more data is on its way");
  }
  PGresult *tempResult = PQgetResult(conn);
  if (tempResult != 0) {
    // we either had multiple SQL queries that return multiple results
    // or something weird happened here, caller should close connection
    ERROR("PGClient had a weird case where multiple results were returned");
  }
  return result;
}

So, assuming that every query we pass in is just a simple, 1 result-
set-returning query, we should never see PQisBusy returning nonzero  
after we do the PQgetResult. BUT every now and then, in our  
code, the PQisBusy function returns nonzero and we get that  
ERROR("PGClient had a weird case where more data is on its way")...  
BUT when we do tempResult=PQgetResult, then it is ALWAYS null... can  
someone please explain how PQisBusy can return true (when we aren't  
expecting it to) but then PQgetResult returns nothing?


If there's any knowledge of why PQisBusy returns nonzero, yet nothing  
comes out of PQgetResult, could you let me know what's going on?


Any help much appreciated!
--Richard



Re: [PERFORM] query plan with index having a btrim is different for strings of different length

2008-12-10 Thread Richard Yen

On Dec 9, 2008, at 3:27 PM, Tom Lane wrote:


Richard Yen [EMAIL PROTECTED] writes:

I've discovered a peculiarity with using btrim in an index and was
wondering if anyone has any input.


What PG version is this?


This is running on 8.3.3


In particular, I'm wondering if it's one of the early 8.2.x releases,
which had some bugs in and around choose_bitmap_and() that caused
them to sometimes make weird choices of indexes in a BitmapAnd plan
structure ...

Like David, I'm pretty dubious that the behavior has anything to do  
with

strings being 5 characters long exactly; but it could very much depend
on whether the string you choose is a common last name or not.  That
would change the estimated number of matching rows and hence affect  
the

apparent relative attractiveness of different indexes.



You guys are right.  I tried 'Miller' and it gave me the same result.  Is  
there any way to tune this so that for the common last names, the  
query run time doesn't jump from 1s to 300s?

Thanks for the help!
--Richard 




Re: [PERFORM] query plan with index having a btrim is different for strings of different length

2008-12-10 Thread Richard Yen

On Dec 10, 2008, at 11:34 AM, Tom Lane wrote:


Richard Yen [EMAIL PROTECTED] writes:
You guys are right.  I tried Miller and gave me the same result.   
Is

there any way to tune this so that for the common last names, the
query run time doesn't jump from 1s to 300s?


If the planner's estimation is that far off then there must be  
something
very weird about the table statistics, but you haven't given us any  
clue

what.


Wow, thanks for helping me out here.  I don't have much experience  
with deconstructing queries and working with stats, so here's what I  
could gather.  If you need more information, please let me know.


tii=# select * from pg_stat_all_tables where relname =  
'm_object_paper' or relname = 'm_assignment';

-[ RECORD 1 ]+--
relid| 17516
schemaname   | public
relname  | m_assignment
seq_scan | 274
seq_tup_read | 1039457272
idx_scan | 372379230
idx_tup_fetch| 2365235708
n_tup_ins| 5641638
n_tup_upd| 520684
n_tup_del| 30339
n_tup_hot_upd| 406929
n_live_tup   | 5611665
n_dead_tup   | 11877
last_vacuum  |
last_autovacuum  | 2008-12-04 17:44:57.309717-08
last_analyze | 2008-10-20 15:09:50.943652-07
last_autoanalyze | 2008-08-15 17:16:14.588153-07
-[ RECORD 2 ]+--
relid| 17792
schemaname   | public
relname  | m_object_paper
seq_scan | 83613
seq_tup_read | 184330159906
idx_scan | 685219945
idx_tup_fetch| 222892138627
n_tup_ins| 71564825
n_tup_upd| 27558792
n_tup_del| 3058
n_tup_hot_upd| 22410985
n_live_tup   | 71559627
n_dead_tup   | 585467
last_vacuum  | 2008-10-24 14:36:45.134936-07
last_autovacuum  | 2008-12-05 07:02:40.52712-08
last_analyze | 2008-11-25 14:42:04.185798-08
last_autoanalyze | 2008-08-15 17:20:28.42811-07

tii=# select * from pg_statio_all_tables where relname =  
'm_object_paper' or relname = 'm_assignment';

-[ RECORD 1 ]---+---
relid   | 17516
schemaname  | public
relname | m_assignment
heap_blks_read  | 22896372
heap_blks_hit   | 1753777105
idx_blks_read   | 7879634
idx_blks_hit| 1157729592
toast_blks_read | 0
toast_blks_hit  | 0
tidx_blks_read  | 0
tidx_blks_hit   | 0
-[ RECORD 2 ]---+---
relid   | 17792
schemaname  | public
relname | m_object_paper
heap_blks_read  | 2604944369
heap_blks_hit   | 116307527781
idx_blks_read   | 133534908
idx_blks_hit| 3601637440
toast_blks_read | 0
toast_blks_hit  | 0
tidx_blks_read  | 0
tidx_blks_hit   | 0

Also, yes, we've kicked around the idea of doing an index on the  
concatenation of the first and last names--that would definitely be  
more unique, and I think we're actually going to move to that.  Just  
thought I'd dig deeper here to learn more.


Thanks!
--Richard



Re: [PERFORM] query plan with index having a btrim is different for strings of different length

2008-12-10 Thread Richard Yen

On Dec 10, 2008, at 11:39 AM, Robert Haas wrote:

You guys are right.  I tried Miller and gave me the same result.   
Is there
any way to tune this so that for the common last names, the query  
run time

doesn't jump from 1s to 300s?
Thanks for the help!


Can you send the output of EXPLAIN ANALYZE for both cases?



tii=# explain analyze SELECT m_object_paper.id FROM m_object_paper,  
m_assignment WHERE m_object_paper.assignment = m_assignment.id AND  
m_object_paper.owner=-1 AND m_assignment.class = 2450798 AND  
lower(btrim(x_firstname)) = lower(btrim('Jordan')) and  
lower(btrim(x_lastname)) = lower(btrim('Smithers'));
   
                                                       QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..10141.07 rows=1 width=4) (actual  
time=33.004..33.004 rows=0 loops=1)
   ->  Index Scan using last_name_fnc_idx on m_object_paper   
(cost=0.00..10114.25 rows=11 width=8) (actual time=33.003..33.003  
rows=0 loops=1)
 Index Cond: (lower(btrim((x_lastname)::text)) =  
'smithers'::text)
 Filter: ((owner = (-1)) AND  
(lower(btrim((x_firstname)::text)) = 'jordan'::text))
   ->  Index Scan using m_assignment_pkey on m_assignment   
(cost=0.00..2.43 rows=1 width=4) (never executed)

 Index Cond: (m_assignment.id = m_object_paper.assignment)
 Filter: (m_assignment.class = 2450798)
 Total runtime: 33.070 ms
(8 rows)

tii=# explain analyze SELECT m_object_paper.id FROM m_object_paper,  
m_assignment WHERE m_object_paper.assignment = m_assignment.id AND  
m_object_paper.owner=-1 AND m_assignment.class = 2450798 AND  
lower(btrim(x_firstname)) = lower(btrim('Jordan')) and  
lower(btrim(x_lastname)) = lower(btrim('Smith'));
                                                       QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=181867.87..291714.78 rows=1 width=4) (actual  
time=124746.426..139252.850 rows=1 loops=1)

   Hash Cond: (m_object_paper.assignment = m_assignment.id)
   ->  Bitmap Heap Scan on m_object_paper  (cost=181687.88..291532.67  
rows=562 width=8) (actual time=124672.328..139248.919 rows=58 loops=1)
 Recheck Cond: ((lower(btrim((x_lastname)::text)) =  
'smith'::text) AND (owner = (-1)))

 Filter: (lower(btrim((x_firstname)::text)) = 'jordan'::text)
         ->  BitmapAnd  (cost=181687.88..181687.88 rows=112429  
width=0) (actual time=124245.890..124245.890 rows=0 loops=1)
               ->  Bitmap Index Scan on last_name_fnc_idx   
(cost=0.00..5476.30 rows=496740 width=0) (actual  
time=16194.803..16194.803 rows=521382 loops=1)
 Index Cond: (lower(btrim((x_lastname)::text)) =  
'smith'::text)
               ->  Bitmap Index Scan on m_object_paper_owner_idx   
(cost=0.00..176211.04 rows=16061244 width=0) (actual  
time=107928.054..107928.054 rows=15494737 loops=1)

 Index Cond: (owner = (-1))
   ->  Hash  (cost=177.82..177.82 rows=174 width=4) (actual  
time=3.764..3.764 rows=5 loops=1)
         ->  Index Scan using m_assignment_class_idx on m_assignment   
(cost=0.00..177.82 rows=174 width=4) (actual time=0.039..3.756 rows=5  
loops=1)

   Index Cond: (class = 2450798)
 Total runtime: 139255.109 ms
(14 rows)

This example doesn't have a 300+ second run time, but there are a few in my  
log that do.  In either case, I guess you get the picture.


Thanks for the help!
--Richard



Re: [PERFORM] query plan with index having a btrim is different for strings of different length

2008-12-10 Thread Richard Yen

On Dec 10, 2008, at 4:08 PM, Tom Lane wrote:


Richard Yen [EMAIL PROTECTED] writes:
Is there any way to tune this so that for the common last names,  
the query

run time doesn't jump from 1s to 300s?


Well, as near as I can tell there's a factor of a couple hundred
difference between the frequencies of 'smith' and 'smithers', so
you shouldn't really expect similar runtimes for the two cases.

Having said that, I still think you should try to index both first
and last name.  Also I wonder whether the index on owner is worth
having at all.  It definitely doesn't seem worthwhile to index the
entries with owner = -1, since there are so many; so maybe you could
make it a partial index that excludes those entries, in order to  
prevent

the planner from trying to use it for this case.


Created the 2-col index, and the query runs much faster on all  
permutations.
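
The new index is presumably along these lines (the exact name and column order
here are my guess); the partial-index suggestion quoted above would look like
the second statement:

    CREATE INDEX last_first_name_fnc_idx
        ON m_object_paper (lower(btrim(x_lastname::text)), lower(btrim(x_firstname::text)));

    -- the suggested partial index, leaving out the owner = -1 placeholder rows
    CREATE INDEX m_object_paper_owner_partial_idx
        ON m_object_paper (owner) WHERE owner <> -1;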


Thanks much for all your help,
--Richard



[PERFORM] query plan with index having a btrim is different for strings of different length

2008-12-09 Thread Richard Yen

Hi,

I've discovered a peculiarity with using btrim in an index and was  
wondering if anyone has any input.


My table is like this:
              Table "public.m_object_paper"
        Column        |          Type          | Modifiers
----------------------+------------------------+-----------
 id                   | integer                | not null
 title                | character varying(200) | not null
 x_firstname          | character varying(50)  |
 x_lastname           | character varying(50)  |
...snip...
 page_count           | smallint               |
 compare_to_database  | bit varying            | not null
Indexes:
    "m_object_paper_pkey" PRIMARY KEY, btree (id)
    "last_name_fnc_idx" btree (lower(btrim(x_lastname::text)))
    "m_object_paper_assignment_idx" btree (assignment)
    "m_object_paper_owner_idx" btree (owner) CLUSTER
...snip to end...

My query is like this:
SELECT m_object_paper.id FROM m_object_paper, m_assignment WHERE  
m_object_paper.assignment = m_assignment.id AND  
m_object_paper.owner=-1 AND m_assignment.class = 2450798 AND  
lower(btrim(x_firstname)) = lower(btrim($FIRSTNAME)) and  
lower(btrim(x_lastname)) = lower(btrim($LASTNAME));


Strangely, if $LASTNAME is 5 chars, the query plan looks like this:
tii=# explain SELECT m_object_paper.id FROM m_object_paper,  
m_assignment WHERE m_object_paper.assignment = m_assignment.id AND  
m_object_paper.owner=-1 AND m_assignment.class = 2450798 AND  
lower(btrim(x_firstname)) = lower(btrim('Jordan')) and  
lower(btrim(x_lastname)) = lower(btrim('Smith'));

  QUERY PLAN
---
 Hash Join  (cost=181704.85..291551.77 rows=1 width=4)
   Hash Cond: (m_object_paper.assignment = m_assignment.id)
   ->  Bitmap Heap Scan on m_object_paper  (cost=181524.86..291369.66  
rows=562 width=8)
 Recheck Cond: ((lower(btrim((x_lastname)::text)) =  
'smith'::text) AND (owner = (-1)))

 Filter: (lower(btrim((x_firstname)::text)) = 'jordan'::text)
         ->  BitmapAnd  (cost=181524.86..181524.86 rows=112429 width=0)
               ->  Bitmap Index Scan on last_name_fnc_idx   
(cost=0.00..5468.29 rows=496740 width=0)
 Index Cond: (lower(btrim((x_lastname)::text)) =  
'smith'::text)
               ->  Bitmap Index Scan on m_object_paper_owner_idx   
(cost=0.00..176056.04 rows=16061244 width=0)

 Index Cond: (owner = (-1))
   ->  Hash  (cost=177.82..177.82 rows=174 width=4)
         ->  Index Scan using m_assignment_class_idx on m_assignment   
(cost=0.00..177.82 rows=174 width=4)

   Index Cond: (class = 2450798)
(13 rows)

However, if $LASTNAME is != 5 chars (1 char, 100 chars, doesn't make a  
difference), the query plan looks like this:
tii=# explain SELECT m_object_paper.id FROM m_object_paper,  
m_assignment WHERE m_object_paper.assignment = m_assignment.id AND  
m_object_paper.owner=-1 AND m_assignment.class = 2450798 AND  
lower(btrim(x_firstname)) = lower(btrim('Jordan')) and  
lower(btrim(x_lastname)) = lower(btrim('Smithers'));

QUERY PLAN
---
 Nested Loop  (cost=0.00..10141.06 rows=1 width=4)
   ->  Index Scan using last_name_fnc_idx on m_object_paper   
(cost=0.00..10114.24 rows=11 width=8)
 Index Cond: (lower(btrim((x_lastname)::text)) =  
'smithers'::text)
 Filter: ((owner = (-1)) AND  
(lower(btrim((x_firstname)::text)) = 'jordan'::text))
   ->  Index Scan using m_assignment_pkey on m_assignment   
(cost=0.00..2.43 rows=1 width=4)

 Index Cond: (m_assignment.id = m_object_paper.assignment)
 Filter: (m_assignment.class = 2450798)
(7 rows)

In practice, the difference is 300+ seconds when $LASTNAME == 5 chars  
and 1 second when $LASTNAME != 5 chars.


Would anyone know what's going on here?  Is there something about the  
way btrim works, or perhaps with the way indexes are created?  It's  
strange that the query plan would change for just one case (Jones,  
Smith, Brown, etc., all cause the query plan to use that extra  
heap scan).
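
For what it's worth, one way to see how common the planner thinks a given last
name is, is to look at the column statistics (statistics for the functional
index itself, if ANALYZE has gathered any, show up in pg_stats under the
index's name):

    SELECT null_frac, n_distinct, most_common_vals, most_common_freqs
      FROM pg_stats
     WHERE tablename = 'm_object_paper'
       AND attname = 'x_lastname';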


Thanks for any help!
--Richard



[PERFORM] postgres memory management issues?

2007-09-07 Thread Richard Yen

Hi All,

I've recently run into problems with my kernel complaining that I ran  
out of memory, thus killing off postgres and bringing my app to a  
grinding halt.


I'm on a 32-bit architecture with 16GB of RAM, under Gentoo Linux.   
Naturally, I have to set my shmmax to 2GB because the kernel can't  
support more (well, I could set it to 3GB, but I use 2GB for safety).


Shared_buffers is 20 and max_connections is 600.

Here is a snippet of my log output (I can give more if necessary):
Sep  5 18:38:57 tii-db2.oaktown.iparadigms.com Out of Memory: Kill  
process 11696 (postgres) score 1181671 and children.
Sep  5 18:38:57 tii-db2.oaktown.iparadigms.com Out of Memory: Kill  
process 11696 (postgres) score 1181671 and children.
Sep  5 18:38:57 tii-db2.oaktown.iparadigms.com Out of memory: Killed  
process 11704 (postgres).
Sep  5 18:38:57 tii-db2.oaktown.iparadigms.com Out of memory: Killed  
process 11704 (postgres).

[...]
Sep  5 18:38:57 tii-db2.oaktown.iparadigms.com postgres[11696]: [6-1]  
2007-09-05 18:38:57.626 PDT [user=,db=  PID:11696 XID:]LOG:   
background writer process (PID 11704) was terminated by signal 9
Sep  5 18:38:57 tii-db2.oaktown.iparadigms.com postgres[11696]: [7-1]  
2007-09-05 18:38:57.626 PDT [user=,db=  PID:11696 XID:]LOG:   
terminating any other active server processes


My understanding is that if any one postgres process's memory usage,  
plus the shared memory, exceeds the kernel limit of 4GB, then the  
kernel will kill the process off.  Is this true?  If so, would  
postgres have some prevention mechanism that would keep a particular  
process from getting too big?  (Maybe I'm being too idealistic, or I  
just simply don't understand how postgres works under the hood)


--Richard
