Re: [PERFORM] slow query plans caused by under-estimation of CTE cardinality

2013-02-19 Thread Tom Lane
John Lumby johnlu...@hotmail.com writes:
 Meanwhile, I have one other suggestion aimed specifically at problematic
 CTEs: would it be reasonable to provide a new planner configuration option:

   enable_nestloop_cte_inner (boolean)
   Enables or disables the query planner's use of nested-loop join plans in
 which a CTE is the inner relation.

Sounds pretty badly thought out to me.  There might be some cases where
this would help, but there would be many more where it would be useless
or counterproductive.

The case that was discussed in the previous thread looked like it could
be addressed by teaching the planner to drill down into CTEs to find
variable referents, as it already does for subquery RTEs (cf
examine_simple_variable in selfuncs.c).  I'm not sure if your case is
similar or not --- you didn't provide any useful amount of detail.

regards, tom lane


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Speed of exist

2013-02-19 Thread Bastiaan Olij
Hi Pavel,

That is what I've done in this particular case, but there are parts where
I use existence checks in this way that are very cumbersome to write out
like that, so I'm hoping there is a way to make the optimizer work with
existence checks directly.

Cheers,

Bastiaan Olij

On 19/02/13 6:39 PM, Pavel Stehule wrote:
 2013/2/19 Bastiaan Olij basti...@basenlily.me:
 Hi Andy,

 I've tried that with the same result. One subquery works beautifully,
 two subqueries with an OR and it starts to do a sequential scan...
 try to rewrite OR to two SELECTs joined by UNION ALL

 Pavel
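
For readers following the thread, here is a sketch of Pavel's rewrite on
hypothetical tables (t, a, b are stand-ins, not from the original post):

```sql
-- Before: an OR of two EXISTS checks -- the planner often falls back
-- to a sequential scan for this shape.
SELECT t.id
FROM t
WHERE EXISTS (SELECT 1 FROM a WHERE a.t_id = t.id)
   OR EXISTS (SELECT 1 FROM b WHERE b.t_id = t.id);

-- After: one SELECT per branch, joined by UNION ALL, so each branch
-- can use its own index.  The NOT EXISTS guard in the second branch
-- keeps rows matching both conditions from appearing twice.
SELECT t.id
FROM t
WHERE EXISTS (SELECT 1 FROM a WHERE a.t_id = t.id)
UNION ALL
SELECT t.id
FROM t
WHERE NOT EXISTS (SELECT 1 FROM a WHERE a.t_id = t.id)
  AND EXISTS (SELECT 1 FROM b WHERE b.t_id = t.id);
```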






Re: [PERFORM] High CPU usage / load average after upgrading to Ubuntu 12.04

2013-02-19 Thread Josh Berkus
On 02/18/2013 08:28 PM, Mark Kirkwood wrote:
 Might be worth looking at your vm.dirty_ratio, vm.dirty_background_ratio
 and friends settings. We managed to choke up a system with 16x SSD by
 leaving them at their defaults...

Yeah?  Any settings you'd recommend specifically?  What did you use on
the SSD system?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [PERFORM] Very slow update statement on 40mio rows

2013-02-19 Thread Merlin Moncure
On Fri, Feb 15, 2013 at 9:32 AM, Florian Schröck fschro...@aycan.de wrote:
 Hello Kevin,
 not updating every row which doesn't need the update solved the problem!
 Your query took only 1 minute. :)

 Thank you so much for the fast response, have a great weekend!

 PS: When you switch to TEXT on the explain URL you can see the final
 runtime which was 66 minutes with the original statement.

 Best regards,
 Florian

 On 02/15/2013 03:59 PM, Kevin Grittner wrote:
 Florian Schröck fschro...@aycan.de wrote:

 UPDATE BackupFiles
SET nTapeNr=0, nAFIOCounter=0, nBlockCounter=0,
cStatus='NEW'::StatusT, bOnSetBlue=false,
bOnSetYellow=false, nLastBackupTS= '0001-01-01 00:00:00'
WHERE cStatus='NEW' OR cStatus='WRITING' OR cStatus='ONTAPE';

 Explain analyze: http://explain.depesz.com/s/8y5
 The statement takes 60-90 minutes.
 The EXPLAIN ANALYZE at that URL shows a runtime of 3 minutes and 41
 seconds.

 I tried to optimize the settings but until now without success.

 Can we optimize this update statement somehow? Do you have any
 other ideas?
 Are there any rows which would already have the values that you are
 setting?  If so, it would be faster to skip those by using this
 query:

 UPDATE BackupFiles
   SET nTapeNr=0, nAFIOCounter=0, nBlockCounter=0,
   cStatus='NEW'::StatusT, bOnSetBlue=false,
   bOnSetYellow=false, nLastBackupTS= '0001-01-01 00:00:00'
   WHERE (cStatus='NEW' OR cStatus='WRITING' OR cStatus='ONTAPE')
 AND (nTapeNr <> 0 OR nAFIOCounter <> 0 OR nBlockCounter <> 0
  OR cStatus <> 'NEW'::StatusT
  OR bOnSetBlue IS DISTINCT FROM false
  OR bOnSetYellow IS DISTINCT FROM false
  OR nLastBackupTS <> '0001-01-01 00:00:00');

 Another way to accomplish this is with the
 suppress_redundant_updates_trigger() trigger function:

 http://www.postgresql.org/docs/9.2/interactive/functions-trigger.html
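
A minimal sketch of wiring that up (the trigger name below is made up; the
function itself is the built-in one documented at the link above):

```sql
-- The trigger function skips any UPDATE that would not change the row.
-- Triggers fire in alphabetical order, so name it to fire last.
CREATE TRIGGER zzz_suppress_redundant
  BEFORE UPDATE ON BackupFiles
  FOR EACH ROW
  EXECUTE PROCEDURE suppress_redundant_updates_trigger();
```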

If the number of rows you actually update is small relative to the size
of the table, just for fun, try this:


CREATE OR REPLACE FUNCTION BackupFilesCandidateReset(BackupFiles)
RETURNS BOOL AS
$$
  SELECT ($1).cStatus IN ('NEW', 'WRITING', 'ONTAPE')
  AND
    (($1).nTapeNr, ($1).nAFIOCounter, ($1).nBlockCounter,
    ($1).cStatus, ($1).bOnSetBlue, ($1).bOnSetYellow, ($1).nLastBackupTS)
    IS DISTINCT FROM /* simple != will suffice if values are never null */
    (0, 0, 0, 'NEW'::StatusT, false, false, '0001-01-01 00:00:00');
$$ LANGUAGE SQL IMMUTABLE;

CREATE INDEX ON BackupFiles (BackupFilesCandidateReset(BackupFiles))
  WHERE BackupFilesCandidateReset(BackupFiles);


SELECT * FROM BackupFiles WHERE BackupFilesCandidateReset(BackupFiles);
UPDATE BackupFiles SET ... WHERE BackupFilesCandidateReset(BackupFiles);
etc.

The idea here is to maintain a partial boolean index representing the
candidate records to update.  Plus it's nifty.  This is the basic
mechanism that can be used as the foundation for very fast push/pull queues.

merlin




Re: [PERFORM] High CPU usage / load average after upgrading to Ubuntu 12.04

2013-02-19 Thread Mark Kirkwood

On 20/02/13 06:51, Josh Berkus wrote:

On 02/18/2013 08:28 PM, Mark Kirkwood wrote:

Might be worth looking at your vm.dirty_ratio, vm.dirty_background_ratio
and friends settings. We managed to choke up a system with 16x SSD by
leaving them at their defaults...

Yeah?  Any settings you'd recommend specifically?  What did you use on
the SSD system?



We set:

vm.dirty_background_ratio = 0
vm.dirty_background_bytes = 1073741824
vm.dirty_ratio = 0
vm.dirty_bytes = 2147483648

i.e. 1G for dirty_background and 2G for dirty. We didn't spend much time
afterwards fiddling with the sizes. I'm guessing we could have made them
bigger - however the SSDs were happier to be constantly writing a few G
than being handed (say) 50G of buffers to write at once. The system has
512G of RAM and 32 cores (no hyperthreading).


regards

Mark




Re: [PERFORM] High CPU usage / load average after upgrading to Ubuntu 12.04

2013-02-19 Thread Josh Berkus
On 02/19/2013 09:51 AM, Josh Berkus wrote:
 On 02/18/2013 08:28 PM, Mark Kirkwood wrote:
 Might be worth looking at your vm.dirty_ratio, vm.dirty_background_ratio
 and friends settings. We managed to choke up a system with 16x SSD by
 leaving them at their defaults...
 
 Yeah?  Any settings you'd recommend specifically?  What did you use on
 the SSD system?
 

NM, I tested lowering dirty_background_ratio, and it didn't help,
because checkpoints are kicking in before pdflush ever gets there.

So the issue seems to be that if you have this combination of factors:

1. large RAM
2. many/fast CPUs
3. a database which fits in RAM but is larger than the RAID controller's
WB cache
4. pg_xlog on the same volume as pgdata

... then you'll see checkpoint stalls, and spread checkpoints will
actually make them worse by making the stalls longer.

Moving pg_xlog to a separate partition makes this better.  Making
bgwriter more aggressive helps a bit more on top of that.
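
For context, the knobs mentioned live in postgresql.conf; the values below
are illustrative guesses, not the ones Josh actually used:

```
# postgresql.conf -- illustrative values, not a recommendation
checkpoint_completion_target = 0.9  # spread writes over more of the interval (default 0.5)
bgwriter_delay = 50ms               # wake the bgwriter more often (default 200ms)
bgwriter_lru_maxpages = 400         # allow more pages written per round (default 100)
bgwriter_lru_multiplier = 4.0       # clean further ahead of predicted demand (default 2.0)
```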

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [PERFORM] High CPU usage / load average after upgrading to Ubuntu 12.04

2013-02-19 Thread Mark Kirkwood

On 20/02/13 12:24, Josh Berkus wrote:


NM, I tested lowering dirty_background_ratio, and it didn't help,
because checkpoints are kicking in before pdflush ever gets there.

So the issue seems to be that if you have this combination of factors:

1. large RAM
2. many/fast CPUs
3. a database which fits in RAM but is larger than the RAID controller's
WB cache
4. pg_xlog on the same volume as pgdata

... then you'll see checkpoint stalls, and spread checkpoints will
actually make them worse by making the stalls longer.

Moving pg_xlog to a separate partition makes this better.  Making
bgwriter more aggressive helps a bit more on top of that.



We have pg_xlog on a pair of PCIe SSDs. Also we are running the deadline
I/O scheduler.


Regards

Mark




Re: [PERFORM] High CPU usage / load average after upgrading to Ubuntu 12.04

2013-02-19 Thread Scott Marlowe
On Tue, Feb 19, 2013 at 4:24 PM, Josh Berkus j...@agliodbs.com wrote:
 ... then you'll see checkpoint stalls and spread checkpoint will
 actually make them worse by making the stalls longer.

Wait, if they're spread enough then there won't be a checkpoint, so to
speak.  Are you saying that spreading them out means they still kind of
pile up, even with, say, a completion target of 1.0?

