Re: [PERFORM] Improving PostgreSQL insert performance
On Fri, Jun 09, 2017 at 03:22:35PM +, Frits Jalvingh wrote: > Hi Babu, > > That was all already done, as it is common practice for JDBC. Your > parameter was added to the code that already did all that - and worked > brilliantly there ;) > Hi Frits, What was the parameter? I did not see an Email in the thread from Babu. Regards, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Improving PostgreSQL insert performance
On Fri, Jun 09, 2017 at 02:39:37PM +, Frits Jalvingh wrote: > Hi all, > > Thanks a lot for the many responses! > > About preparing statements: this is done properly in Java, and pgsql does > it by itself. So that cannot be done better ;) > > I tried the copy command, and that indeed works quite brilliantly: > Inserted 24000000 rows in 22004 milliseconds, 1090710.7798582076 rows per > second > > That's faster than Oracle. But with a very bad interface I have to say for > normal database work.. I will try to make this work in the tooling, but it > needs some very special code to format all possible values properly, and to > manage the end of the copy, so it is not usable in general which is a pity, > I think. > > So, I am still very interested in getting normal inserts faster, because > that will gain speed for all work.. If Oracle can do it, and Postgres is > able to insert fast with copy- where lies the bottleneck with the insert > command? There seems to be quite a performance hit with the JDBC driver > itself (as the stored procedure is a lot faster), so I can look into that. > But even after that there is quite a gap.. > > Regards, > > Frits Hi Frits, Have you looked at UNLOGGED tables and also having more than one insert stream running at a time? Sometimes multiple parallel inserts can be faster. Regards, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
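As a rough sketch of those two suggestions combined (an UNLOGGED staging table fed by more than one COPY stream), with made-up table and file names:

    -- An unlogged table skips WAL, so bulk loads are faster; its contents are
    -- lost after a crash, so treat it purely as a staging area.
    CREATE UNLOGGED TABLE datatable_stage (LIKE datatable INCLUDING DEFAULTS);

    -- Run one COPY per connection, each loading a different chunk of the input:
    COPY datatable_stage FROM '/data/chunk1.csv' WITH (FORMAT csv);   -- session 1
    COPY datatable_stage FROM '/data/chunk2.csv' WITH (FORMAT csv);   -- session 2

    -- Finally move the rows into the real, logged table in one statement:
    INSERT INTO datatable SELECT * FROM datatable_stage;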
Re: [PERFORM] Improving PostgreSQL insert performance
On Fri, Jun 09, 2017 at 01:56:58PM +, Frits Jalvingh wrote: > Hi Kenneth, Andreas, > > Thanks for your tips! > > I increased shared_buffers to 8GB but it has no measurable effect at all. I > think that is logical: shared buffers are important for querying but not > for inserting; for that the speed to write to disk seems most important- no > big reason to cache the data if the commit requires a full write anyway. > I also changed the code to do only one commit; this also has no effect I > can see. > > It is true that Oracle had more memory assigned to it (1.5G), but unlike > Postgres (which is completely on a fast SSD) Oracle runs on slower disk > (ZFS).. > > I will try copy, but I first need to investigate how to use it- its > interface seems odd to say the least ;) I'll report back on that once done. > > Any other tips would be welcome! > > Regards, > > Frits Hi Frits, Here is an article that is still valid: https://www.depesz.com/2007/07/05/how-to-insert-data-to-database-as-fast-as-possible/ Regards, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
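For what it is worth, the COPY interface Frits is about to investigate boils down to something like the following (table, column, and file names are only placeholders):

    -- Server-side load; the file must be readable by the server process:
    COPY mytable (id, name, created_at)
        FROM '/path/to/data.csv' WITH (FORMAT csv, HEADER true);

    -- Or stream the data over the client connection, e.g. from psql:
    \copy mytable (id, name, created_at) from 'data.csv' with (format csv, header true)

From JDBC, the driver exposes the same protocol through its CopyManager API, which avoids the server-side file requirement.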
Re: [PERFORM] Improving PostgreSQL insert performance
On Fri, Jun 09, 2017 at 03:24:15PM +0200, Andreas Kretschmer wrote: > > > Am 09.06.2017 um 15:04 schrieb Frits Jalvingh: > >Hi all, > > > >I am trying to improve the runtime of a big data warehouse > >application. One significant bottleneck found was insert > >performance, so I am investigating ways of getting Postgresql to > >insert data faster. > > * use COPY instead of Insert, it is much faster > * bundle all Insert into one transaction > * use a separate disk/spindel for the transaction log > > > > > > >I already changed the following config parameters: > >work_mem 512MB > >synchronous_commit off > >shared_buffers 512mb > >commit_delay 10 > >autovacuum_naptime 10min > > > >Postgres version is 9.6.3 on Ubuntu 17.04 64 bit, on a i7-4790K > >with 16GB memory and an Intel 750 SSD. JDBC driver is > >postgresql-42.1.1. > > > > increase shared_buffers, with 16gb ram i would suggest 8gb +1 Without even checking, I think Oracle is configured to use a LOT more memory than 512mb. Regards, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
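For a 9.6 server the shared_buffers change can also be made from SQL instead of editing postgresql.conf (a sketch; the value follows the 8GB suggestion above and still requires a restart to take effect):

    ALTER SYSTEM SET shared_buffers = '8GB';
    -- restart the server, then verify:
    SHOW shared_buffers;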
Re: [PERFORM] REINDEX takes half a day (and still not complete!)
On Sat, Apr 30, 2011 at 05:26:36PM +0800, Phoenix Kiula wrote: On Sat, Apr 30, 2011 at 4:07 PM, Greg Smith g...@2ndquadrant.com wrote: On 04/23/2011 03:44 PM, Robert Haas wrote: On Apr 17, 2011, at 11:30 AM, Phoenix Kiulaphoenix.ki...@gmail.com ?wrote: Postgres is 8.2.9. An upgrade would probably help you a lot, and as others have said it sounds like your hardware is failing, so you probably want to deal with that first. I am a bit surprised, however, that no one seems to have mentioned using CLUSTER rather than VACUUM or REINDEX. Sometimes that's worth a try... Don't know if it was for this reason or not for not mentioning it by others, but CLUSTER isn't so great in 8.2. ?The whole not MVCC-safe bit does not inspire confidence on a production server. To everyone. Thanks so much for everything, truly. We have managed to salvage the data by exporting it in bits and pieces. 1. First the schema only 2. Then pg_dump of specific small tables 3. Then pg_dump of timed bits of the big mammoth table Not to jinx it, but the newer hardware seems to be doing well. I am on 9.0.4 now and it's pretty fast. Also, as has been mentioned in this thread and other discussions on the list, just doing a dump and then fresh reload has compacted the DB to nearly 1/3rd of its previously reported size! I suppose that's what I am going to do on a periodic basis from now on. There is a lot of DELETE/UPDATE activity. But I wonder if the vacuum stuff really should do something that's similar in function? What do the high-end enterprise folks do -- surely they can't be dumping/restoring every quarter or soor are they? Anyway, many many thanks to the lovely folks on this list. Much appreciated! The autovacuum and space management in 9.0 is dramatically more effective and efficient then that of 8.2. Unless you have an odd corner-case there really should be no reason for a periodic dump/restore. This is not your grandmother's Oldsmobile... :) Regards, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
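If individual tables with heavy DELETE/UPDATE traffic still bloat faster than the default autovacuum settings can keep up with, autovacuum can be tuned per table rather than falling back to periodic dump/restore; a sketch with a placeholder table name and illustrative values:

    -- Vacuum/analyze after ~5% of the rows change instead of the 20%/10% defaults:
    ALTER TABLE big_mammoth_table SET (autovacuum_vacuum_scale_factor = 0.05,
                                       autovacuum_analyze_scale_factor = 0.05);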
Re: [PERFORM] Time to put theory to the test?
On Tue, Apr 26, 2011 at 09:58:49AM -0500, Kevin Grittner wrote: J Sisson sisso...@gmail.com wrote: Rob Wultsch wult...@gmail.com wrote: Tip from someone that manages thousands of MySQL servers: Use InnoDB when using MySQL. Granted, my knowledge of PostgreSQL (and even MSSQL) far surpasses my knowledge of MySQL, but if InnoDB has such amazing benefits as being crash safe, and even speed increases in some instances, why isn't InnoDB default? Because it's not as fast as the unsafe ISAM implementation for most benchmarks. There is one minor gotcha in InnoDB (unless it's been fixed since 2008): the release of locks is not atomic with the persistence of the data in the write-ahead log (which makes it S2PL but not SS2PL). So it is possible for another connection to see data that won't be there after crash recovery. This is justified as an optimization. Personally, I would prefer not to see data from other transactions until it has actually been successfully committed. -Kevin In addition, their fulltext indexing only works with MyISAM tables. Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Bad Query Plan with Range Query
On Fri, Apr 15, 2011 at 10:17:32AM -0700, Mark Williams wrote: We are experiencing a problem with our query plans when using a range query in Postgresql 8.3. The query we are executing attempts to select the minimum primary key id after a certain date. Our date columns are bigint's holding a unix epoch representation of the date. We have an index on the primary key and the date column. For the following query just specified the predicate modificationDate = ? explain SELECT min(messageID) FROM Message WHERE modificationDate = 1302627793988; QUERY PLAN - Result (cost=2640.96..2640.97 rows=1 width=0) InitPlan - Limit (cost=0.00..2640.96 rows=1 width=8) - Index Scan using message_pk on message (cost=0.00..3298561.09 rows=1249 width=8) Filter: ((messageid IS NOT NULL) AND (modificationdate = 1302627793988::bigint)) (5 rows) For some reason it is deciding to scan the primary key column of the table. This results in scanning the entire table which is huge (10 million records). However, if we specify a fake upper bound then the planner will correctly use the date column index: explain SELECT min(messageID) FROM Message WHERE modificationDate = 1302627793988 and modificationDate ; QUERY PLAN - Aggregate (cost=9.64..9.65 rows=1 width=8) - Index Scan using jvmssg_mdate_idx on message (cost=0.00..9.64 rows=1 width=8) Index Cond: ((modificationdate = 1302627793988::bigint) AND (modificationdate ::bigint)) (3 rows) We have carried out all the usual maintenance tasks. We have increase the statistics_target on both indexes to the maximum (1000) and performed a vacuum analyze on the table. Our resource configurations are very good since this is our production server. Interestingly this does not appear to happen with exactly the same database when using 8.4. Instead we get the correct plan without having to add the upper bound. Here is the full description of the the table. It contains upwards of 10 million rows. Table public.message Column | Type | Modifiers --++--- messageid| bigint | not null parentmessageid | bigint | threadid | bigint | not null containertype| integer| not null containerid | bigint | not null userid | bigint | subject | character varying(255) | body | text | modvalue | integer| not null rewardpoints | integer| not null creationdate | bigint | not null modificationdate | bigint | not null status | integer| not null Indexes: message_pk PRIMARY KEY, btree (messageid) jvmssg_cdate_idx btree (creationdate) jvmssg_cidctmd_idx btree (containerid, containertype, modificationdate) jvmssg_mdate_idx btree (modificationdate) jvmssg_mdvle_idx btree (modvalue) jvmssg_prntid_idx btree (parentmessageid) jvmssg_thrd_idx btree (threadid) jvmssg_usrid_idx btree (userid) Referenced by: TABLE answer CONSTRAINT answer_mid_fk FOREIGN KEY (messageid) REFERENCES message(messageid) TABLE messageprop CONSTRAINT jmp_msgid_fk FOREIGN KEY (messageid) REFERENCES message(messageid) Any insight into this would be greatly appreciated. We are not able to upgrade our databases to 8.4. We are reluctant to re-write all our range queries if possible. -m Here is the fix that was added to 8.4+: http://archives.postgresql.org/pgsql-committers/2010-01/msg00021.php I think you are stuck with one of those options so if upgrading is not available, then re-writing the range queries wins by a landslide. :) Regards, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
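A sketch of the rewrite being weighed above, following the workaround the original poster already found: give the 8.3 planner an explicit upper bound so it can use the modificationdate index instead of walking the primary key (the upper bound shown here is simply the largest bigint):

    SELECT min(messageID)
      FROM Message
     WHERE modificationDate >= 1302627793988
       AND modificationDate <= 9223372036854775807;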
Re: [PERFORM] Updating histogram_bounds after a delete
On Thu, Mar 17, 2011 at 09:49:45AM -0500, Kevin Grittner wrote: Derrick Rice derrick.r...@gmail.com wrote: Kevin Grittner kevin.gritt...@wicourts.gov wrote: there is a feature to probe the end of an index's range in situations where data skew was often causing less than optimal plans to be chosen. Was this introduced in 9.0 or was it earlier? I don't remember when it was added. I took a stab at searching for it, but didn't get it figured out; if nobody who knows off-hand jumps in, I'll try again when I have more time. I think this is it: http://archives.postgresql.org/pgsql-committers/2010-01/msg00021.php Regards, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Help with Query Tuning
On Wed, Mar 16, 2011 at 02:43:38PM +0530, Adarsh Sharma wrote: Dear all, I am facing a problem while creating the index to make the below query run faster. My table size is near about 1065 MB and 428467 rows. explain analyze select count(*) from page_content where publishing_date like '%2010%' and content_language='en' and content is not null and isprocessable = 1 and (content like '%Militant%' OR content like '%jihad%' OR content like '%Mujahid%' OR content like '%fedayeen%' OR content like '%insurgent%' OR content like '%terrorist%' OR content like '%cadre%' OR content like '%civilians%' OR content like '%police%' OR content like '%defence%' OR content like '%cops%' OR content like '%crpf%' OR content like '%dsf%' OR content like '%ssb%') AND (content like '%kill%' or content like '%injure%'); *Output: * Aggregate (cost=107557.78..107557.79 rows=1 width=0) (actual time=18564.631..18564.631 rows=1 loops=1) - Seq Scan on page_content (cost=0.00..107466.82 rows=36381 width=0) (actual time=0.146..18529.371 rows=59918 loops=1) Filter: ((content IS NOT NULL) AND (publishing_date ~~ '%2010%'::text) AND (content_language = 'en'::bpchar) AND (isprocessable = 1) AND (((content) ::text ~~ '%kill%'::text) OR ((content)::text ~~ '%injure%'::text)) AND (((content)::text ~~ '%Militant%'::text) OR ((content)::text ~~ '%jihad%'::text) OR ( (content)::text ~~ '%Mujahid%'::text) OR ((content)::text ~~ '%fedayeen%'::text) OR ((content)::text ~~ '%insurgent%'::text) OR ((content)::text ~~ '%terrori st%'::text) OR ((content)::text ~~ '%cadre%'::text) OR ((content)::text ~~ '%civilians%'::text) OR ((content)::text ~~ '%police%'::text) OR ((content)::text ~~ '%defence%'::text) OR ((content)::text ~~ '%cops%'::text) OR ((content)::text ~~ '%crpf%'::text) OR ((content)::text ~~ '%dsf%'::text) OR ((content)::text ~~ '%ssb%'::text))) Total runtime: 18564.673 ms *Index on that Table : *CREATE INDEX idx_page_id ON page_content USING btree (crawled_page_id); *Index I create :* CREATE INDEX idx_page_id_content ON page_content USING btree (crawled_page_id,content_language,publishing_date,isprocessable); *Index that fail to create: *CREATE INDEX idx_page_id_content1 ON page_content USING btree (crawled_page_id,content); Error :-ERROR: index row requires 13240 bytes, maximum size is 8191 ** Error ** ERROR: index row requires 13240 bytes, maximum size is 8191 SQL state: 54000 How to resolve this error Please give any suggestion to tune the query. Thanks best Regards, Adarsh Sharma You should probably be looking at using full-text indexing: http://www.postgresql.org/docs/9.0/static/textsearch.html or limit the size of content for the index. Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
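A sketch of the full-text route, assuming the default english configuration fits the data; the index and query mirror a few of the LIKE terms above:

    CREATE INDEX idx_page_content_fts ON page_content
        USING gin (to_tsvector('english', content));

    SELECT count(*)
      FROM page_content
     WHERE to_tsvector('english', content)
           @@ to_tsquery('english', '(militant | jihad | insurgent) & (kill | injure)');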
Re: [PERFORM] ANTI-JOIN needs table, index scan not possible?
On Fri, Mar 11, 2011 at 06:54:39PM +0100, hans wulf wrote: Thanks for the answer. so there's no way around this problem? A nice index bitmap merge thing would be super fast. Big table ANTI JOIN queries with only a few results expected, are totally broken, if this is true. This way the query breaks my neck. This is a massive downside of postgres which makes this kind of query impossible. Mysql gives you the answer in a few seconds :-( Super! I am glad that MySQL can meet your needs. No software is perfect and you should definitely chose based on your use-case. Regards, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Performance issues
On Mon, Mar 07, 2011 at 03:17:05PM +0100, Andreas For? Tollefsen wrote: Thanks, Ken. It seems like the tip to turn off synchronous_commit did the trick: /usr/lib/postgresql/8.4/bin/pgbench -T 60 test1 starting vacuum...end. transaction type: TPC-B (sort of) scaling factor: 1 query mode: simple number of clients: 1 duration: 60 s number of transactions actually processed: 86048 tps = 1434.123199 (including connections establishing) tps = 1434.183362 (excluding connections establishing) Is this acceptable compared to others when considering my setup? Cheers, Andreas These are typical results for synchronous_commit off. The caveat is you must be able to handle loosing transactions if you have a database crash, but your database is still intact. This differs from turning fsync off in which a crash means you would need to restore from a backup. Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
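Note that synchronous_commit does not have to be disabled cluster-wide; it can be limited to the sessions or transactions that can afford to lose their most recent commits after a crash, for example:

    -- per session:
    SET synchronous_commit = off;

    -- or per transaction:
    BEGIN;
    SET LOCAL synchronous_commit = off;
    -- ... work ...
    COMMIT;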
Re: [HACKERS] [PERFORM] Slow count(*) again...
On Thu, Feb 03, 2011 at 04:39:12PM -0800, da...@lang.hm wrote: On Thu, 3 Feb 2011, Robert Haas wrote: On Thu, Feb 3, 2011 at 3:54 PM, da...@lang.hm wrote: with the current code, this is a completely separate process that knows nothing about the load, so if you kick it off when you start the load, it makes a pass over the table (competing for I/O), finishes, you continue to update the table, so it makes another pass, etc. As you say, this is a bad thing to do. I am saying to have an option that ties the two togeather, essentially making the data feed into the Analyze run be a fork of the data comeing out of the insert run going to disk. So the Analyze run doesn't do any I/O and isn't going to complete until the insert is complete. At which time it will have seen one copy of the entire table. Yeah, but you'll be passing the entire table through this separate process that may only need to see 1% of it or less on a large table. If you want to write the code and prove it's better than what we have now, or some other approach that someone else may implement in the meantime, hey, this is an open source project, and I like improvements as much as the next guy. But my prediction for what it's worth is that the results will suck. :-) I will point out that 1% of a very large table can still be a lot of disk I/O that is avoided (especially if it's random I/O that's avoided) David Lang In addition, the streaming ANALYZE can provide better statistics at any time during the load and it will be complete immediately. As far as passing the entire table through the ANALYZE process, a simple counter can be used to only send the required samples based on the statistics target. Where this would seem to help the most is in temporary tables which currently do not work with autovacuum but it would streamline their use for more complicated queries that need an analyze to perform well. Regards, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [HACKERS] [PERFORM] Slow count(*) again...
On Thu, Feb 03, 2011 at 09:05:52PM -0500, Robert Haas wrote: On Thu, Feb 3, 2011 at 8:37 PM, da...@lang.hm wrote: On Thu, 3 Feb 2011, Robert Haas wrote: On Thu, Feb 3, 2011 at 7:39 PM, ?da...@lang.hm wrote: Yeah, but you'll be passing the entire table through this separate process that may only need to see 1% of it or less on a large table. If you want to write the code and prove it's better than what we have now, or some other approach that someone else may implement in the meantime, hey, this is an open source project, and I like improvements as much as the next guy. ?But my prediction for what it's worth is that the results will suck. ?:-) I will point out that 1% of a very large table can still be a lot of disk I/O that is avoided (especially if it's random I/O that's avoided) Sure, but I think that trying to avoid it will be costly in other ways - you'll be streaming a huge volume of data through some auxiliary process, which will have to apply some algorithm that's very different from the one we use today. ?The reality is that I think there's little evidence that the way we do ANALYZE now is too expensive. ?It's typically very cheap and works very well. ?It's a bit annoying when it fires off in the middle of a giant data load, so we might need to change the time of it a little, but if there's a problem with the operation itself being too costly, this is the first I'm hearing of it. ?We've actually worked *really* hard to make it cheap. I could be misunderstanding things here, but my understanding is that it's 'cheap' in that it has little impact on the database while it is running. I mean that it's cheap in that it usually takes very little time to complete. the issue here is that the workflow is load data analyze start work so the cost of analyze in this workflow is not 1% impact on query speed for the next X time, it's the database can't be used for the next X time while we wait for analyze to finish running OK. I don't understand why the algorithm would have to be so different than what's done today, surely the analyze thread could easily be tweaked to ignore the rest of the data (assuming we don't have the thread sending the data to analyze do the filtering) If you want to randomly pick 10,000 rows out of all the rows that are going to be inserted in the table without knowing in advance how many there will be, how do you do that? Maybe there's an algorithm, but it's not obvious to me. But mostly, I question how expensive it is to have a second process looking at the entire table contents vs. going back and rereading a sample of rows at the end. I can't remember anyone ever complaining ANALYZE took too long to run. I only remember complaints of the form I had to remember to manually run it and I wish it had just happened by itself. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company Probably doomed to be shot down, but since you are effectively inline, you could sample assuming a range of table row counts. Start at the size of a table where random (index) lookups are faster than a sequential scan and then at appropriate multiples, 100x, 100*100X,... then you should be able to generate appropriate statistics. I have not actually looked at how that would happen, but it would certainly allow you to process far, far fewer rows than the entire table. Regards, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Really really slow select count(*)
On Fri, Feb 04, 2011 at 05:20:27PM +0100, felix wrote: reply was meant for the list -- Forwarded message -- From: felix crucialfe...@gmail.com Date: Fri, Feb 4, 2011 at 4:39 PM Subject: Re: [PERFORM] Really really slow select count(*) To: Greg Smith g...@2ndquadrant.com On Fri, Feb 4, 2011 at 3:56 PM, Greg Smith g...@2ndquadrant.com wrote: PostgreSQL version? If you're running on 8.3 or earlier, I would be suspicous that your Free Space Map has been overrun. 8.3 What you are seeing is that the table itself is much larger on disk than it's supposed to be. which part of the explain told you that ? shaun thomas SELECT relpages*8/1024 FROM pg_class WHERE relname='fastadder_fastadderstatus'; 458MB way too big. build_cache is text between 500-1k chars As has been suggested, you really need to CLUSTER the table to remove dead rows. VACUUM will not do that, VACUUM FULL will but will take a full table lock and then you would need to REINDEX to fix index bloat. CLUSTER will do this in one shot. You almost certainly have your free space map way too small, which is how you bloated in the first place. Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
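A sketch of that sequence on 8.3, reusing the size query from the thread; the index name is a guess and the free space map values are only illustrative:

    -- before/after size check:
    SELECT relpages*8/1024 AS size_mb FROM pg_class
     WHERE relname = 'fastadder_fastadderstatus';

    -- one-shot dead-row removal and index rebuild (takes an exclusive lock):
    CLUSTER fastadder_fastadderstatus USING fastadder_fastadderstatus_pkey;
    ANALYZE fastadder_fastadderstatus;

    -- 8.3 only: enlarge the free space map in postgresql.conf so freed space
    -- is remembered between vacuums (requires a restart):
    --   max_fsm_pages = 1000000
    --   max_fsm_relations = 2000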
Re: [HACKERS] [PERFORM] Slow count(*) again...
On Thu, Feb 03, 2011 at 02:11:58AM -0800, da...@lang.hm wrote: On Thu, 3 Feb 2011, Vitalii Tymchyshyn wrote: 02.02.11 20:32, Robert Haas wrote: Yeah. Any kind of bulk load into an empty table can be a problem, even if it's not temporary. When you load a bunch of data and then immediately plan a query against it, autoanalyze hasn't had a chance to do its thing yet, so sometimes you get a lousy plan. Maybe introducing something like an 'AutoAnalyze' threshold will help? I mean that any insert/update/delete statement that changes more than x% of a table (and no less than y records) must do analyze right after it was finished. Defaults like x=50 y=1 should be quite good as for me. If I am understanding things correctly, a full Analyze is going over all the data in the table to figure out patterns. If this is the case, wouldn't it make sense in the situation where you are loading an entire table from scratch to run the Analyze as you are processing the data? If you don't want to slow down the main thread that's inserting the data, you could copy the data to a second thread and do the analysis while it's still in RAM rather than having to read it off of disk afterwards. this doesn't make sense for updates to existing databases, but the use case of loading a bunch of data and then querying it right away isn't _that_ uncommon. David Lang +1 for in-flight ANALYZE. This would be great for bulk loads of real tables as well as temp tables. Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Server Configuration
On Wed, Feb 02, 2011 at 03:15:22PM -0300, Cesar Arrieta wrote: Hi, I have a Server with Fedora Core 11, Tomcat and Postgresql 8.3. With Hardware: * 8GB RAM * 8 processors Intel Xeon E5520 @2.27GHz * 250GB SATA DISK Actually, it serves at most 250 connections. The problem happends when it serves many many connections at a time, tables and queries began to get blocked, then I have to kill some processes in order to allow other people continue working. Wich recommendations could you give me for to configure postgresql.conf, and could it be eficcient to buy another server with almost same hardware in order to use pgPool2 with replication, load balance and parallel query?. It sounds like you may just need a connection pooler (pgpool, pgbouncer) and it might work just fine. Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [HACKERS] [PERFORM] Slow count(*) again...
On Wed, Feb 02, 2011 at 03:54:26PM -0500, Mladen Gogala wrote: Greg Smith wrote: Given that even Oracle kicked out the RBO a long time ago, I'm not so sure longing for those good old days will go very far. I regularly see queries that were tweaked to always use an index run at 1/10 or less the speed of a sequential scan against the same data. The same people complaining all over the place about this topic are also the sort who write them. There are two main fallacies at play here that make this happen: Oracle just gives an impression that RBO is gone. It's actually still there, even in 11.2: Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - Production With the Partitioning, OLAP, Data Mining and Real Application Testing options SQL alter session set optimizer_mode=rule; Session altered. Oracle people were just as puritanical as Postgres people, if not more so. However, the huge backlash made them reconsider the decision. RBO is officially de-supported, obsolete and despised but it is also widely used, even in the Oracle's own SYS schema. Oracle is having huge problems with trying to get people to the cost based optimizer, but they are not yet quite done. This problem in getting people to migrate to the cost-based optimizer seems to stem from the original use of the rule based optimizer and the ability to (mis)hint every option in the DB. If I were running a shop with 100k-1m lines of SQL code with embedded hints, I would run screaming at the QA required to move to the cost-based system. In many ways, the RBO itself + hints is hindering the adoption of the CBO. Are there any stats on the adoption/use of the CBO on new Oracle users/shops? 1) Even if you use an index, PostgreSQL must still retrieve the associated table data to execute the query in order to execute its version of MVCC Of course. Nobody contests that. However, index scans for OLTP are indispensable. Sequential scans just don't do the trick in some situations. 2) The sort of random I/O done by index lookups can be as much as 50X as expensive on standard hard drives as sequential, if every block goes to physical hardware. Greg, how many questions about queries not using an index have you seen? There is a reason why people keep asking that. The sheer number of questions like that on this group should tell you that there is a problem there. There must be a relatively simple way of influencing optimizer decisions. With all due respect, I consider myself smarter than the optimizer. I'm 6'4, 235LBS so telling me that you disagree and that I am more stupid than a computer program, would not be a smart thing to do. Please, do not misunderestimate me. I see them come up regularly. However, there really are not all that many when you consider how many people are using PostgreSQL. Its optimizer works quite well. Knowing how hints can be misused, I would rather have the developers use their resource to improve the optimizer than spend time on a hint system that would be mis-used over and over by beginners, with the attendent posts to HACKERS/PERFORM/NOVICE/... groups. I certainly have had a fun time or two in my limited Oracle experience tracking down a hint-based performance problem, so it works both ways. Regards, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] postgres 9 query performance
On Fri, Jan 28, 2011 at 09:30:19AM -0800, yazan suleiman wrote: I am evaluating postgres 9 to migrate away from Oracle. The following query runs too slow, also please find the explain plan: explain analyze select DISTINCT EVENT.ID, ORIGIN.ID AS ORIGINID,EVENT.PREFERRED_ORIGIN_ID AS PREFERRED_ORIGIN, EVENT.CONTRIBUTOR, ORIGIN.TIME, ORIGIN.LATITUDE, ORIGIN.LONGITUDE, ORIGIN.DEPTH,ORIGIN.EVTYPE, ORIGIN.CATALOG, ORIGIN.AUTHOR OAUTHOR, ORIGIN.CONTRIBUTOR OCONTRIBUTOR, MAGNITUDE.ID AS MAGID, MAGNITUDE.MAGNITUDE,MAGNITUDE.TYPE AS MAGTYPE from event.event left join event.origin on event.id=origin.eventid left join event.magnitude on origin.id=event.magnitude.origin_id WHERE EXISTS(select origin_id from event.magnitude where magnitude.magnitude=7.2 and origin.id=origin_id) order by ORIGIN.TIME desc,MAGNITUDE.MAGNITUDE desc,EVENT.ID ,EVENT.PREFERRED_ORIGIN_ID,ORIGIN.ID Unique (cost=740549.86..741151.42 rows=15039 width=80) (actual time=17791.557..17799.092 rows=5517 loops=1) - Sort (cost=740549.86..740587.45 rows=15039 width=80) (actual time=17791.556..17792.220 rows=5517 loops=1) Sort Key: origin.time, event.magnitude.magnitude, event.id, event.preferred_origin_id, origin.id, event.contributor, origin.latitude, origin.longitude, origin.depth, origin.evtype, origin.catalog, origin.author, origin.contributor, event.magnitude.id, event.magnitude.type Sort Method: quicksort Memory: 968kB - Nested Loop Left Join (cost=34642.50..739506.42 rows=15039 width=80) (actual time=6.927..17769.788 rows=5517 loops=1) - Hash Semi Join (cost=34642.50..723750.23 rows=14382 width=62) (actual time=6.912..17744.858 rows=2246 loops=1) Hash Cond: (origin.id = event.magnitude.origin_id) - Merge Left Join (cost=0.00..641544.72 rows=6133105 width=62) (actual time=0.036..16221.008 rows=6133105 loops=1) Merge Cond: (event.id = origin.eventid) - Index Scan using event_key_index on event (cost=0.00..163046.53 rows=3272228 width=12) (actual time=0.017..1243.616 rows=3276192 loops=1) - Index Scan using origin_fk_index on origin (cost=0.00..393653.81 rows=6133105 width=54) (actual time=0.013..3033.657 rows=6133105 loops=1) - Hash (cost=34462.73..34462.73 rows=14382 width=4) (actual time=6.668..6.668 rows=3198 loops=1) Buckets: 2048 Batches: 1 Memory Usage: 113kB - Bitmap Heap Scan on magnitude (cost=324.65..34462.73 rows=14382 width=4) (actual time=1.682..5.414 rows=3198 loops=1) Recheck Cond: (magnitude = 7.2) - Bitmap Index Scan on mag_index (cost=0.00..321.05 rows=14382 width=0) (actual time=1.331..1.331 rows=3198 loops=1) Index Cond: (magnitude = 7.2) - Index Scan using mag_fkey_index on magnitude (cost=0.00..1.06 rows=3 width=22) (actual time=0.007..0.009 rows=2 loops=2246) Index Cond: (origin.id = event.magnitude.origin_id) Total runtime: 17799.669 ms This query runs in Oracle in 1 second while takes 16 seconds in postgres, The difference tells me that I am doing something wrong somewhere. This is a new installation on a local Mac machine with 12G of RAM. I have: effective_cache_size=4096MB shared_buffer=2048MB work_mem=100MB It sounds like the queries are not doing the same thing. What is the schema/index definition for Oracle versus PostgreSQL? Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Postgres 9.0 has a bias against indexes
On Thu, Jan 27, 2011 at 10:41:08AM -0500, Mladen Gogala wrote: I have a table EMP, with 14 rows and a description like this: scott= \d+ emp Table public.emp Column |Type | Modifiers | Storage | Description --+-+---+--+- empno| smallint| not null | plain| ename| character varying(10) | | extended | job | character varying(9)| | extended | mgr | smallint| | plain| hiredate | timestamp without time zone | | plain| sal | double precision| | plain| comm | double precision| | plain| deptno | smallint| | plain| Indexes: emp_pkey PRIMARY KEY, btree (empno) emp_mgr_i btree (mgr) Foreign-key constraints: fk_deptno FOREIGN KEY (deptno) REFERENCES dept(deptno) Has OIDs: no scott= A recursive query doesn't use existing index on mgr: scott= explain analyze with recursive e(empno,ename,mgr,bossname,level) as ( select empno,ename,mgr,NULL::varchar,0 from emp where empno=7839 union select emp.empno,emp.ename,emp.mgr,e.ename,e.level+1 from emp,e where emp.mgr=e.empno) select * from e; QUERY PLAN - CTE Scan on e (cost=20.59..23.21 rows=131 width=78) (actual time=0.020..0.143 rows=14 loops=1) CTE e - Recursive Union (cost=0.00..20.59 rows=131 width=52) (actual time=0.018..0.128 rows=14 loops=1) - Seq Scan on emp (cost=0.00..1.18 rows=1 width=10) (actual time=0.013..0.015 rows=1 loops=1) Filter: (empno = 7839) - Hash Join (cost=0.33..1.68 rows=13 width=52) (actual time=0.016..0.021 rows=3 loops=4) Hash Cond: (public.emp.mgr = e.empno) - Seq Scan on emp (cost=0.00..1.14 rows=14 width=10) (actual time=0.001..0.004 rows=14 loops=4) - Hash (cost=0.20..0.20 rows=10 width=44) (actual time=0.004..0.004 rows=4 loops=4) Buckets: 1024 Batches: 1 Memory Usage: 1kB - WorkTable Scan on e (cost=0.00..0.20 rows=10 width=44) (actual time=0.001..0.002 rows=4 loops=4) Total runtime: 0.218 ms (12 rows) scott= The optimizer will not use index, not even when I turn off both hash and merge joins. This is not particularly important for a table with 14 rows, but for a larger table, this is a problem. The only way to actually force the use of index is by disabling seqscan, but that chooses a wrong path again, because it reads the outer table by primary key, which will be very slow. Full table scan, done by the primary key is probably the slowest thing around. I know about the PostgreSQL philosophy which says hints are bad, and I deeply disagree with it, but would it be possible to have at least one parameter that would change calculations in such a way that indexes are favored, where they exist? -- Mladen Gogala Sr. Oracle DBA 1500 Broadway New York, NY 10036 (212) 329-5251 www.vmsinfo.com Hi Mladen, PostgreSQL will only use an index if the planner thinks that it will be faster than the alternative, a sequential scan in this case. For 14 rows, a sequential scan is 1 read and should actually be faster than the index. Did you try the query using EXPLAIN ANALYZE once with index and once without? What were the timings? If they do not match reality, adjusting cost parameters would be in order. Regards, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
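One way to run that comparison, using the mgr index from the table definition above; the settings only affect the current session and the cost value is illustrative:

    SET enable_seqscan = off;            -- for testing only, never in production
    EXPLAIN ANALYZE SELECT * FROM emp WHERE mgr = 7839;
    RESET enable_seqscan;

    -- If the index plan really is faster on realistically sized data, lowering
    -- random_page_cost (default 4) is the supported way to favor index scans:
    SET random_page_cost = 2;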
Re: [PERFORM] Slow query + why bitmap index scan??
On Wed, Jan 12, 2011 at 03:21:45PM +0100, Laszlo Nagy wrote: On 2011-01-12 14:42, Florian Weimer wrote: * Laszlo Nagy: This query: select hid from product_price_history where id=35547581 Returns 759 rows in 8837 msec! How can this be that slow??? If most records are on different heap pages, processing this query requires many seeks. 11ms per seek is not too bad if most of them are cache misses. How about this: select id,hdate from product_price_history where id=35547581 -- 759 rows, 8837 ms Query time average: 3 sec. Query plan: Bitmap Heap Scan on product_price_history (cost=13.91..1871.34 rows=474 width=16) Recheck Cond: (id = 35547582) - Bitmap Index Scan on idx_product_price_history_id_hdate (cost=0.00..13.79 rows=474 width=0) Index Cond: (id = 35547582) Why still the heap scan here? All fields in the query are in the index... Wouldn't a simple index scan be faster? (This is only a theoretical question, just I'm curious.) Because of PostgreSQL's MVCC design, it must visit each heap tuple to check its visibility as well as look it up in the index. My first idea to speed things up is to cluster this table regularly. That would convert (most of the) rows into a few pages. Few page reads - faster query. Is it a good idea? Yes, clustering this table would greatly speed up this type of query. Another question. Do you think that increasing shared_mem would make it faster? I doubt it. Currently we have: shared_mem = 6GB work_mem = 512MB total system memory=24GB Total database size about 30GB, but there are other programs running on the system, and many other tables. Thanks, Laszlo Clustering is your best option until we get indexes with visibility information. Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
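A sketch of the clustering suggestion, using the index from the plan above; CLUSTER takes an exclusive lock for the duration and has to be repeated periodically as new rows arrive out of order:

    CLUSTER product_price_history USING idx_product_price_history_id_hdate;
    ANALYZE product_price_history;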
Re: [PERFORM] PostgreSQL 9.0 x64 bit pgbench TPC very low question?
On Thu, Dec 23, 2010 at 09:20:59PM +0700, tuanhoanganh wrote: Could you show me what parameter of pgbouncer.ini can do that. I read pgbouncer and can not make pgbouncer open and keep 200 connect to postgres (Sorry for my English) Thanks you very much. Tuan Hoang ANh You need to use session pooling for that to work. From the man page: In order not to compromise transaction semantics for connection pooling, pgbouncer supports several types of pooling when rotating connections: Session pooling Most polite method. When client connects, a server connection will be assigned to it for the whole duration the client stays connected. When the client disconnects, the server connection will be put back into the pool. This is the default method. Transaction pooling A server connection is assigned to client only during a transaction. When PgBouncer notices that transaction is over, the server connection will be put back into the pool. Statement pooling Most aggressive method. The server connection will be put back into pool immediately after a query completes. Multi-statement transactions are disallowed in this mode as they would break. The fact that pgbouncer will not keep 200 connections open to the database means that you do not have enough work to actually keep 200 permanent connections busy. It is much more efficient to use transaction pooling. You typically want the number of persistent database connections to be a small multiple of the number of CPUs (cores) on your system. Then set pgbouncer to allow as many client connections as you need. This will give you the best throughput and pgbouncer can setup and tear down the connections to your clients much, much faster than making a full connection to the PostgreSQL database. Regards, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
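For reference, the relevant pgbouncer.ini knobs look roughly like this (database name, host, and sizes are placeholders):

    [databases]
    test = host=127.0.0.1 port=5432 dbname=test

    [pgbouncer]
    pool_mode = transaction    ; session | transaction | statement
    default_pool_size = 20     ; server connections kept per database/user pair
    max_client_conn = 200      ; clients can still open 200 connections at a time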
Re: [PERFORM] CPU bound
On Mon, Dec 20, 2010 at 10:33:26AM -0500, James Cloos wrote: MG == Mladen Gogala mladen.gog...@vmsinfo.com writes: MG Good time accounting is the most compelling reason for having a wait MG event interface, like Oracle. Without the wait event interface, one MG cannot really tell where the time is spent, at least not without MG profiling the database code, which is not an option for a production MG database. And how exactly, given that the kernel does not know whether the CPU is active or waiting on ram, could an application do so? Exactly. I have only seen this data from hardware emulators. It would be nice to have... :) Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Compared MS SQL 2000 to Postgresql 9.0 on Windows
On Tue, Dec 07, 2010 at 11:56:51AM -0800, Richard Broersma wrote: On Tue, Dec 7, 2010 at 11:43 AM, Andy Colson a...@squeakycode.net wrote: In PG the first statement you fire off (like an insert into for example) will start a transaction. ?If you dont commit before you disconnect that transaction will be rolled back. ?Even worse, if your program does not commit, but keeps the connection to the db open, the transaction will stay open too. Huh - is this new? I always thought that every statement was wrapped in its own transaction unless you explicitly start your own. So you shouldn't need to commit before closing a connection if you never opened a transaction to begin with. -- Regards, Richard Broersma Jr. The default of autocommit unless explicitly starting a transaction with BEGIN is the normal behavior that I have seen as well. Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Update problem on large table
On Mon, Dec 06, 2010 at 03:24:31PM -0500, Josh Kupershmidt wrote: On Mon, Dec 6, 2010 at 2:48 PM, Jon Nelson jnelson+pg...@jamponi.net wrote: On Mon, Dec 6, 2010 at 1:46 PM, bricklen brick...@gmail.com wrote: Not sure if anyone replied about killing your query, but you can do it like so: select pg_cancel_backend(5902); ?-- assuming 5902 is the pid of the query you want canceled. How does this differ from just killing the pid? pg_cancel_backend(5902) does the same thing as: kill -SIGINT 5902 Josh Yes, but you can use it from within the database. The kill command requires shell access to the backend. Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
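If you do not already know the pid, it can also be looked up from within the database (the column names shown are the pre-9.2 ones, procpid/current_query):

    SELECT procpid, usename, current_query, query_start
      FROM pg_stat_activity
     ORDER BY query_start;

    SELECT pg_cancel_backend(5902);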
Re: [PERFORM] SELECT INTO large FKyed table is slow
On Wed, Dec 01, 2010 at 12:15:19PM -0500, Mladen Gogala wrote: Mario Splivalo wrote: I'll try what Pierre suggested, on whole new filesystem. This one did get quite filled with thousands of files that I deleted while the database was working. Mario Yes, that is a good idea. That's the reason why we need a defragmentation tool on Linux. Unfortunately, the only file system that currently has a decent defragmentation tool is XFS and that is a paid option, at least with Red Hat. Greg Smith has recently posted a wonderful review of PostgreSQL on various file systems: http://blog.2ndquadrant.com/en/2010/04/the-return-of-xfs-on-linux.html There is a operating system which comes with a very decent extent based file system and a defragmentation tool, included in the OS. The file system is called NTFS and company is in the land of Redmond, WA where the shadows lie. One OS to rule them all... -- Mladen Gogala Sr. Oracle DBA 1500 Broadway New York, NY 10036 (212) 329-5251 http://www.vmsinfo.com The Leader in Integrated Media Intelligence Solutions Redhat6 comes with ext4 which is an extent based filesystem with decent performance. Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] MVCC performance issue
On Fri, Nov 12, 2010 at 03:47:30PM +0200, Kyriacos Kyriacou wrote: This is my first post in this mailing list and I would like to raise an issue that in my opinion is causing performance issues of PostgreSQL especially in a transaction processing environment. In my company we are using PostgreSQL for the last 8 year for our in-house developed billing system (telecom). The last few months we started considering moving to another RDBMS just because of this issue. After all these years, I believe that the biggest improvement that could be done and will boost overall performance especially for enterprise application will be to improve Multiversion Concurrency Control (MVCC) mechanism. In theory this seems to be improving performance for SELECT queries but on tables with very intensive and frequent updates, even that is not fully true because of the fragmentation of data caused by MVCC. I saw cases were a SELECT COUNT(*) on an empty (!!!) table (used as a buffer) took more than 40min to return a result! VACUUM is not a solution in my opinion even though after the introduction of autovacuum daemon situation got much better. PROBLEM DECRIPTION -- By definition of MVCC, when an UPDATE is performed, PostgreSQL creates a new copy of the row in a new location. Any SELECT queries within the same session are accessing the new version of the raw and all other queries from other users are still accessing the old version. When transaction is COMMIT PostgreSQL makes the a new version of the row as the active row and expires the old row that remains dead and then is up to VACUUM procedure to recover the dead rows space and make it available to the database engine. In case that transaction is ROLLBACK then space reserved for the new version of the row is released. The result is to have huge fragmentation on table space, unnecessary updates in all affected indexes, unnecessary costly I/O operations, poor performance on SELECT that retrieves big record sets (i.e. reports etc) and slower updates. As an example, consider updating the live balance of a customer for each phone call where the entire customer record has to be duplicated again and again upon each call just for modifying a numeric value! SUGGESTION -- 1) When a raw UPDATE is performed, store all new raw versions either in separate temporary table space or in a reserved space at the end of each table (can be allocated dynamically) etc 2) Any SELECT queries within the same session will be again accessing the new version of the row 3) Any SELECT queries from other users will still be accessing the old version 4) When UPDATE transaction is ROLLBACK just release the space used in new temporary location 5) When UPDATE transaction is COMMIT then try to LOCK the old version and overwrite it at the same physical location (NO FRAGMENTATION). 6) Similar mechanism can be applied on INSERTS and DELETES 7) In case that transaction was COMMIT, the temporary location can be either released or archived/cleaned on a pre-scheduled basis. This will possibly allow the introduction of a TRANSACTION LOG backup mechanism as a next step. 8) After that VACUUM will have to deal only with deletions!!! I understand that my suggestion seems to be too simplified and also that there are many implementation details and difficulties that I am not aware. I strongly believe that the outcome of the discussion regarding this issue will be helpful. 
Best Regards, Kyriacos Kyriacou Senior Developer/DBA I cannot speak to your suggestion, but it sounds like you are not vacuuming enough and a lot of the bloat/randomization would be helped by making use of HOT updates in which the updates are all in the same page and are reclaimed almost immediately. Regards, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] MVCC performance issue
On Fri, Nov 12, 2010 at 07:34:36AM -0800, bricklen wrote: On Fri, Nov 12, 2010 at 5:52 AM, Kenneth Marshall k...@rice.edu wrote: I cannot speak to your suggestion, but it sounds like you are not vacuuming enough and a lot of the bloat/randomization would be helped by making use of HOT updates in which the updates are all in the same page and are reclaimed almost immediately. Regards, Ken IIRC, HOT only operates on non-indexed columns, so if you the tables are heavily indexed you won't get the full benefit of HOT. I could be wrong though. That is true, but if they are truly having as big a bloat problem as the message indicated, it would be worth designing the schema to leverage HOT for the very frequent updates. Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
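On 8.3 or later, a sketch of what designing the schema to leverage HOT can look like for the per-call balance update described earlier; the table and column names are invented:

    -- Leave ~30% of each heap page free so a new row version can stay on the
    -- same page (a table rewrite such as CLUSTER is needed before the new
    -- fillfactor applies to existing pages):
    ALTER TABLE customer_balance SET (fillfactor = 70);

    -- HOT only applies when none of the updated columns are indexed, so the
    -- frequently changed balance column itself should carry no index:
    UPDATE customer_balance SET balance = balance - 0.05 WHERE customer_id = 42;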
Re: [PERFORM] MVCC performance issue
Ah, this is a very old version. If you can take advantage of a version with HOT support, you should be much, much happier. Cheers, Ken On Fri, Nov 12, 2010 at 06:14:00PM +0200, Kyriacos Kyriacou wrote: We are still using PostgreSQL 8.2.4. We are running a 24x7 system and database size is over 200Gb so upgrade is not an easy decision! I have it in my plans so in next few months I will setup new servers and upgrade to version 9. Which version of PostgreSQL are you basing this on? -- Thom Brown Twitter: @darkixion IRC (freenode): dark_ixion Registered Linux user: #516935 -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] anti-join chosen even when slower than old plan
On Wed, Nov 10, 2010 at 10:47:21PM -0500, Robert Haas wrote: On Wed, Nov 10, 2010 at 6:07 PM, Tom Lane t...@sss.pgh.pa.us wrote: Kevin Grittner kevin.gritt...@wicourts.gov writes: Robert Haas robertmh...@gmail.com wrote: Unfortunately, to know how much data we're going to grovel through, we need to know the plan; and to decide on the right plan, we need to know how much data we're going to grovel through. And that's where they've been ending. The only half-sane answer I've thought of is to apply a different cost to full-table or full-index scans based on the ratio with effective cache size. Kevin, yes, good point. Bravo! Let's do that. Details TBD, but suppose effective_cache_size = 1GB. What we know for sure is that a 4 GB table is not going to be fully cached but a 4 MB table may well be. In fact, I think we should assume that the 4 MB table IS cached, because the point is that if it's used at all, it soon will be. It's almost certainly a bad idea to build a plan around the idea of minimizing reads from that 4 MB table in favor of doing a substantial amount of additional work somewhere else. I suppose this could break down if you had hundreds and hundreds of 4 MB tables all of which were accessed regularly, but that's an unusual situation, and anyway it's not clear that assuming them all uncached is going to be any better than assuming them all cached. This might have some connection to some rather half-baked ideas I've been having in connection with the generalized-inner-indexscan problem. I don't have anything in the way of a coherent presentation to make yet, but the thing I'm being forced to realize is that sane modeling of a complex subplan that's on the inside of a nestloop join requires treating *every* scan type as having different costs the first time versus during rescan. ?If the total amount of data touched in the query is less than effective_cache_size, it's not unreasonable to suppose that I/O costs during rescan might be zero, even for a seqscan or a non-parameterized indexscan. ?In fact, only parameterized indexscans would be able to touch pages they'd not touched the first time, and so they ought to have higher not lower rescan costs in this environment. But once the total data volume exceeds effective_cache_size, you have to do something different since you shouldn't any longer assume the data is all cached from the first scan. ?(This isn't quite as hard as the case you're talking about, since I think the relevant data volume is the sum of the sizes of the tables used in the query; which is easy to estimate at the start of planning, unlike the portion of the tables that actually gets touched.) Well, we don't want the costing model to have sharp edges. effective_cache_size can't be taken as much more than an educated guess, and what actually happens will depend a lot on what else is going on on the system. If only one query is running on a system at a time and it is repeatedly seq-scanning a large table, the cost of reading pages in will be very small until the table grows large enough that you can't fit the whole thing in memory at once, and then will abruptly go through the roof. But realistically you're not going to know exactly where that edge is going to be, because you can't predict exactly how much concurrent activity there will be, for example, or how much work_mem allocations will push out of the OS buffer cache. 
So I'm thinking we should start the costs at something like 0.05/0.05 for tables that are much smaller than effective_cache_size and ramp up to 4/1 for tables that are larger than effective_cache_size. Maybe just by linearly ramping up, although that has a certain feeling of being without mathematical soundness. An idea that isn't even half-baked yet is that once we had a cost model like that, we might be able to produce plans that are well-tuned for a heavily cached environment by applying the rescan cost model even to the first scan for a particular query. So that might lead to some sort of assume_cached GUC parameter, and perhaps Kevin could tune his reporting queries by turning that off instead of messing with individual cost constants. I think the real goal here should be to try to avoid needing a GUC. A lot of people could benefit if the system could make some attempt to recognize on its own which queries are likely to be cached. We already have parameters you can hand-tune for each query as necessary. Being able to set some parameters system-wide and then get sensible behavior automatically would be much nicer. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company I agree with the goal of avoiding the need for a GUC. This needs to be as automatic as possible. One idea I had had was computing a value for the amount of cache data in the system by keeping a sum or a weighted sum of the table usage in the system. Smaller tables and indexes would contribute a smaller amount to the total, while larger indexes and tables would contribute a larger amount. Then by comparing this running total to the effective_cache_size, set the random and sequential costs for a query. This would allow the case of many 4MB tables to favor disk I/O more than memory I/O. The weighting could be a function of simultaneous users of the table. I know this is a bit of hand-waving but some sort of dynamic feedback needs to be provided to the planning process as system use increases. Regards, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] CREATE INDEX as bottleneck
On Thu, Nov 11, 2010 at 02:41:12PM +0100, Marc Mamin wrote: Hello, in the last years, we have successfully managed to cope with our data growth using partitioning and splitting large aggregation tasks on multiple threads. The partitioning is done logically by our application server, thus avoiding trigger overhead. There are a few places in our data flow where we have to wait for index creation before being able to distribute the process on multiple threads again. With the expected growth, create index will probably become a severe bottleneck for us. Is there any chance to see major improvement on it in the medium term? I guess the question is naive, but why can't postgres use multiple threads for large sort operations? best regards, Marc Mamin There has been a recent discussion on the hackers mailing list on using the infrastructure that is already in place to launch autovacuum processes to also launch other helper processes. Something like this could be used to offload the sort process to a much more parallelized version that could take advantage of multiple I/O streams and CPU cores. Many things are possible given the appropriate resources: funding, coding and development cycles... Regards, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] anti-join chosen even when slower than old plan
On Thu, Nov 11, 2010 at 09:15:58AM -0500, Mladen Gogala wrote: Kenneth Marshall wrote: I agree with the goal of avoiding the need for a GUC. This needs to be as automatic as possible. One idea I had had was computing a value for the amount of cache data in the system by keeping a sum or a weighted sum of the table usage in the system. Smaller tables and indexes would contribute a smaller amount to the total, while larger indexes and tables would contribute a larger amount. Then by comparing this running total to the effective_cache_size, set the random and sequential costs for a query. This would allow the case of many 4MB tables to favor disk I/O more than memory I/O. The weighting could be a function of simultaneous users of the table. I know this is a bit of hand-waving but some sort of dynamic feedback needs to be provided to the planning process as system use increases. Regards, Ken Kenneth, you seem to be only concerned with the accuracy of the planning process, not with the plan stability. As a DBA who has to monitor real world applications, I find things like an execution plan changing with the use of the system to be my worst nightmare. The part where you say that this needs to be as automatic as possible probably means that I will not be able to do anything about it, if the optimizer, by any chance, doesn't get it right. That looks to me like an entirely wrong way to go. When application developer tunes the SQL both him and me expect that SQL to always perform that way, not to change the execution plan because the system is utilized more than it was 1 hour ago. Nobody seems to have taken my suggestion about having a parameter which would simply invent the percentage out of thin air seriously, because it's obviously not accurate. However, the planner accuracy is not the only concern. Running applications on the system usually requires plan stability. Means of external control of the execution plan, DBA knobs and buttons that can be turned and pushed to produce the desired plan are also very much desired. -- Mladen Gogala Sr. Oracle DBA 1500 Broadway New York, NY 10036 (212) 329-5251 www.vmsinfo.com Hi Mladen, I think in many ways, this is the same problem. Because we are not correctly modeling the system, the plan choices are not accurate either for some scenarios. This means that when plan costs are compared, the evaluation is not accurate. This is what causes the terrible plans being right next to the good plans and is what impacts the plan stability. If the costs are correct, then in fact the plan stability will be much better with the better costing, not worse. Plans with close costs should actually have close performance. Regards, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] anti-join chosen even when slower than old plan
On Thu, Nov 11, 2010 at 03:56:25PM -0500, gnuo...@rcn.com wrote: On a thread some time ago, on a similar subject, I opined that I missed the ability to assign tables to tablespaces and buffers to tablespaces, thus having the ability to isolate needed tables (perhaps a One True Lookup Table, for example; or a Customer table) to memory without fear of eviction. I was soundly beaten about the face and breast. It really is an Enterprise way of handling the situation. regards, Robert ALTER TABLE can be used to change the tablespace of a table and/or index. Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
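A minimal sketch of that, with made-up names and paths; note that this only controls placement on disk, not residency in the buffer cache:

CREATE TABLESPACE fast_disk LOCATION '/mnt/fast/pgdata';
ALTER TABLE customer SET TABLESPACE fast_disk;
ALTER INDEX customer_pkey SET TABLESPACE fast_disk;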
Re: [PERFORM] Bufer cache replacement LRU algorithm?
Mladen, You would need to check the mailing lists. The release notes have it as being a clock sweep algorithm starting in version 8. Then additional changes were added to eliminate the cache blowout caused by a sequential scan and by vacuum/autovacuum. I do not believe that there are any parameters available other than total size of the pool and whether sequential scans are synchronized. Regards, Ken On Wed, Nov 03, 2010 at 12:35:33PM -0400, Mladen Gogala wrote: Where can I find the documentation describing the buffer replacement policy? Are there any parameters governing the page replacement policy? -- Mladen Gogala Sr. Oracle DBA 1500 Broadway New York, NY 10036 (212) 329-5251 http://www.vmsinfo.com The Leader in Integrated Media Intelligence Solutions -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
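The two knobs mentioned can be inspected directly from any session, for example:

SHOW shared_buffers;          -- total size of the shared buffer pool
SHOW synchronize_seqscans;    -- whether concurrent sequential scans piggyback on one another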
Re: [PERFORM] Select count(*), the sequel
On Wed, Oct 27, 2010 at 05:49:42PM -0400, Tom Lane wrote: Kenneth Marshall k...@rice.edu writes: Just keeping the hope alive for faster compression. Is there any evidence that that's something we should worry about? I can't recall ever having seen a code profile that shows the pg_lzcompress.c code high enough to look like a bottleneck compared to other query costs. Now, the benefits of 2X or 3X space savings would be pretty obvious, but I've seen no evidence that we could easily achieve that either. regards, tom lane One use is to allow substr() on toasted values without needing to decompress the entire contents. Another possibility is to keep larger fields compressed in memory for some value of larger. With faster compression, it might by useful to compress the WAL files to support faster data rates and therefore update rates for the same hardware. And there are always the in page common substring storage optimizations to reduce index/table sizes. Regards, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
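For context, sliced access is already cheap today when a value is stored out of line uncompressed; faster compression is what would make the same trick attractive for compressed values. A small example with made-up table and column names:

ALTER TABLE messages ALTER COLUMN body SET STORAGE EXTERNAL;  -- out-of-line, uncompressed storage
SELECT substr(body, 1, 200) FROM messages WHERE id = 42;      -- fetches only the TOAST chunks it needs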
Re: [PERFORM] Select count(*), the sequel
On Wed, Oct 27, 2010 at 09:52:49PM +0200, Pierre C wrote: Even if somebody had a great idea that would make things smaller without any other penalty, which I'm not sure I believe either. I'd say that the only things likely to bring an improvement significant enough to warrant the (quite large) hassle of implementation would be: - read-only / archive tables (get rid of row header overhead) - in-page compression using per-column delta storage for instance (no random access penalty, but hard to implement, maybe easier for read-only tables) - dumb LZO-style compression (license problems, needs parallel decompressor, random access penalty, hard to implement too) Different algorithms have been discussed before. A quick search turned up: quicklz - GPL or commercial fastlz - MIT works with BSD okay zippy - Google - no idea about the licensing lzf - BSD-type lzo - GPL or commercial zlib - current algorithm Of these, lzf can compress at almost 3.7X the speed of zlib and decompress at 1.7X the speed, and fastlz can compress at 3.1X the speed of zlib and decompress at 1.9X. The same comparison puts lzo at 3.0X for compression and 1.8X for decompression. The block design of lzf/fastlz may be useful to support substring access to toasted data among other ideas that have been floated here in the past. Just keeping the hope alive for faster compression. Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Periodically slow inserts
Hi, There are a lot of details missing about your system: http://wiki.postgresql.org/wiki/SlowQueryQuestions Cheers, Ken On Thu, Oct 21, 2010 at 02:25:44PM +0200, Gael Le Mignot wrote: Hello, We are using PostgreSQL for storing data and full-text search indexes for the webiste of a daily newspaper. We are very happy overall with the results, but we have one weird behaviour that we would like to solve. ... -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Select count(*), the sequel
Hi, Interesting data points. The amount of rows that you managed to insert into PostgreSQL before Oracle gave up the ghost is 95% of the rows in the Oracle version of the database. To count 5% fewer rows, it took PostgreSQL 24 seconds longer. Or adjusting for the missing rows, 52 seconds longer for the entire table or 18% longer than the full table scan in Oracle. This seems to be well within the table layout size differences, possibly due to the fillfactor used --not really bad at all. Now the timings due to algorithm changes are interesting as indicating the room for improvement due to those type of changes. A parallel sequential full-table scan in PostgreSQL could provide the same speed up. Currently that is not possible ... but development continues a pace... In fact, developing such functions in PostgreSQL could end up being less expensive long-term than licensing Oracle RAC. I think the point that you have helped make is that PostgreSQL performs very well for many use cases that have typically been relegated to expensive commecial databases such as Oracle, DB2,... Regards, Ken On Sat, Oct 16, 2010 at 12:53:50PM -0400, Mladen Gogala wrote: There was some doubt as for the speed of doing the select count(*) in PostgreSQL and Oracle. To that end, I copied the most part of the Oracle table I used before to Postgres. Although the copy wasn't complete, the resulting table is already significantly larger than the table it was copied from. The result still shows that Oracle is significantly faster: Oracle result: SQL alter system flush buffer_cache; System altered. SQL select /*+ full(NO) noparallel */ count(*) from ni_occurrence no; COUNT(*) -- 402062638 Elapsed: 00:03:16.45 Hints are necessary because Oracle table is declared as parallel and I didn't want the PK index to be used for counting. Oracle has a good habit of using PK's for counting, if available. SQL select bytes/1048576 as MB 2 from user_segments 3 where segment_name='NI_OCCURRENCE'; MB -- 35329 Elapsed: 00:00:00.85 SQL So, oracle stores 402 million records in 35GB and counts them in 3 minutes 16.45 seconds The very same table was partially copied to Postgres, copying died with ORA-01555 snapshot too old sometimes this morning. I ran vacuumdb -f -z on the database after the copy completed and the results are below. mgogala=# select count(*) from ni_occurrence; count --- 382400476 (1 row) Time: 221716.466 ms mgogala=# mgogala=# select 221/60::real; ?column? -- 3.68 (1 row) Time: 0.357 ms mgogala=# mgogala=# select pg_size_pretty(pg_table_size('ni_occurrence')); pg_size_pretty 46 GB (1 row) Time: 0.420 ms mgogala=# The database wasn't restarted, no caches were flushed, the comparison was done with a serious advantage for PostgreSQL. Postgres needed 3.68 minutes to complete the count which is about the same Oracle but still somewhat slower. Also, I am worried about the sizes. Postgres table is 11GB larger than the original, despite having less data. That was an unfair and unbalanced comparison because Oracle's cache was flushed and Oracle was artificially restrained to use the full table scan without the aid of parallelism. 
Here is the same result, with no hints and the autotrace on, which shows what happens if I turn the hints off: SQL select count(*) from ni_occurrence no; COUNT(*) -- 402062638 Elapsed: 00:00:52.61 Execution Plan -- Plan hash value: 53476935 | Id | Operation | Name | Rows | Cost (%CPU)| Time |TQ |IN-OUT| PQ Distrib | | 0 | SELECT STATEMENT | |1 | 54001 (19)| 00:01:08 | | | | | 1 | SORT AGGREGATE | |1 | | | | | | | 2 | PX COORDINATOR | | | | | | | | | 3 |PX SEND QC (RANDOM) | :TQ1 |1 | | | Q1,00 | P-S | QC (RAND) | | 4 | SORT AGGREGATE | |1 | | | Q1,00 | PCWP | | | 5 | PX BLOCK ITERATOR | | 402M| 54001 (19)| 00:01:08 | Q1,00 | PCWC | | | 6 | INDEX FAST FULL SCAN| IDX_NI_OCCURRENCE_PID | 402M| 54001 (19)| 00:01:08 | Q1,00 | PCWP | | It took just 52 seconds to count everything, but Oracle didn't
Re: [PERFORM] Query much faster with enable_seqscan=0
You DB is more than likely cached. You should adjust your page costs to better reflect reality and then the planner can make more accurate estimates and then choose the proper plan. Cheers, Ken On Tue, Sep 21, 2010 at 12:32:01PM -0500, Ogden wrote: Hello, I have received some help from the IRC channel, however, the problem still exists. When running the following query with enable_seqscan set to 0, it takes less than a second, whereas with it set to 1, the query returns in 14 seconds. The machine itself has 8GB Ram and is running PostgreSQL 9.0 on Debian Lenny. The database size is about 7GB. Query: SELECT tr.id, tr.sid FROM test_registration tr, INNER JOIN test_registration_result r on (tr.id = r.test_registration_id) WHERE. tr.test_administration_id='32a22b12-aa21-11df-a606-96551e8f4e4c'::uuid GROUP BY tr.id, tr.sid demo=# \d test_registration Table public.test_registration Column |Type | Modifiers +-+ id | uuid| not null sid| character varying(36) | not null created_date | timestamp without time zone | not null default now() modified_date | timestamp without time zone | not null test_administration_id | uuid| not null teacher_number | character varying(15) | test_version_id| uuid| Indexes: test_registration_pkey PRIMARY KEY, btree (id) test_registration_sid_key UNIQUE, btree (sid, test_administration_id) test_registration_teacher btree (teacher_number) test_registration_test_id btree (test_administration_id) demo=# \d test_registration_result Table public.test_registration_result Column| Type | Modifiers --+---+--- answer | character varying(15) | question_id | uuid | not null score| double precision | test_registration_id | uuid | not null Indexes: test_registration_result_pkey PRIMARY KEY, btree (question_id, test_registration_id) test_registration_result_answer btree (test_registration_id, answer, score) test_registration_result_test btree (test_registration_id) Explain Analyze: -- HashAggregate (cost=951169.97..951198.37 rows=2840 width=25) (actual time=14669.039..14669.843 rows=2972 loops=1) - Hash Join (cost=2988.07..939924.85 rows=2249024 width=25) (actual time=551.464..14400.061 rows=638980 loops=1) Hash Cond: (r.test_registration_id = tr.id) - Seq Scan on test_registration_result r (cost=0.00..681946.72 rows=37199972 width=16) (actual time=0.015..6073.101 rows=37198734 loops=1) - Hash (cost=2952.57..2952.57 rows=2840 width=25) (actual time=2.516..2.516 rows=2972 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 160kB - Bitmap Heap Scan on test_registration tr (cost=44.29..2952.57 rows=2840 width=25) (actual time=0.528..1.458 rows=2972 loops=1) Recheck Cond: (test_administration_id = 'e26a165a-c19f-11df-be2f-778af560e5a2'::uuid) - Bitmap Index Scan on test_registration_test_administration_id (cost=0.00..43.58 rows=2840 width=0) (actual time=0.507..0.507 rows=2972 loops=1) Index Cond: (test_administration_id = 'e26a165a-c19f-11df-be2f-778af560e5a2'::uuid) Total runtime: 14670.337 ms (11 rows) real 0m14.698s user 0m0.000s sys 0m0.008s With set enable_seqscan=0; SET QUERY PLAN - HashAggregate (cost=1225400.19..1225428.59 rows=2840 width=25) (actual time=748.397..749.160 rows=2972 loops=1) - Nested Loop (cost=0.00..1214155.07 rows=2249024 width=25) (actual time=0.107..465.165 rows=638980 loops=1) - Index Scan using test_registration_test_administration_id on test_registration tr (cost=0.00..4413.96 rows=2840 width=25) (actual time=0.050..1.610 rows=2972 loops=1) Index Cond: (test_administration_id = 'e26a165a-c19f-11df-be2f-778af560e5a2'::uuid) - Index Scan using
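A quick way to test the advice above without touching postgresql.conf is to adjust the costs for one session and re-run the EXPLAIN ANALYZE; the values below are only a starting guess for an 8GB machine with a mostly cached 7GB database:

SET effective_cache_size = '6GB';
SET random_page_cost = 2;   -- closer to seq_page_cost when most reads come from cache
-- then re-run the EXPLAIN ANALYZE above and compare the chosen plan and runtime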
Re: [PERFORM] now() gives same time within the session
On Mon, Jul 12, 2010 at 06:11:31AM -0700, Rob Wultsch wrote: On Mon, Jul 12, 2010 at 4:15 AM, A. Kretschmer andreas.kretsch...@schollglas.com wrote: In response to atul.g...@globaldatapoint.com : Hi, I need to log the start and end time of the procedures in a table. But the start and end time are the same. This is how I recreated the issue. create table test_time (time timestamp); delete from test_time; insert into test_time select now(); Use timeofday() instead, now() returns the transaction starting time. Is this part of the SQL standard? No, see section 9.9.4 of the manual. Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
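A quick demonstration of the difference; clock_timestamp() behaves like timeofday() but returns a timestamp rather than text:

BEGIN;
SELECT now() AS txn_start, clock_timestamp() AS wall_clock;
SELECT pg_sleep(2);
SELECT now() AS txn_start, clock_timestamp() AS wall_clock;  -- now() is unchanged, clock_timestamp() has advanced
COMMIT;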
Re: [PERFORM] Highly Efficient Custom Sorting
Hi Eliot, Would you mind posting your code for reference. It is nice to have working examples when trying to figure out how it all fits together. Regards, Ken On Wed, Jul 07, 2010 at 03:23:12PM -0400, Eliot Gable wrote: Thanks again for all the input and suggestions from people. I have this sorting algorithm re-implemented in C now and it is somewhere 2ms to run it now; though it is difficult to get a more accurate measure. There may be some additional optimizations I can come up with, but for now, this will work very well compared to the alternative methods. On Tue, Jul 6, 2010 at 6:21 PM, Tom Lane t...@sss.pgh.pa.us wrote: Eliot Gable egable+pgsql-performa...@gmail.comegable%2bpgsql-performa...@gmail.com writes: Do I need to somehow force the server to unload and then re-load this .so file each time I build a new version of it? If so, how do I do that? Start a new database session. regards, tom lane -- Eliot Gable We do not inherit the Earth from our ancestors: we borrow it from our children. ~David Brower I decided the words were too conservative for me. We're not borrowing from our children, we're stealing from them--and it's not even considered to be a crime. ~David Brower Esse oportet ut vivas, non vivere ut edas. (Thou shouldst eat to live; not live to eat.) ~Marcus Tullius Cicero -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Write performance
On Thu, Jun 24, 2010 at 02:43:33PM +0200, Janning wrote: Hi, at the moment we encounter some performance problems with our database server. We have a 12 GB RAM machine with intel i7-975 and using 3 disks Seagate Barracuda 7200.11, ST31500341AS (1.5 GB) One disk for the system and WAL etc. and one SW RAID-0 with two disks for postgresql data. Our database is about 24GB. Our munin graph reports at 9:00 a clock writes of 3000 blocks per second and reads of about 1000 blocks per second on our disk which holds the data directories of postgresql (WAL are on a different disk) 3000 blocks ~ about 3 MB/s write 1000 blocks ~ about 1 MB/s read At the same time we have nearly 50% CPU I/O wait and only 12% user CPU load (so 4 of 8 cpu cores are in use for io wait) We know, its a poor man disk setup (but we can not find a hoster with rather advanced disk configuration at an affordable price). Anyway, we ran some tests on it: # time sh -c dd if=/dev/zero of=bigfile bs=8k count=300 sync 300+0 records in 300+0 records out 2457600 bytes (25 GB) copied, 276.03 s, 89.0 MB/s real 4m48.658s user 0m0.580s sys 0m51.579s # time dd if=bigfile of=/dev/null bs=8k 300+0 records in 300+0 records out 2457600 bytes (25 GB) copied, 222.841 s, 110 MB/s real 3m42.879s user 0m0.468s sys 0m18.721s Of course, writing large chunks is quite a different usage pattern. But I am wondering that writing 3MB/s and reading 1 MB/s seams to be a limit if i can run a test with 89 MB/s writing and 110MB/s reading. Can you give some hints, if this numbers seems to be reasonable? kind regards Janning Yes, these are typical random I/O versus sequential I/O rates for hard drives. Your I/O is extremely under-powered relative to your CPU/memory. For DB servers, many times you need much more I/O instead. Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] requested shared memory size overflows size_t
On Fri, Jun 18, 2010 at 12:46:11AM +0100, Tom Wilcox wrote: On 17/06/2010 22:41, Greg Smith wrote: Tom Wilcox wrote: Any suggestions for good monitoring software for linux? By monitoring, do you mean for alerting purposes or for graphing purposes? Nagios is the only reasonable choice for the former, while doing at best a mediocre job at the latter. For the later, I've found that Munin does a good job of monitoring Linux and PostgreSQL in its out of the box configuration, in terms of providing useful activity graphs. And you can get it to play nice with Nagios. Thanks Greg. Ill check Munin and Nagios out. It is very much for graphing purposes. I would like to be able to perform objective, platform-independent style performance comparisons. Cheers, Tom Zabbix-1.8+ is also worth taking a look at and it can run off our favorite database. It allows for some very flexible monitoring and trending data collection. Regards, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] slow query performance
Hi Anj, That is an indication that your system was less correctly modeled with a random_page_cost=2 which means that the system will assume that random I/O is cheaper than it is and will choose plans based on that model. If this is not the case, the plan chosen will almost certainly be slower for any non-trivial query. You can put a 200mph speedometer in a VW bug but it will never go 200mph. Regards, Ken On Thu, Jun 10, 2010 at 07:54:01PM -0700, Anj Adu wrote: I changed random_page_cost=4 (earlier 2) and the performance issue is gone I am not clear why a page_cost of 2 on really fast disks would perform badly. Thank you for all your help and time. On Thu, Jun 10, 2010 at 8:32 AM, Anj Adu fotogra...@gmail.com wrote: Attached Thank you On Thu, Jun 10, 2010 at 6:28 AM, Robert Haas robertmh...@gmail.com wrote: On Wed, Jun 9, 2010 at 11:17 PM, Anj Adu fotogra...@gmail.com wrote: The plan is unaltered . There is a separate index on theDate as well as one on node_id I have not specifically disabled sequential scans. Please do SHOW ALL and attach the results as a text file. This query performs much better on 8.1.9 on a similar sized table.(althought the random_page_cost=4 on 8.1.9 and 2 on 8.4.0 ) Well that could certainly matter... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Need to increase performance of a query
On Thu, Jun 10, 2010 at 12:34:07PM -0700, Anne Rosset wrote: Jochen Erwied wrote: Thursday, June 10, 2010, 8:36:08 PM you wrote: psrdb=# (SELECT psrdb(#MAX(item_rank.rank) AS maxRank psrdb(# FROM psrdb(#item_rank item_rank psrdb(# WHERE psrdb(#item_rank.project_id='proj2783' psrdb(# AND item_rank.pf_id IS NULL psrdb(# psrdb(# ) psrdb-# ORDER BY psrdb-# maxRank DESC; Don't think it does really matter, but why do you sort a resultset consisting of only one row? Sorry, I should have removed the ORDER by (the full query has a union). So without the ORDER by, here are the results: psrdb=# SELECT psrdb-#MAX(item_rank.rank) AS maxRank psrdb-# FROM psrdb-#item_rank item_rank psrdb-# WHERE psrdb-#item_rank.pf_id='plan1408'; maxrank - 2050400 (1 row) Time: 1.516 ms psrdb=# SELECT psrdb-#MAX(item_rank.rank) AS maxRank psrdb-# FROM psrdb-#item_rank item_rank psrdb-# WHERE psrdb-#item_rank.project_id='proj2783' psrdb-# AND item_rank.pf_id IS NULL; maxrank - 202 (1 row) Time: 13.177 ms Is there anything that can be done for the second one? Thanks, Anne What about an IS NULL index on pf_id? Regards, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
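A sketch of such an index, using the column names from the query above; the index name and the inclusion of rank are assumptions:

CREATE INDEX item_rank_project_nopf_idx
    ON item_rank (project_id, rank)
    WHERE pf_id IS NULL;
-- MAX(rank) for a given project_id with pf_id IS NULL can then be answered largely from this small index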
Re: [PERFORM] debugging handle exhaustion and 15 min/ 5mil row delete
On Fri, May 07, 2010 at 09:37:42AM -0400, Mark Stosberg wrote: Hello, We've been a satified user of PostgreSQL for several years, and use it to power a national pet adoption website: http://www.adoptapet.com/ Recently we've had a regularly-timed middle-of-the-night problem where database handles are exhausted for a very brief period. In tracking it down, I have found that the event seems to correspond to a time when a cron script is deleting from a large logging table, but I'm not certain if this is the cause or a correlation. We are deleting about 5 million rows from a time-based logging table that is replicated by Slony. We are currently using a single delete statement, which takes about 15 minutes to run. There is no RI on the table, but the use of Slony means that a trigger call and action is made for every row deleted, which causes a corresponding insertion in another table so the deletion can be replicated to the slave. My questions: - Could this kind of activity lead to an upward spiral in database handle usage? Yes. - Would it be advisable to use several small DELETE statements instead, to delete rows in batches of 1,000. We could use the recipe for this that was posted earlier to this list: Yes, that is the method we use in several cases to avoid this behavior. Deletion is a more intensive process in PostgreSQL, so batching it will keep from dragging down other queries which results in your out-of-handles error. Regards, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
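A minimal sketch of the batched form, with made-up table and column names; run each statement as its own transaction so the Slony triggers and other sessions get room to breathe between batches:

-- repeat (e.g. from the cron script) until it reports DELETE 0
DELETE FROM activity_log
 WHERE id IN (SELECT id
                FROM activity_log
               WHERE logged_at < now() - interval '30 days'
               LIMIT 1000);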
Re: [PERFORM] autovacuum strategy / parameters
Check out the manual: http://www.postgresql.org/docs/8.4/static/routine-vacuuming.html#AUTOVACUUM Cheers, Ken On Wed, Apr 28, 2010 at 10:37:35AM -0400, akp geek wrote: Hi - don't want to side track the discussion. We have 8.4, which of AUTOVACUUM PARAMETERS can be set to handle individual table? I ran into bloat with small table only. Now the issue is being resolved. Regards On Wed, Apr 28, 2010 at 10:20 AM, Thomas Kellerer spam_ea...@gmx.netwrote: Rick, 22.04.2010 22:42: So, in a large table, the scale_factor is the dominant term. In a small table, the threshold is the dominant term. But both are taken into account. The default values are set for small tables; it is not being run for large tables. With 8.4 you can adjust the autovacuum settings per table... -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
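For reference, the 8.4 per-table syntax looks like this; the table name and thresholds are only examples:

ALTER TABLE small_hot_table SET (
    autovacuum_vacuum_scale_factor  = 0.01,
    autovacuum_vacuum_threshold     = 500,
    autovacuum_analyze_scale_factor = 0.01,
    autovacuum_analyze_threshold    = 500
);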
Re: [PERFORM] mysql to postgresql, performance questions
If you expect this DB to be memory resident, you should update the cpu/disk cost parameters in postgresql.conf. There was a post earlier today with some more reasonable starting values. Certainly your test DB will be memory resident. Ken On Thu, Mar 18, 2010 at 03:31:18PM +0100, Corin wrote: Hi all, I'm running quite a large social community website (250k users, 16gb database). We are currently preparing a complete relaunch and thinking about switching from mysql 5.1.37 innodb to postgresql 8.4.2. The database server is a dual dualcore operton 2216 with 12gb ram running on debian amd64. For a first impression I ran a simple query on our users table (snapshot with only ~ 45.000 records). The table has an index on birthday_age [integer]. The test executes 10 times the same query and simply discards the results. I ran the tests using a php and a ruby script, the results are almost the same. Unluckily mysql seems to be around 3x as fast as postgresql for this simple query. There's no swapping, disc reading involved...everything is in ram. query select * from users where birthday_age between 12 and 13 or birthday_age between 20 and 22 limit 1000 mysql {select_type=SIMPLE, key_len=1, id=1, table=users, type=range, possible_keys=birthday_age, rows=7572, Extra=Using where, ref=nil, key=birthday_age} 15.104055404663 14.209032058716 18.857002258301 15.714883804321 14.73593711853 15.048027038574 14.589071273804 14.847040176392 15.192985534668 15.115976333618 postgresql {QUERY PLAN=Limit (cost=125.97..899.11 rows=1000 width=448) (actual time=0.927..4.990 rows=1000 loops=1)} {QUERY PLAN= - Bitmap Heap Scan on users (cost=125.97..3118.00 rows=3870 width=448) (actual time=0.925..3.420 rows=1000 loops=1)} {QUERY PLAN= Recheck Cond: (((birthday_age = 12) AND (birthday_age = 13)) OR ((birthday_age = 20) AND (birthday_age = 22)))} {QUERY PLAN= - BitmapOr (cost=125.97..125.97 rows=3952 width=0) (actual time=0.634..0.634 rows=0 loops=1)} {QUERY PLAN= - Bitmap Index Scan on birthday_age (cost=0.00..41.67 rows=1341 width=0) (actual time=0.260..0.260 rows=1327 loops=1)} {QUERY PLAN= Index Cond: ((birthday_age = 12) AND (birthday_age = 13))} {QUERY PLAN= - Bitmap Index Scan on birthday_age (cost=0.00..82.37 rows=2611 width=0) (actual time=0.370..0.370 rows=2628 loops=1)} {QUERY PLAN= Index Cond: ((birthday_age = 20) AND (birthday_age = 22))} {QUERY PLAN=Total runtime: 5.847 ms} 44.173002243042 41.156768798828 39.988040924072 40.470123291016 40.035963058472 40.077924728394 40.94386100769 40.183067321777 39.83211517334 40.256977081299 I also wonder why the reported runtime of 5.847 ms is so much different to the runtime reported of my scripts (both php and ruby are almost the same). What's the best tool to time queries in postgresql? Can this be done from pgadmin? Thanks, Corin -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Bad query plan inside EXISTS clause
EXISTS matches NULLs too and since they are not indexed a sequential scan is needed to check for them. Try using IN instead. Cheers, Ken On Wed, Mar 10, 2010 at 02:26:20PM +0100, Benoit Delbosc wrote: Hi all, I am trying to understand why inside an EXISTS clause the query planner does not use the index: EXPLAIN ANALYZE SELECT 1 WHERE EXISTS (SELECT 1 FROM read_acls_cache WHERE users_md5 = '9bc9012eb29c0bb2ae3cc7b5e78c2acf'); QUERY PLAN Result (cost=1.19..1.20 rows=1 width=0) (actual time=466.317..466.318 rows=1 loops=1) One-Time Filter: $0 InitPlan 1 (returns $0) - Seq Scan on read_acls_cache (cost=0.00..62637.01 rows=52517 width=0) (actual time=466.309..466.309 rows=1 loops=1) Filter: ((users_md5)::text = '9bc9012eb29c0bb2ae3cc7b5e78c2acf'::text) Total runtime: 466.369 ms (6 rows) While it does use the index when executing only the subquery: EXPLAIN ANALYZE SELECT 1 FROM read_acls_cache WHERE users_md5 = '9bc9012eb29c0bb2ae3cc7b5e78c2acf'; QUERY PLAN -- Bitmap Heap Scan on read_acls_cache (cost=2176.10..35022.98 rows=52517 width=0) (actual time=9.065..21.988 rows=51446 loops=1) Recheck Cond: ((users_md5)::text = '9bc9012eb29c0bb2ae3cc7b5e78c2acf'::text) - Bitmap Index Scan on read_acls_cache_users_md5_idx (cost=0.00..2162.97 rows=52517 width=0) (actual time=8.900..8.900 rows=51446 loops=1) Index Cond: ((users_md5)::text = '9bc9012eb29c0bb2ae3cc7b5e78c2acf'::text) Total runtime: 25.464 ms (5 rows) The table has been vacuumed, analyzed and reindexed. Thanks for your support. Regards ben Here are some more info : \d read_acls_cache Table public.read_acls_cache Column | Type | Modifiers ---+---+--- users_md5 | character varying(34) | not null acl_id| character varying(34) | Indexes: read_acls_cache_users_md5_idx btree (users_md5) SELECT COUNT(*) FROM read_acls_cache; count - 2520899 (1 row) SELECT COUNT(DISTINCT(users_md5)) FROM read_acls_cache ; count --- 49 (1 row) SELECT Version(); version -- PostgreSQL 8.4.2 on x86_64-pc-linux-gnu, compiled by GCC gcc-4.2.real (GCC) 4.2.4 (Ubuntu 4.2.4-1ubuntu4), 64 (1 row) -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] prepared statements and partitioning (partition elimination not working)
On Mon, Mar 08, 2010 at 10:24:56AM -0700, Kevin Kempter wrote: Hi all; we've found that partition elimination is not happening for a prepared statement, however running the same statement in psql manually does give us partition elimination. Is this a known issue? Yes, see the recent threads on performance of prepared queries. It concerns the availability of information on the query inputs that is available to psql and not a pre-prepared query. Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
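The effect is easy to reproduce against any range-partitioned table; the names below are hypothetical:

PREPARE part_count(date) AS
    SELECT count(*) FROM measurements WHERE logdate = $1;
EXPLAIN EXECUTE part_count('2010-03-01');   -- generic plan: every partition is scanned
EXPLAIN SELECT count(*) FROM measurements
         WHERE logdate = DATE '2010-03-01'; -- literal value: constraint exclusion prunes partitions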
Re: [PERFORM] partitioned tables query not using indexes
On Sun, Feb 28, 2010 at 12:29:14PM -0800, Josh Berkus wrote: However the same query against the base table when specifying the check constraint key in the where clause produces sequential scans: Does the master table have the same indexes as the slave partitions? --Josh Berkus Does this help? I have an empty base table without indexes and partitions underneath that do have the index. I did not think that an index on the parent table did anything. Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] index usage in not like
On Thu, Feb 18, 2010 at 01:18:10PM +0100, A. Kretschmer wrote: In response to Thom Brown : On 18 February 2010 11:55, AI Rumman rumman...@gmail.com wrote: Not like operation does not use index. select * from vtiger_contactscf where lower(cf_1253) not like lower('Former%') I created index on lower(cf_1253). How can I ensure index usage in not like operation? Anyone please help. How many rows do you have in your table? If there are relatively few, it probably guesses it to be cheaper to do a sequential scan and calculate lower values on-the-fly rather than bother with the index. That's one reason, an other reason, i think, is, that a btree-index can't search with an 'not like' - operator. test=*# insert into words select 'fucking example' from generate_series(1,1); INSERT 0 1 test=*# insert into words select 'abc' from generate_series(1,10); INSERT 0 10 test=*# explain select * from words where lower(w) like lower('a%') or lower(w) like lower('b%'); QUERY PLAN - Bitmap Heap Scan on words (cost=1538.75..6933.39 rows=55643 width=36) Recheck Cond: ((lower(w) ~~ 'a%'::text) OR (lower(w) ~~ 'b%'::text)) Filter: ((lower(w) ~~ 'a%'::text) OR (lower(w) ~~ 'b%'::text)) - BitmapOr (cost=1538.75..1538.75 rows=57432 width=0) - Bitmap Index Scan on idx_words (cost=0.00..1027.04 rows=39073 width=0) Index Cond: ((lower(w) ~=~ 'a'::text) AND (lower(w) ~~ 'b'::text)) - Bitmap Index Scan on idx_words (cost=0.00..483.90 rows=18359 width=0) Index Cond: ((lower(w) ~=~ 'b'::text) AND (lower(w) ~~ 'c'::text)) (8 rows) test=*# explain select * from words where lower(w) not like lower('a%') or lower(w) like lower('b%'); QUERY PLAN --- Seq Scan on words (cost=0.00..10624.48 rows=282609 width=36) Filter: ((lower(w) !~~ 'a%'::text) OR (lower(w) ~~ 'b%'::text)) (2 rows) In other words: revert your where-condition from 'not like' to multiple 'like' conditions for all letters except 'f%'. Andreas The 'not like' condition is likely to be extremely non-selective which would cause a sequential scan to be used in any event whether or not an index could be used. Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Bad plan choice nestloop vs. hashjoin
On Mon, Jan 18, 2010 at 12:13:24PM -0500, Tom Lane wrote: Kenneth Marshall k...@rice.edu writes: We have just upgraded our monitoring server software and now the following query for graphing the data performs abysmally with the default settings. Here is the results of the EXPLAIN ANALYZE run with nestloops enabled: I poked at this a bit more and now think I see where the problem is. The thing that would be easiest for you to do something about is the misestimation here: - Nested Loop Anti Join (cost=94.07..10294.64 rows=1 width=8) (actual time=98.049..27907.702 rows=281 loops=1) Join Filter: (gii.graphid = g.graphid) - Bitmap Heap Scan on graphs g (cost=94.07..233.17 rows=1 width=8) (actual time=0.529..1.772 rows=281 loops=1) Recheck Cond: (graphid = ANY ('{2,3,4,5,386,387,969,389,971,972,973,446,447,448,449,450,451,471,456,470,473,477,472,474,475,476,478,479,480,481,482,483,484,459,614,655,658,645,490,492,489,493,496,495,498,497,499,501,500,502,974,558,559,562,566,563,564,565,567,568,569,570,571,535,572,573,534,536,538,539,540,541,542,543,544,545,537,546,547,548,552,553,554,555,556,549,550,551,557,577,578,579,580,574,576,581,835,587,588,589,590,560,561,836,591,592,593,594,595,827,389,495,498,497,597,598,599,975,978,999,1004,604,605,606,679,616,634,635,636,637,638,618,629,630,631,632,633,671,682,669,670,678,679,680,674,672,676,673,675,677,681,682,683,683,644,652,829,681,687,698,685,686,705,706,707,708,830,945,946,710,716,712,714,713,709,718,721,720,719,723,724,747,749,750,730,731,732,733,734,735,736,737,738,739,740,741,742,743,744,745,772,774,775,755,756,757,758,759,760,761,762,763,764,765,766,767,768,769,770,777,776,977,824,823,826,825,829,832,833,835,581! ,836,842,852,854,839,840,838,853,855,847,848,944,846,859,850,899,901,902,903,864,865,866,867,976,979,939,941,942,943,906,907,908,909,910,868,969,991,950,955,964,966,952,953,962,965,967,959,961,968,1001,1002,1003,986,987,988,994,995,996,1008,1006,1007,1009,1010}'::bigint[])) Filter: ((graphid / 100::bigint) = 0) - Bitmap Index Scan on graphs_pkey (cost=0.00..94.07 rows=246 width=0) (actual time=0.507..0.507 rows=294 loops=1) Index Cond: (graphid = ANY ('{2,3,4,5,386,387,969,389,971,972,973,446,447,448,449,450,451,471,456,470,473,477,472,474,475,476,478,479,480,481,482,483,484,459,614,655,658,645,490,492,489,493,496,495,498,497,499,501,500,502,974,558,559,562,566,563,564,565,567,568,569,570,571,535,572,573,534,536,538,539,540,541,542,543,544,545,537,546,547,548,552,553,554,555,556,549,550,551,557,577,578,579,580,574,576,581,835,587,588,589,590,560,561,836,591,592,593,594,595,827,389,495,498,497,597,598,599,975,978,999,1004,604,605,606,679,616,634,635,636,637,638,618,629,630,631,632,633,671,682,669,670,678,679,680,674,672,676,673,675,677,681,682,683,683,644,652,829,681,687,698,685,686,705,706,707,708,830,945,946,710,716,712,714,713,709,718,721,720,719,723,724,747,749,750,730,731,732,733,734,735,736,737,738,739,740,741,742,743,744,745,772,774,775,755,756,757,758,759,760,761,762,763,764,765,766,767,768,769,770,777,776,977,824,823,826,825,829,832,833,835! ,581,836,842,852,854,839,840,838,853,855,847,848,944,846,859,850,899,901,902,903,864,865,866,867,976,979,939,941,942,943,906,907,908,909,910,868,969,991,950,955,964,966,952,953,962,965,967,959,961,968,1001,1002,1003,986,987,988,994,995,996,1008,1006,1007,1009,1010}'::bigint[])) The estimate of the ANY condition is not too bad (246 vs 294 actual). 
But it hasn't got any ability to deal with the (graphid / 100::bigint) = 0 filter condition, and is falling back to a default selectivity estimate for that, which IIRC is just 0.005 --- but actually, that condition doesn't eliminate any rows at all. Do you need that condition in the first place? Can you persuade your client-side software to eliminate it when it's impossible based on the ANY list? Or at least recast it to something more easily estimatable, like graphid 100? If you really have to have the condition just like that, I'd advise creating an index on (graphid / 100::bigint). That would cause ANALYZE to accumulate statistics on that expression, which'd result in a far better estimate. The reason that this misestimate hammers it so hard is that the inside of the nestloop looks like - Nested Loop (cost=0.00..17449.43 rows=1954 width=8) (actual time=99.304..99.304 rows=0 loops=281) - Index Scan using graphs_items_2 on graphs_items gii (cost=0.00..69.83 rows=1954 width=16) (actual time=0.013..3.399 rows=1954 loops=281) - Index Scan using
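A sketch of that expression index (the index name is made up); the follow-up ANALYZE is what gathers statistics on the expression and improves the estimate:

CREATE INDEX graphs_graphid_bucket_idx ON graphs ((graphid / 100::bigint));
ANALYZE graphs;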
[PERFORM] Bad plan choice nestloop vs. hashjoin
Dear performance group: We have just upgraded our monitoring server software and now the following query for graphing the data performs abysmally with the default settings. Here is the results of the EXPLAIN ANALYZE run with nestloops enabled: SET enable_nestloop = 'on'; EXPLAIN SELECT g.graphid FROM graphs g,graphs_items gi,items i,hosts_groups hg,rights r,users_groups ug WHERE (g.graphid/100) in (0) AND gi.graphid=g.graphid AND i.itemid=gi.itemid AND hg.hostid=i.hostid AND r.id=hg.groupid AND r.groupid=ug.usrgrpid AND ug.userid=20 AND r.permission=2 AND NOT EXISTS( SELECT gii.graphid FROM graphs_items gii, items ii WHERE gii.graphid=g.graphid AND gii.itemid=ii.itemid AND EXISTS( SELECT hgg.groupid FROM hosts_groups hgg, rights rr, users_groups ugg WHERE ii.hostid=hgg.hostid AND rr.id=hgg.groupid AND rr.groupid=ugg.usrgrpid AND ugg.userid=20 AND rr.permission2)) AND (g.graphid IN (2,3,4,5,386,387,969,389,971,972,973,446,447,448,449,450,451,471,456,470,473,477,472,474,475,476,478,479,480,481,482,483,484,459,614,655,658,645,490,492,489,493,496,495,498,497,499,501,500,502,974,558,559,562,566,563,564,565,567,568,569,570,571,535,572,573,534,536,538,539,540,541,542,543,544,545,537,546,547,548,552,553,554,555,556,549,550,551,557,577,578,579,580,574,576,581,835,587,588,589,590,560,561,836,591,592,593,594,595,827,389,495,498,497,597,598,599,975,978,999,1004,604,605,606,679,616,634,635,636,637,638,618,629,630,631,632,633,671,682,669,670,678,679,680,674,672,676,673,675,677,681,682,683,683,644,652,829,681,687,698,685,686,705,706,707,708,830,945,946,710,716,712,714,713,709,718,721,720,719,723,724,747,749,750,730,731,732,733,734,735,736,737,738,739,740,741,742,743,744,745,772,774,775,755,756,757,758,759,760,761,762,763,764,765,766,767,768,769,770,777,776,977,824,823,826,825,829,832,833,835,581,836,842,852,854,839,840,838,853,855,847,848,944,846,859,850,899,901,902,903,864,865,866,867,976,979,939,941,942,943,906,907,908,909,910,868,969,991,950,955,964,966,952,953,962,965,967,959,961,968,1001,1002,1003,986,987,988,994,995,996,1008,1006,1007,1009,1010)); - QUERY PLAN -- Nested Loop (cost=94.07..10304.00 rows=1 width=8) (actual time=194.557..27975.338 rows=607 loops=1) Join Filter: (r.groupid = ug.usrgrpid) - Seq Scan on users_groups ug (cost=0.00..1.15 rows=1 width=8) (actual time=0.020..0.026 rows=1 loops=1) Filter: (userid = 20) - Nested Loop (cost=94.07..10302.65
Re: [PERFORM] Bad plan choice nestloop vs. hashjoin
On Fri, Jan 15, 2010 at 04:58:57PM -0600, Kevin Grittner wrote: Kenneth Marshall k...@rice.edu wrote: with the default settings Do you mean you haven't changed any settings in your postgresql.conf file from their defaults? -Kevin Sorry, here are the differences from the default: max_connections = 100 # (change requires restart) shared_buffers = 256MB # min 128kB or max_connections*16kB work_mem = 16MB # min 64kB maintenance_work_mem = 512MB# min 1MB synchronous_commit = off# immediate fsync at commit wal_buffers = 256kB # min 32kB checkpoint_segments = 30# in logfile segments, min 1, 16MB each seq_page_cost = 1.0 # measured on an arbitrary scale random_page_cost = 2.0 # same scale as above effective_cache_size = 12GB log_min_duration_statement = 5000 The machine has 16GB of RAM and the DB is currently about 8GB. It is going to grow much larger as information is acquired. Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PG optimization question
On Sat, Jan 09, 2010 at 03:42:08PM +0300, Nickolay wrote: I do not see any way to normalize this table anymore. it's size is 4Gig for ~4M rows, i.e. 1Kb per row, i think it's ok. Also there are 2 indexes: by date_time and by a couple of service fields (total index size is 250Mb now). I think i'll be going to partition by months (approx. 1M rows or 1Gig per month), so it would be like 60 partitions for 5 years. Is that OK for postgres? Not a problem. We have a log server that has 64 daily partitions. Oh, btw, 95% of queries are searching rows for current date (last 24 hours). You may want to use a daily staging table and then flush to the monthly archive tables at the end of the day. Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
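A sketch of one monthly partition plus the end-of-day flush from a daily staging table, using 8.x inheritance-style partitioning; all table, column, and index names are assumptions:

CREATE TABLE requests_2010_01 (
    CHECK (date_time >= DATE '2010-01-01' AND date_time < DATE '2010-02-01')
) INHERITS (requests);
CREATE INDEX requests_2010_01_date_time_idx ON requests_2010_01 (date_time);

-- nightly: move the day's rows from the staging table into the month's partition
INSERT INTO requests_2010_01 SELECT * FROM requests_staging;
TRUNCATE requests_staging;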
Re: [PERFORM] Massive table (500M rows) update nightmare
On Fri, Jan 08, 2010 at 12:38:46PM -0500, Carlo Stonebanks wrote: I thought that post mentioned that the plan was one statement in an iteration, and that the cache would have been primed by a previous query checking whether there were any rows to update. If that was the case, it might be worthwhile to look at the entire flow of an iteration. This is the only SQL query in the code in question - the rest of the code manages the looping and commit. The code was copied to PgAdminIII and values written in for the WHERE clause. In order for me to validate that rows would have been updated, I had to run a SELECT with the same WHERE clause in PgAdminIII first to see how many rows would have qualified. But this was for testing purposes only. The SELECT statement does not exist in the code. The vast majority of the rows that will be processed will be updated as this is a backfill to synch the old rows with the values being filled into new columns now being inserted. Also, if you ever responded with version and configuration information, I missed it. This is hosted on a new server the client set up so I am waiting for the exact OS and hardware config. PG Version is PostgreSQL 8.3.6, compiled by Visual C++ build 1400, OS appears to be Windows 2003 x64 Server. More than anything, I am more concerned with the long-term use of the system. This particular challenge with the 500M row update is one thing, but I am concerned about the exceptional effort required to do this. Is it REALLY this exceptional to want to update 500M rows of data in this day and age? Or is the fact that we are considering dumping and restoring and dropping indexes, etc to do all an early warning that we don't have a solution that is scaled to the problem? Config data follows (I am assuming commented values which I did not include are defaulted). Carlo Hi Carlo, It all boils down to resource management and constraints. For small problems relative to the amount of system resources available, little effort is needed to have satisfactory performance. As the problems consume more and more of the total resource capacity, you will need to know more and more in depth about the workings of every piece of the system to wring the most possible performance out of the system. Some of the discussions on this topic have covered a smattering of details that will become increasingly important as your system scales and determine whether or not it will scale. Many times the need for updates on such a massive scale do point to normalization problems. My two cents. Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] big select is resulting in a large amount of disk writing by kjournald
Hint bit I/O? Ken On Wed, Dec 09, 2009 at 01:29:00PM -0500, Joseph S wrote: I just installed a shiny new database server with pg 8.4.1 running on CentOS 5.4. After using slony to replicate over my database I decided to do some basic performance tests to see how spiffy my shiny new server is. This machine has 32G ram, over 31 of which is used for the system file cache. So I run select count(*) from large_table and I see in xosview a solid block of write activity. Runtime is 28125.644 ms for the first run. The second run does not show a block of write activity and takes 3327.441 ms top shows that this writing is being done by kjournald. What is going on here? There is not a lot of write activity on this server so there should not be a significant number of dirty cache pages that kjournald would need to write out before it could read in my table. Certainly in the 31G being used for file cache there should be enough non-dirty pages that could be dropped to read in my table w/o having to flush anything to disk. My table size is 2,870,927,360 bytes. # cat /proc/sys/vm/dirty_expire_centisecs 2999 I restarted postgres and ran a count(*) on an even larger table. [local]= explain analyze select count(*) from et; QUERY PLAN --- Aggregate (cost=6837051.82..6837051.83 rows=1 width=0) (actual time=447240.157..447240.157 rows=1 loops=1) - Seq Scan on et (cost=0.00..6290689.25 rows=218545025 width=0) (actual time=5.971..400326.911 rows=218494524 loops=1) Total runtime: 447240.402 ms (3 rows) Time: 447258.525 ms [local]= explain analyze select count(*) from et; QUERY PLAN -- Aggregate (cost=6837113.44..6837113.45 rows=1 width=0) (actual time=103011.724..103011.724 rows=1 loops=1) - Seq Scan on et (cost=0.00..6290745.95 rows=218546995 width=0) (actual time=9.844..71629.497 rows=218496012 loops=1) Total runtime: 103011.832 ms (3 rows) Time: 103012.523 ms [local]= select pg_relation_size('et'); pg_relation_size -- 33631543296 (1 row) I posted xosview snapshots from the two runs at: http://www.tupari.net/2009-12-9/ This time the first run showed a mix of read/write activity instead of the solid write I saw before. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Unexpected sequential scan on an indexed column
On Mon, Nov 16, 2009 at 12:45:46PM -0800, Eddy Escardo-Raffo wrote: Yeah this kind of thing would probably work. Doing this in java with separate queries would be easy to code but require multiple round trips. Doing it as a stored procedure would be nicer but I'd have to think a little more about how to refactor the java code around the query to make this happen. Thanks for the suggestion. Eddy Hi Eddy, Here is a lookup wrapper that is used in DSPAM to work around a similar problem. Maybe you can use it as a template for your function: create function lookup_tokens(integer,bigint[]) returns setof dspam_token_data language plpgsql stable as ' declare v_rec record; begin for v_rec in select * from dspam_token_data where uid=$1 and token in (select $2[i] from generate_series(array_lower($2,1),array_upper($2,1)) s(i)) loop return next v_rec; end loop; return; end;'; Regards, Ken On Mon, Nov 16, 2009 at 9:44 AM, Dave Crooke dcro...@gmail.com wrote: Hi Eddy Perhaps a slightly naive suggestion have you considered converting the query to a small stored procedure ('function' in Postgres speak)? You can pull the location values, and then iterate over a query like this: select userid from users where location=:x which is more-or-less guaranteed to use the index. I had a somewhat similar situation recently, where I was passing in a list of id's (from outwith Postgres) and it would on occasion avoid the index in favour of a full table scan I changed this to iterate over the id's with separate queries (in Java, but using a function will achieve the same thing) and went from one 5 minute query doing full table scan to a handful of queries doing sub-millisecond direct index lookups. Cheers Dave -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
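An example call of the lookup_tokens() function defined above, with made-up uid and token values, showing how the array is passed in:

SELECT * FROM lookup_tokens(1, ARRAY[123456789012345, 987654321098765]::bigint[]);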
Re: [PERFORM] FTS performance with the Polish config
On Sat, Nov 14, 2009 at 12:25:05PM +0100, Wojciech Knapik wrote: Hello I just finished implementing a search engine for my site and found ts_headline extremely slow when used with a Polish tsearch configuration, while fast with English. All of it boils down to a simple testcase, but first some background. I tested on 8.3.1 on G5/OSX 10.5.8 and Xeon/Gentoo AMD64-2008.0 (2.6.21), then switched both installations to 8.3.8 (both packages compiled, but provided by the distro - port/emerge). The Polish dictionaries and config were created according to this article (it's in Polish, but the code is self-explanatory): http://www.depesz.com/index.php/2008/04/22/polish-tsearch-in-83-polski-tsearch-w-postgresie-83/ Now for the testcase: text = 'Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.' # explain analyze select ts_headline('polish', text, plainto_tsquery('polish', 'foobar')); QUERY PLAN Result (cost=0.00..0.01 rows=1 width=0) (actual time=6.407..6.470 rows=1 loops=1) Total runtime: 6.524 ms (2 rows) # explain analyze select ts_headline('english', text, plainto_tsquery('english', 'foobar')); QUERY PLAN Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.861..0.895 rows=1 loops=1) Total runtime: 0.935 ms (2 rows) # explain analyze select ts_headline('simple', text, plainto_tsquery('simple', 'foobar')); QUERY PLAN Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.627..0.660 rows=1 loops=1) Total runtime: 0.697 ms (2 rows) # As you can see, the results differ by an order of magnitude between Polish and English. While in this simple testcase it's a non-issue, in the real world this translates into enormous overhead. One of the queries I ran testing my site's search function took 1870ms. When I took that query and changed all ts_headline(foo) calls to just foo, the time dropped below 100ms. That's the difference between something completely unacceptable and something quite useful. I can post various details about the hardware, software and specific queries, but the testcases speak for themselves. I'm sure you can easily reproduce my results. Hints would be very much appreciated, since I've already spent way more time on this, than I could afford. cheers, Wojciech Knapik PS. A few other details can be found here http://pastie.textmate.org/private/hqnqfnsfsknjyjlffzmog along with snippets of my conversations in #postgresql that lead to this testcase. Big thanks to RhodiumToad for helping me with fts for the last couple days ;] Hi, The documentation for ts_headline() states: ts_headline uses the original document, not a tsvector summary, so it can be slow and should be used with care. A typical mistake is to call ts_headline for every matching document when only ten documents are to be shown. SQL subqueries can help; here is an example: SELECT id, ts_headline(body, q), rank FROM (SELECT id, body, q, ts_rank_cd(ti, q) AS rank FROM apod, to_tsquery('stars') q WHERE ti @@ q ORDER BY rank DESC LIMIT 10) AS foo; It looks like you have proven that behavior. I have not looked at the ts_headline code, but it may also be slowed by the locale, so showing that it is faster for English is not really saying much. 
Maybe there is a better algorithm that could be used, but that would require code changes. It may be that you can change some of the parameters to speed it up. Good luck. Regards, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] limiting performance impact of wal archiving.
On Tue, Nov 10, 2009 at 12:55:42PM +0100, Laurent Laborde wrote: Hi ! We recently had a problem with wal archiving badly impacting the performance of our postgresql master. And i discovered cstream, that can limite the bandwidth of pipe stream. Here is our new archive command, FYI, that limit the IO bandwidth to 500KB/s : archive_command = '/bin/cat %p | cstream -i -o -t -500k | nice gzip -9 -c | /usr/bin/ncftpput etc...' PS : While writing that mail, i just found that i could replace : cat %p | cstream -i ... with cstream -i %p ... *grins* And here is a simple perl program that I have used for a similar reason. Obviously, it can be adapted to your specific needs. Regards, Ken throttle.pl--- #!/usr/bin/perl -w require 5.0;# written for perl5, hasta labyebye perl4 use strict; use Getopt::Std; # # This is an simple program to throttle network traffic to a # specified KB/second to allow a restore in the middle of the # day over the network. # my($file, $chunksize, $len, $offset, $written, $rate, $buf ); my($options, $blocksize, $speed, %convert, $inv_rate, $verbose); %convert = ( # conversion factors for $speed,$blocksize '', '1', 'w','2', 'W','2', 'b','512', 'B','512', 'k','1024', 'K','1024', ); $options = 'vhs:r:b:f:'; # # set defaults # $speed = '100k'; $rate = '5'; $blocksize = '120k'; # Works for the DLT drives under SunOS $file = '-'; $buf = ''; $verbose = 0; # default to quiet sub usage { my($usage); $usage = Usage: throttle [-s speed][-r rate/sec][-b blksize][-f file][-v][-h] (writes data to STDOUT) -s speed max data rate in B/s - defaults to 100k -r ratewrites/sec - defaults to 5 -b sizeread blocksize - defaults to 120k -f filefile to read for input - defaults to STDIN -h print this message -v print parameters used ; print STDERR $usage; exit(1); } getopts($options) || usage; if ($::opt_h || $::opt_h) { usage; } usage unless $#ARGV 0; $speed = $::opt_s if $::opt_s; $rate = $::opt_r if $::opt_r; $blocksize = $::opt_b if $::opt_b; $file = $::opt_f if $::opt_f; # # Convert $speed and $blocksize to bytes for use in the rest of the script if ( $speed =~ /^(\d+)([wWbBkK]*)$/ ) { $speed = $1 * $convert{$2}; } if ( $blocksize =~ /^(\d+)([wWbBkK]*)$/ ) { $blocksize = $1 * $convert{$2}; } $inv_rate = 1/$rate; $chunksize = int($speed/$rate); $chunksize = 1 if $chunksize == 0; if ($::opt_v || $::opt_v) { print STDERR speed = $speed B/s\nrate = $rate/sec\nblocksize = $blocksize B\nchunksize = $chunksize B\n; } # Return error if unable to open file open(FILE, $file) or die Cannot open $file: $!\n; # Read data from stdin and write it to stdout at a rate based # on $rate and $speed. # while($len = sysread(FILE, $buf, $blocksize)) { # # print out in chunks of $speed/$rate size to allow a smoother load $offset = 0; while ($len) { $written = syswrite(STDOUT, $buf, $chunksize, $offset); die System write error: $!\n unless defined $written; $len -= $written; $offset += $written; # # Now wait 1/$rate seconds before doing the next block # select(undef, undef, undef, $inv_rate); } } close(FILE); -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Are folks running 8.4 in production environments? and 8.4 and slon 1.2?
On Tue, Oct 13, 2009 at 01:03:10AM -0600, Scott Marlowe wrote: On Mon, Oct 12, 2009 at 1:06 PM, Tory M Blue tmb...@gmail.com wrote: Any issues, has it baked long enough, is it time for us 8.3 folks to deal with the pain and upgrade? I am running 8.4.1 for my stats and search databases, and it's working fine. Anymore updates regarding 8.4 and slon 1.2 as well, since I usually build/upgrade both at the same time. I don't think 1.2 supports 8.4 just yet, and 2.0.3 or so is still not stable enough for production (I had major unexplained outages with it) so for now, no 8.4 with slony. slony-1.2.17-rc2 works fine with version 8.4 in my limited testing. I have not been able to get replication to work reliably with any current release of slony-2.x. There was a recent comment that the latest version in CVS has the 2.x bug fixed but I have not had a chance to try. Regards, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Bad performance of SELECT ... where id IN (...)
On Fri, Oct 09, 2009 at 08:31:54PM +0800, Xia Qingran wrote: On Mon, Oct 5, 2009 at 9:58 AM, Omar Kilani omar.kil...@gmail.com wrote: Hi Xia, Try this patch: http://treehou.se/~omar/postgresql-8.4.1-array_sel_hack.patch It's a hack, but it works for us. I think you're probably spending most of your query time planning, and this patch helps speed things up 10x over here. Thanks! I am trying it. Regards, Xia Qingran

We have a similar situation when using DSPAM with a PostgreSQL backend. In that case we used a function like the following to speed up the lookups. I do not know if it would be useful in your situation, but I thought I would post it for the group. The original query was of the form:

    SELECT uid, token, spam_hits, innocent_hits FROM dspam_token_data
     WHERE uid = 'xxx' AND token IN (...);

The faster version of the query in the current code is:

    SELECT * FROM lookup_tokens(%d, '{...}');

where lookup_tokens is defined as follows:

    create function lookup_tokens(integer, bigint[])
    returns setof dspam_token_data
    language plpgsql stable as '
    declare
        v_rec record;
    begin
        for v_rec in
            select * from dspam_token_data
             where uid = $1
               and token in (select $2[i]
                               from generate_series(array_lower($2,1),
                                                    array_upper($2,1)) s(i))
        loop
            return next v_rec;
        end loop;
        return;
    end;';

Anyway, you may want to try a similar approach instead of the posted code change. Regards, Ken

Regards, Omar On Sun, Sep 27, 2009 at 5:13 PM, Xia Qingran qingran@gmail.com wrote: On Sat, Sep 26, 2009 at 10:59 PM, Craig James craig_ja...@emolecules.com wrote: If your user_id is always in a narrow range like this, or even in any range that is a small fraction of the total, then add a range condition, like this: select * from event where user_id <= 500 and user_id >= 0 and user_id in (...) I did this exact same thing in my application and it worked well. Craig It is a good idea. But in my application, most of the queries' user_id are random and difficult to range. Thanks anyway. -- Xia Qingran qingran@gmail.com Sent from Beijing, 11, China Charles de Gaulle - The better I get to know men, the more I find myself loving dogs. - http://www.brainyquote.com/quotes/authors/c/charles_de_gaulle.html -- Xia Qingran qingran@gmail.com Sent from Beijing, 11, China Stephen Leacock - I detest life-insurance agents: they always argue that I shall some day die, which is not so. - http://www.brainyquote.com/quotes/authors/s/stephen_leacock.html -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
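As a side note, on releases where "= ANY(array)" is planned efficiently (8.2 and later, if memory serves), the same idea can often be written without the generate_series() trick. A sketch only, reusing the dspam table above; the plpgsql version above is the one actually in use:

    create or replace function lookup_tokens(integer, bigint[])
    returns setof dspam_token_data
    language sql stable as $$
        select * from dspam_token_data
         where uid = $1
           and token = any ($2);
    $$;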
Re: [PERFORM] partition queries hitting all partitions even though check key is specified
The planner does not yet work as efficiently as it could with child tables. Check the recent mail archives for a long discussion of the same. Regards, Ken

On Wed, Sep 02, 2009 at 08:52:30AM -0600, Kevin Kempter wrote: Hi all; I can't figure out why we're scanning all of our partitions. We set up our tables like this:

Base Table:

    CREATE TABLE url_hits (
        id integer NOT NULL,
        content_type_id integer,
        file_extension_id integer,
        time integer,
        bytes integer NOT NULL,
        path_id integer,
        protocol public.protocol_enum
    );

Partitions:

    create table url_hits_2011_12 (
        check (
            time >= extract ('epoch' from timestamp '2011-12-01 00:00:00')::int4
            and time <= extract ('epoch' from timestamp '2011-12-31 23:59:59')::int4
        )
    ) INHERITS (url_hits);

    CREATE RULE url_hits_2011_12_insert as
    ON INSERT TO url_hits where
        ( time >= extract ('epoch' from timestamp '2011-12-01 00:00:00')::int4
          and time <= extract ('epoch' from timestamp '2011-12-31 23:59:59')::int4 )
    DO INSTEAD
        INSERT INTO url_hits_2011_12 VALUES (NEW.*);

    ...

    create table url_hits_2009_08 (
        check (
            time >= extract ('epoch' from timestamp '2009-08-01 00:00:00')::int4
            and time <= extract ('epoch' from timestamp '2009-08-31 23:59:59')::int4
        )
    ) INHERITS (url_hits);

    CREATE RULE url_hits_2009_08_insert as
    ON INSERT TO url_hits where
        ( time >= extract ('epoch' from timestamp '2009-08-01 00:00:00')::int4
          and time <= extract ('epoch' from timestamp '2009-08-31 23:59:59')::int4 )
    DO INSTEAD
        INSERT INTO url_hits_2009_08 VALUES (NEW.*);

    ...

the explain plan shows most any query scans/hits all partitions even if we specify the partition key:

    explain select * from pwreport.url_hits
     where time > date_part('epoch'::text, '2009-08-12'::timestamp without time zone)::integer;

                                       QUERY PLAN
    --------------------------------------------------------------------------------------
     Result  (cost=0.00..23766294.06 rows=816492723 width=432)
       ->  Append  (cost=0.00..23766294.06 rows=816492723 width=432)
             ->  Seq Scan on url_hits  (cost=0.00..12.12 rows=57 width=432)
                   Filter: (time > 1250035200)
             ->  Seq Scan on url_hits_2011_12 url_hits  (cost=0.00..12.12 rows=57 width=432)
                   Filter: (time > 1250035200)
             ->  Seq Scan on url_hits_2011_11 url_hits  (cost=0.00..12.12 rows=57 width=432)
                   Filter: (time > 1250035200)
             ->  Seq Scan on url_hits_2011_10 url_hits  (cost=0.00..12.12 rows=57 width=432)
                   Filter: (time > 1250035200)
             ->  Seq Scan on url_hits_2011_09 url_hits  (cost=0.00..12.12 rows=57 width=432)
                   Filter: (time > 1250035200)
             ->  Seq Scan on url_hits_2011_08 url_hits  (cost=0.00..12.12 rows=57 width=432)
                   Filter: (time > 1250035200)
             ->  Seq Scan on url_hits_2011_07 url_hits  (cost=0.00..12.12 rows=57 width=432)
                   Filter: (time > 1250035200)
             ->  Seq Scan on url_hits_2011_06 url_hits  (cost=0.00..12.12 rows=57 width=432)
                   Filter: (time > 1250035200)
             ->  Seq Scan on url_hits_2011_05 url_hits  (cost=0.00..12.12 rows=57 width=432)
                   Filter: (time > 1250035200)
             ->  Seq Scan on url_hits_2011_04 url_hits  (cost=0.00..12.12 rows=57 width=432)
                   Filter: (time > 1250035200)
             ->  Seq Scan on url_hits_2011_03 url_hits  (cost=0.00..12.12 rows=57 width=432)
                   Filter: (time > 1250035200)
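One thing worth double-checking in a setup like this (a general note, not a diagnosis of this particular report): constraint exclusion has to be enabled before the planner will use those CHECK constraints to prune children, and the comparison value must be something it can evaluate at plan time. Roughly:

    SET constraint_exclusion = on;   -- defaults to off before 8.4
    EXPLAIN SELECT * FROM pwreport.url_hits
     WHERE "time" > 1250035200;      -- the epoch value computed beforehand

If every child table is still listed after that, then it is the planner limitation discussed in the archives.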
Re: [PERFORM] Best settings to load a fresh database
On Thu, Aug 06, 2009 at 01:42:06PM -0500, Campbell, Lance wrote: PostgreSQL 8.3 Linux RedHat 4.X 24G of memory When loading a file generated from pg_dumpall, is there a key setting in the configuration file that would allow the load to work faster? Thanks, Lance Campbell Project Manager/Software Architect/DBA Web Services at Public Affairs 217-333-0382

I have found that increasing maintenance_work_mem speeds index rebuilds; turning off synchronous_commit or fsync also helps, if you can really afford to start over after a crash. Another big help is to use the parallel pg_restore from PostgreSQL 8.4.0 to perform the restore. Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
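To make those suggestions concrete, the settings commonly raised just for the duration of a bulk reload look roughly like this (illustrative values only, not tuned for this machine; fsync = off is only acceptable if you can redo the whole restore after a crash):

    maintenance_work_mem = 1GB       # faster index builds during the restore
    synchronous_commit = off
    fsync = off                      # dangerous; restore-only
    checkpoint_segments = 64         # fewer, larger checkpoints while loading

and, from 8.4 on, a parallel restore of a custom-format dump (pg_dumpall output is plain SQL, so the per-database dumps need to be taken with pg_dump -Fc for this to apply):

    pg_restore -j 8 -d mydb mydb.dump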
Re: [PERFORM] cluster index on a table
Clustering reorganizes the layout of a table according to the ordering of a SINGLE index. This will place items that are adjacent in the index adjacent in the heap. So you need to cluster on the index that will help the locality of reference for the queries which will benefit you the most. Execution-time-sensitive queries are a good way to choose. Cheers, Ken

On Wed, Jun 24, 2009 at 08:32:14PM +0300, Ibrahim Harrani wrote: Hello, I have a table like the following. To increase the performance of this table, I would like to create a CLUSTER. First, which index should I use on this table for CLUSTER? Secondly, can I create multiple CLUSTERs on the same table? I will appreciate it if you can suggest other options to increase the performance of the table. I use this table to save metadata of the mails on my system.

    mail=# \d maillogs
                          Table "public.maillogs"
          Column        |            Type             |                       Modifiers
    --------------------+-----------------------------+---------------------------------------------------------
     id                 | bigint                      | not null default nextval('maillogs_id_seq'::regclass)
     queueid            | character varying(255)      | not null default '*'::character varying
     recvtime           | timestamp without time zone | default now()
     remoteip           | character varying(128)      | not null default '0.0.0.0'::character varying
     relayflag          | smallint                    | not null default (0)::smallint
     retaction          | integer                     |
     retval             | integer                     | not null default 0
     probspam           | double precision            | not null default (0)::double precision
     messageid          | text                        |
     fromaddress        | text                        | not null
     toaddress          | text                        | not null
     envelopesender     | text                        |
     enveloperecipients | text                        |
     messagesubject     | text                        |
     size               | bigint                      |
     logstr             | character varying(1024)     |
     destinationaddress | character varying(255)      |
     quarantinepath     | character varying(1024)     | not null default ''::character varying
     backuppath         | character varying(1024)     | not null default ''::character varying
     quarantineflag     | smallint                    | not null default (0)::smallint
     backupflag         | smallint                    | not null default (0)::smallint
     deletedflag        | smallint                    | not null default 0
     profileid          | integer                     | not null default 0
    Indexes:
        "maillogs_pkey" PRIMARY KEY, btree (id) CLUSTER
        "idx_maillogs_backupflag" btree (backupflag)
        "idx_maillogs_deletedflag" btree (deletedflag)
        "idx_maillogs_enveloperecipients" btree (enveloperecipients)
        "idx_maillogs_envelopesender" btree (envelopesender)
        "idx_maillogs_messagesubject" btree (messagesubject)
        "idx_maillogs_quarantineflag" btree (quarantineflag)
        "idx_maillogs_recvtime" btree (recvtime)
        "idx_maillogs_remoteip" btree (remoteip)
        "idx_maillogs_revtal" btree (retval)
    Foreign-key constraints:
        "maillogs_profileid_fkey" FOREIGN KEY (profileid) REFERENCES profiles(profileid)
    Triggers:
        maillogs_insert AFTER INSERT ON maillogs FOR EACH ROW EXECUTE PROCEDURE maillogs_insert()

    mail=#

-- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
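To make the choice concrete: if, for example, the recvtime-range queries are the time-critical ones, the 8.3-era commands would be as below (names taken from the table above; CLUSTER rewrites the whole table under an exclusive lock, so schedule it for a quiet period):

    CLUSTER idx_maillogs_recvtime ON maillogs;
    ANALYZE maillogs;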
Re: [PERFORM] 8.4 COPY performance regression on Solaris
Hi, Looking at XLogInsert() in 8.3 and 8.4, the 8.4 version includes a call to RecoveryInProgress() at the top as well as a call to TRACE_POSTGRESQL_XLOG_INSERT(). Could either of those have caused a context switch or cache flush, resulting in worse performance? Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Strange performance response for high load times
On Thu, Jun 18, 2009 at 08:27:02PM +0200, Peter Alban wrote: Hi All, We are having a reasonably powerful machine for supporting about 20 databases but in total they're not more then 4GB in size. The machine is 2 processor 8 core and 8 Gig or ram so I would expect that PG should cache the whole db into memory. Well actually it doesn't. What is more strange that a query that under zero load is running under 100ms during high load times it can take up to 15 seconds !! What on earth can make such difference ? here are the key config options that I set up : # - Memory - shared_buffers = 17 # min 16 or max_connections*2, 8KB each temp_buffers = 21000# min 100, 8KB each #max_prepared_transactions = 5 # can be 0 or more # note: increasing max_prepared_transactions costs ~600 bytes of shared memory # per transaction slot, plus lock space (see max_locks_per_transaction). work_mem = 1048576 # min 64, size in KB maintenance_work_mem = 1048576 # min 1024, size in KB 1GB of work_mem is very high if you have more than a couple of queries that use it. Ken #max_stack_depth = 2048 # min 100, size in KB # - Free Space Map - max_fsm_pages = 524298 # min max_fsm_relations*16, 6 bytes each max_fsm_relations = 32768 # min 100, ~70 bytes each # - Kernel Resource Usage - max_files_per_process = 4000# min 25 #preload_libraries = '' any ideas ? cheers, Peter -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Strange performance response for high load times
On Thu, Jun 18, 2009 at 09:42:47PM +0200, Peter Alban wrote: So Ken , What do you reckon it should be ? What is the rule of thumb here ? cheers, Peter It really depends on your query mix. The key to remember is that multiples (possibly many) of the work_mem value can be allocated in an individual query. You can set it on a per query basis to help manage it use, i.e. up it for only the query that needs it. With our systems, which run smaller number of queries we do use 256MB. I hope that this helps. Regards, Ken On Thu, Jun 18, 2009 at 8:30 PM, Kenneth Marshall k...@rice.edu wrote: On Thu, Jun 18, 2009 at 08:27:02PM +0200, Peter Alban wrote: Hi All, We are having a reasonably powerful machine for supporting about 20 databases but in total they're not more then 4GB in size. The machine is 2 processor 8 core and 8 Gig or ram so I would expect that PG should cache the whole db into memory. Well actually it doesn't. What is more strange that a query that under zero load is running under 100ms during high load times it can take up to 15 seconds !! What on earth can make such difference ? here are the key config options that I set up : # - Memory - shared_buffers = 17 # min 16 or max_connections*2, 8KB each temp_buffers = 21000# min 100, 8KB each #max_prepared_transactions = 5 # can be 0 or more # note: increasing max_prepared_transactions costs ~600 bytes of shared memory # per transaction slot, plus lock space (see max_locks_per_transaction). work_mem = 1048576 # min 64, size in KB maintenance_work_mem = 1048576 # min 1024, size in KB 1GB of work_mem is very high if you have more than a couple of queries that use it. Ken #max_stack_depth = 2048 # min 100, size in KB # - Free Space Map - max_fsm_pages = 524298 # min max_fsm_relations*16, 6 bytes each max_fsm_relations = 32768 # min 100, ~70 bytes each # - Kernel Resource Usage - max_files_per_process = 4000# min 25 #preload_libraries = '' any ideas ? cheers, Peter -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
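For example (plain SQL, no configuration file change needed), assuming only one big report query actually needs the large sort area:

    SET work_mem = '256MB';
    -- run the one query that needs the big sort/hash here
    RESET work_mem;

Everything else on that connection then goes back to the server default.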
Re: [PERFORM] Storing sensor data
On Thu, May 28, 2009 at 04:55:34PM +0200, Ivan Voras wrote: 2009/5/28 Heikki Linnakangas heikki.linnakan...@enterprisedb.com: Ivan Voras wrote: I need to store data about sensor readings. There is a known (but configurable) number of sensors which can send update data at any time. The current state needs to be kept but also all historical records. I'm trying to decide between these two designs: 1) create a table for current data, one record for each sensor, update this table when a sensor reading arrives, create a trigger that would transfer old record data to a history table (of basically the same structure) 2) write only to the history table, use relatively complex queries or outside-the-database magic to determine what the current values of the sensors are. 3) write only to the history table, but have an INSERT trigger to update the table with current data. This has the same performance characteristics as 1, but let's you design your application like 2. Excellent idea! I think I'd choose this approach (or 2), since it can handle out-of-order or delayed arrival of sensor readings gracefully (assuming they are timestamped at source). It seems like your approach is currently the winner. If you go with 2, I'd recommend to still create a view to encapsulate the complex query for the current values, to make the application development simpler. And if it gets slow, you can easily swap the view with a table, updated with triggers or periodically, without changing the application. The volume of sensor data is potentially huge, on the order of 500,000 updates per hour. Sensor data is few numeric(15,5) numbers. Whichever design you choose, you should also consider partitioning the data. I'll look into it, but we'll first see if we can get away with limiting the time the data needs to be available. Mr. Voras, One big benefit of partitioning is that you can prune old data with minimal impact to the running system. Doing a large bulk delete would be extremely I/O impacting without partion support. We use this for a DB log system and it allows us to simply truncate a day table instead of a delete -- much, much faster. Regards, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
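A rough sketch of option 3 from the discussion above, with invented table and column names (the real schema will differ, and concurrent first readings from the same sensor can still race in this simple form):

    CREATE TABLE sensor_current (
        sensor_id integer PRIMARY KEY,
        ts        timestamp with time zone NOT NULL,
        value     numeric(15,5) NOT NULL
    );

    CREATE OR REPLACE FUNCTION sensor_history_ins() RETURNS trigger AS $$
    BEGIN
        UPDATE sensor_current
           SET ts = NEW.ts, value = NEW.value
         WHERE sensor_id = NEW.sensor_id
           AND ts < NEW.ts;                  -- newer reading: refresh the current row
        IF NOT FOUND THEN
            -- either a brand new sensor, or the reading arrived out of order
            INSERT INTO sensor_current (sensor_id, ts, value)
            SELECT NEW.sensor_id, NEW.ts, NEW.value
             WHERE NOT EXISTS (SELECT 1 FROM sensor_current
                                WHERE sensor_id = NEW.sensor_id);
        END IF;
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER sensor_history_current
        AFTER INSERT ON sensor_history
        FOR EACH ROW EXECUTE PROCEDURE sensor_history_ins();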
Re: [PERFORM] Storing sensor data
On Thu, May 28, 2009 at 05:24:33PM +0200, Ivan Voras wrote: 2009/5/28 Kenneth Marshall k...@rice.edu: One big benefit of partitioning is that you can prune old data with minimal impact to the running system. Doing a large bulk delete would be extremely I/O impacting without partion support. We use this for a DB log system and it allows us to simply truncate a day table instead of a delete -- much, much faster. Thanks. I'll need to investigate how much administrative overhead and fragility partitioning will introduce since the data will also be replicated between 2 servers (I'm thinking of using Slony). Any experience with this combination? We use Slony1 on a number of databases, but none yet on which we use data partitioning. Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Any better plan for this query?..
On Wed, May 06, 2009 at 04:01:03PM +0800, Craig Ringer wrote: Dimitri wrote: Hi, any idea if there is a more optimal execution plan possible for this query: select S.REF as stref, S.NAME as stnm, H.HORDER as hord, H.BEGIN_DATE as hbeg, H.END_DATE as hend, H.NOTE as hnote from HISTORY H, STAT S where S.REF = H.REF_STAT and H.REF_OBJECT = '01' order by H.HORDER ; OK, so you're taking a simple: history INNER JOIN stat ON (stat.ref = history.ref_stat) then filtering for records with a particular value of history.ref_object and finally performing a sort. If I'm reading it right, the plan below does a sequential scan on the `stat' table. The stat table only has 1000 rows, so this isn't necessarily an unreasonable choice even if there is an appropriate index and even if not many of the rows will be needed. It then does an index scan of the history table looking for tuples with ref_object = '01' (text match). It hash joins the hashed results of the initial seq scan to the results of the index scan, and sorts the result. To me, that looks pretty reasonable. You might be able to avoid the hash join in favour of a nested loop scan of stat_ref_idx (looping over records from history.ref_stat where ref_object = '001') by providing a composite index on HISTORY(ref_stat, ref_object). I'm really not too sure, though; plan optimization isn't my thing, I'm just seeing if I can offer a few ideas. Table definitions: While not strictly necessary, it's a *REALLY* good idea to define a suitable PRIMARY KEY. Also, the `CHAR(n)' data type is evil. E.V.I.L. Use `varchar(n)' for bounded-length values, or `text' for unbounded fields, unless you REALLY want the crazy behaviour of `CHAR(n)'. I'm a little bit puzzled about why you seem to be doing lots of things with integer values stored in text strings, but that probably doesn't matter too much for the issue at hand. NOTE: The same query runs 2 times faster on MySQL. With InnoDB tables and proper transactional safety? Or using scary MyISAM tables and a just pray approach to data integrity? If you're using MyISAM tables I'm not surprised; MySQL with MyISAM is stunningly fast, but oh-my-god dangerous. -- Craig Ringer I just thought I would ask. Are you using the query cache in MySQL? If that is on, that could be the difference. Another thing to check, try issuing the selects concurrently: 2 at a time, 5 at a time, 10 at a time... and see if that has an effect on timing. In many of the benchmarks, MySQL will out perform PostgreSQL for very low numbers of clients. Once you are using more than a handful, PostgreSQL pulls ahead. Also, is this a completely static table? i.e. no updates or inserts. How is the performance with those happening? This should help you get a clearer picture of the performance. My two cents. Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
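A concrete version of the composite-index idea mentioned above (the index name is made up, and whether it actually helps depends on how selective ref_object is):

    CREATE INDEX history_stat_obj_idx ON history (ref_stat, ref_object);

That at least gives the planner the nested-loop option Craig describes as an alternative to hashing the whole stat table.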
Re: [PERFORM] Any better plan for this query?..
On Wed, May 06, 2009 at 02:49:23PM +0200, Dimitri wrote: The story is simple: for the launching of MySQL 5.4 I've done a testing comparing available on that time variations of InnoDB engines, and at the end by curiosity started the same test with PostgreSQL 8.3.7 to see if MySQL performance level is more close to PostgreSQL now (PG was a strong true winner before). For my big surprise MySQL 5.4 outpassed 8.3.7... However, analyzing the PostgreSQL processing I got a feeling something goes wrong on PG side.. So, now I've installed both 8.3.7 and 8.4beta1 to see more in depth what's going on. Currently 8.4 performs much better than 8.3.7, but there is still a room for improvement if such a small query may go faster :-) Rgds, -Dimitri On 5/6/09, Albe Laurenz laurenz.a...@wien.gv.at wrote: Dimitri wrote: I've run several tests before and now going in depth to understand if there is nothing wrong. Due such a single query time difference InnoDB is doing 2-3 times better TPS level comparing to PostgreSQL.. Why don't you use MySQL then? Or tune PostgreSQL? Yours, Laurenz Albe Another thought, have you tuned PostgreSQL for an in memory database? Those tuning options may be what is needed to improve the plan chosen by PostgreSQL. Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Any better plan for this query?..
No. Ken On Wed, May 06, 2009 at 10:04:33PM +0800, Craig Ringer wrote: Dimitri wrote: Hi Chris, the only problem I see here is it's 2 times slower vs InnoDB, so before I'll say myself it's ok I want to be sure there is nothing else to do.. :-) Can the genetic query optimizer come into play on small queries? -- Craig Ringer -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Any better plan for this query?..
On Wed, May 06, 2009 at 09:38:59AM +0200, Dimitri wrote: Hi, any idea if there is a more optimal execution plan possible for this query: select S.REF as stref, S.NAME as stnm, H.HORDER as hord, H.BEGIN_DATE as hbeg, H.END_DATE as hend, H.NOTE as hnote from HISTORY H, STAT S where S.REF = H.REF_STAT and H.REF_OBJECT = '01' order by H.HORDER ; EXPLAIN ANALYZE output on 8.4: QUERY PLAN Sort (cost=4549.75..4555.76 rows=2404 width=176) (actual time=1.341..1.343 rows=20 loops=1) Sort Key: h.horder Sort Method: quicksort Memory: 30kB - Hash Join (cost=33.50..4414.75 rows=2404 width=176) (actual time=1.200..1.232 rows=20 loops=1) Hash Cond: (h.ref_stat = s.ref) - Index Scan using history_ref_idx on history h (cost=0.00..4348.20 rows=2404 width=135) (actual time=0.042..0.052 rows=20 loops=1) Index Cond: (ref_object = '01'::bpchar) - Hash (cost=21.00..21.00 rows=1000 width=45) (actual time=1.147..1.147 rows=1000 loops=1) - Seq Scan on stat s (cost=0.00..21.00 rows=1000 width=45) (actual time=0.005..0.325 rows=1000 loops=1) Total runtime: 1.442 ms (10 rows) Table HISTORY contains 200M rows, only 20 needed Table STAT contains 1000 rows, only 20 needed to be joined to HISTORY values. Table definitions: create table STAT ( REF CHAR(3)not null, NAMECHAR(40) not null, NUMBINTnot null ); create table HISTORY ( REF_OBJECT CHAR(10) not null, HORDER INT not null, REF_STATCHAR(3) not null, BEGIN_DATE CHAR(12) not null, END_DATECHAR(12) , NOTECHAR(100) ); create unique index stat_ref_idx on STAT( ref ); create index history_ref_idx on HISTORY( ref_object, horder ); NOTE: The same query runs 2 times faster on MySQL. Any idea?.. Rgds, -Dimitri Dimitri, Is there any chance of profiling the postgres backend to see where the time is used? Just an idea, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Any better plan for this query?..
On Wed, May 06, 2009 at 04:48:21PM +0200, Dimitri wrote: Hi Ken, yes, I may do it, but I did not expect to come into profiling initially :-) I expected there is just something trivial within a plan that I just don't know.. :-) BTW, is there already an integrated profiled within a code? or do I need external tools?.. Rgds, -Dimitri I only suggested it because it might have the effect of changing the sequential scan on the stat table to an indexed scan. Cheers, Ken On 5/6/09, Kenneth Marshall k...@rice.edu wrote: On Wed, May 06, 2009 at 09:38:59AM +0200, Dimitri wrote: Hi, any idea if there is a more optimal execution plan possible for this query: select S.REF as stref, S.NAME as stnm, H.HORDER as hord, H.BEGIN_DATE as hbeg, H.END_DATE as hend, H.NOTE as hnote from HISTORY H, STAT S where S.REF = H.REF_STAT and H.REF_OBJECT = '01' order by H.HORDER ; EXPLAIN ANALYZE output on 8.4: QUERY PLAN Sort (cost=4549.75..4555.76 rows=2404 width=176) (actual time=1.341..1.343 rows=20 loops=1) Sort Key: h.horder Sort Method: quicksort Memory: 30kB - Hash Join (cost=33.50..4414.75 rows=2404 width=176) (actual time=1.200..1.232 rows=20 loops=1) Hash Cond: (h.ref_stat = s.ref) - Index Scan using history_ref_idx on history h (cost=0.00..4348.20 rows=2404 width=135) (actual time=0.042..0.052 rows=20 loops=1) Index Cond: (ref_object = '01'::bpchar) - Hash (cost=21.00..21.00 rows=1000 width=45) (actual time=1.147..1.147 rows=1000 loops=1) - Seq Scan on stat s (cost=0.00..21.00 rows=1000 width=45) (actual time=0.005..0.325 rows=1000 loops=1) Total runtime: 1.442 ms (10 rows) Table HISTORY contains 200M rows, only 20 needed Table STAT contains 1000 rows, only 20 needed to be joined to HISTORY values. Table definitions: create table STAT ( REF CHAR(3)not null, NAMECHAR(40) not null, NUMBINTnot null ); create table HISTORY ( REF_OBJECT CHAR(10) not null, HORDER INT not null, REF_STATCHAR(3) not null, BEGIN_DATE CHAR(12) not null, END_DATECHAR(12) , NOTECHAR(100) ); create unique index stat_ref_idx on STAT( ref ); create index history_ref_idx on HISTORY( ref_object, horder ); NOTE: The same query runs 2 times faster on MySQL. Any idea?.. Rgds, -Dimitri Dimitri, Is there any chance of profiling the postgres backend to see where the time is used? Just an idea, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] partition question for new server setup
On Tue, Apr 28, 2009 at 11:56:25AM -0600, Scott Marlowe wrote: On Tue, Apr 28, 2009 at 11:48 AM, Whit Armstrong armstrong.w...@gmail.com wrote: Thanks, Scott. Just to clarify you said: postgres. So, my pg_xlog and all OS and logging stuff goes on the RAID-10 and the main store for the db goes on the RAID-10. Is that meant to be that the pg_xlog and all OS and logging stuff go on the RAID-1 and the real database (the /var/lib/postgresql/8.3/main/base directory) goes on the RAID-10 partition? Yeah, an extra 0 jumped in there. Faulty keyboard I guess. :) OS and everything but base is on the RAID-1. This is very helpful. Thanks for your feedback. Additionally, are there any clear choices w/ regard to filesystem types? Our choices would be xfs, ext3, or ext4. Well, there's a lot of people who use xfs and ext3. XFS is generally rated higher than ext3 both for performance and reliability. However, we run Centos 5 in production, and XFS isn't one of the blessed file systems it comes with, so we're running ext3. It's worked quite well for us.

The other optimizations are using data=writeback when mounting the ext3 filesystem for PostgreSQL and using elevator=deadline for the disk driver. I do not know how you specify that for Ubuntu. Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
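By way of example, on a typical Linux box those two settings end up looking something like this (device names and mount points are placeholders; check your distribution's preferred mechanism):

    # /etc/fstab entry for the PostgreSQL data filesystem
    /dev/sdb1  /var/lib/postgresql  ext3  defaults,data=writeback  0 2

    # I/O scheduler, either at runtime ...
    echo deadline > /sys/block/sdb/queue/scheduler
    # ... or globally by adding elevator=deadline to the kernel boot line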
Re: [PERFORM] performance for high-volume log insertion
Hi, I just finished reading this thread. We are currently working on setting up a central log system using rsyslog and PostgreSQL. It works well once we patched the memory leak. We also looked at what could be done to improve the efficiency of the DB interface. On the rsyslog side, moving to prepared queries allows you to remove the escaping that needs to be done currently before attempting to insert the data into the SQL backend as well as removing the parsing and planning time from the insert. This is a big win for high insert rates, which is what we are talking about. The escaping process is also a big CPU user in rsyslog which then hands the escaped string to the backend which then has to undo everything that had been done and parse/plan the resulting query. This can use a surprising amount of additional CPU. Even if you cannot support a general prepared query interface, by specifying what the query should look like you can handle much of the low-hanging fruit query-wise. We are currently using a date based trigger to use a new partition each day and keep 2 months of logs currently. This can be usefully managed on the backend database, but if rsyslog supported changing the insert to the new table on a time basis, the CPU used by the trigger to support this on the backend could be reclaimed. This would be a win for any DB backend. As you move to the new partition, issuing a truncate to clear the table would simplify the DB interfaces. Another performance enhancement already mentioned, would be to allow certain extra fields in the DB to be automatically populated as a function of the log messages. For example, logging the mail queue id for messages from mail systems would make it much easier to locate particular mail transactions in large amounts of data. To sum up, eliminating the escaping in rsyslog through the use of prepared queries would reduce the CPU load on the DB backend. Batching the inserts will also net you a big performance increase. Some DB-based applications allow for the specification of several types of queries, one for single inserts and then a second to support multiple inserts (copy). Rsyslog already supports the queuing pieces to allow you to batch inserts. Just some ideas. Regards, Ken On Tue, Apr 21, 2009 at 09:56:23AM +0100, Richard Huxton wrote: da...@lang.hm wrote: On Tue, 21 Apr 2009, Stephen Frost wrote: * da...@lang.hm (da...@lang.hm) wrote: while I fully understand the 'benchmark your situation' need, this isn't that simple. It really is. You know your application, you know it's primary use cases, and probably have some data to play with. You're certainly in a much better situation to at least *try* and benchmark it than we are. rsyslog is a syslog server. it replaces (or for debian and fedora, has replaced) your standard syslog daemon. it recieves log messages from every app on your system (and possibly others), filters, maniulates them, and then stores them somewhere. among the places that it can store the logs are database servers (native support for MySQL, PostgreSQL, and Oracle. plus libdbi for others) Well, from a performance standpoint the obvious things to do are: 1. Keep a connection open, do NOT reconnect for each log-statement 2. Batch log statements together where possible 3. Use prepared statements 4. Partition the tables by day/week/month/year (configurable I suppose) The first two are vital, the third takes you a step further. The fourth is a long-term admin thing. And possibly 5. 
Have two connections, one for fatal/error etc and one for info/debug level log statements (configurable split?). Then you can use the synchronous_commit setting on the less important ones. Might buy you some performance on a busy system. http://www.postgresql.org/docs/8.3/interactive/runtime-config-wal.html#RUNTIME-CONFIG-WAL-SETTINGS other apps then search and report on the data after it is stored. what apps?, I don't know either. pick your favorite reporting tool and you'll be a step ahead of me (I don't know a really good reporting tool) as for sample data, you have syslog messages, just like I do. so you have the same access to data that I have. how would you want to query them? how would people far less experianced that you want to query them? I can speculate that some people would do two columns (time, everything else), others will do three (time, server, everything else), and others will go further (I know some who would like to extract IP addresses embedded in a message into their own column). some people will index on the time and host, others will want to do full-text searches of everything. Well, assuming it looks much like traditional syslog, I would do something like: (timestamp, host, facility, priority, message). It's easy enough to stitch back together if people want that. PostgreSQL's full-text indexing is quite well suited to logfiles I'd have thought, since it knows about filenames,
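To make the prepared-statement point above concrete, the SQL-level equivalent of what a driver does with a parameterised insert is roughly the following (table and column names invented for illustration):

    PREPARE log_ins (timestamptz, text, text, text, text) AS
        INSERT INTO syslog_lines (recv_at, host, facility, priority, message)
        VALUES ($1, $2, $3, $4, $5);

    EXECUTE log_ins (now(), 'host1', 'mail', 'info', 'connection from 192.0.2.1');

Because the values travel as parameters, nothing in the message has to be escaped, and the statement is parsed and planned once rather than once per row.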
Re: [PERFORM] performance for high-volume log insertion
On Tue, Apr 21, 2009 at 08:37:54AM -0700, da...@lang.hm wrote: Kenneth, could you join the discussion on the rsyslog mailing list? rsyslog-users rsys...@lists.adiscon.com I'm surprised to hear you say that rsyslog can already do batch inserts and am interested in how you did that. what sort of insert rate did you mange to get? David Lang David, I would be happy to join the discussion. I did not mean to say that rsyslog currently supported batch inserts, just that the pieces that provide stand-by queuing could be used to manage batching inserts. Cheers, Ken On Tue, 21 Apr 2009, Kenneth Marshall wrote: Date: Tue, 21 Apr 2009 08:33:30 -0500 From: Kenneth Marshall k...@rice.edu To: Richard Huxton d...@archonet.com Cc: da...@lang.hm, Stephen Frost sfr...@snowman.net, Greg Smith gsm...@gregsmith.com, pgsql-performance@postgresql.org Subject: Re: [PERFORM] performance for high-volume log insertion Hi, I just finished reading this thread. We are currently working on setting up a central log system using rsyslog and PostgreSQL. It works well once we patched the memory leak. We also looked at what could be done to improve the efficiency of the DB interface. On the rsyslog side, moving to prepared queries allows you to remove the escaping that needs to be done currently before attempting to insert the data into the SQL backend as well as removing the parsing and planning time from the insert. This is a big win for high insert rates, which is what we are talking about. The escaping process is also a big CPU user in rsyslog which then hands the escaped string to the backend which then has to undo everything that had been done and parse/plan the resulting query. This can use a surprising amount of additional CPU. Even if you cannot support a general prepared query interface, by specifying what the query should look like you can handle much of the low-hanging fruit query-wise. We are currently using a date based trigger to use a new partition each day and keep 2 months of logs currently. This can be usefully managed on the backend database, but if rsyslog supported changing the insert to the new table on a time basis, the CPU used by the trigger to support this on the backend could be reclaimed. This would be a win for any DB backend. As you move to the new partition, issuing a truncate to clear the table would simplify the DB interfaces. Another performance enhancement already mentioned, would be to allow certain extra fields in the DB to be automatically populated as a function of the log messages. For example, logging the mail queue id for messages from mail systems would make it much easier to locate particular mail transactions in large amounts of data. To sum up, eliminating the escaping in rsyslog through the use of prepared queries would reduce the CPU load on the DB backend. Batching the inserts will also net you a big performance increase. Some DB-based applications allow for the specification of several types of queries, one for single inserts and then a second to support multiple inserts (copy). Rsyslog already supports the queuing pieces to allow you to batch inserts. Just some ideas. Regards, Ken On Tue, Apr 21, 2009 at 09:56:23AM +0100, Richard Huxton wrote: da...@lang.hm wrote: On Tue, 21 Apr 2009, Stephen Frost wrote: * da...@lang.hm (da...@lang.hm) wrote: while I fully understand the 'benchmark your situation' need, this isn't that simple. It really is. You know your application, you know it's primary use cases, and probably have some data to play with. 
You're certainly in a much better situation to at least *try* and benchmark it than we are. rsyslog is a syslog server. it replaces (or for debian and fedora, has replaced) your standard syslog daemon. it recieves log messages from every app on your system (and possibly others), filters, maniulates them, and then stores them somewhere. among the places that it can store the logs are database servers (native support for MySQL, PostgreSQL, and Oracle. plus libdbi for others) Well, from a performance standpoint the obvious things to do are: 1. Keep a connection open, do NOT reconnect for each log-statement 2. Batch log statements together where possible 3. Use prepared statements 4. Partition the tables by day/week/month/year (configurable I suppose) The first two are vital, the third takes you a step further. The fourth is a long-term admin thing. And possibly 5. Have two connections, one for fatal/error etc and one for info/debug level log statements (configurable split?). Then you can use the synchronous_commit setting on the less important ones. Might buy you some performance on a busy system. http://www.postgresql.org/docs/8.3/interactive/runtime-config-wal.html#RUNTIME-CONFIG-WAL-SETTINGS other apps then search and report on the data after it is stored. what apps?, I don't know either. pick your favorite reporting tool and you'll
Re: [PERFORM] performance for high-volume log insertion
On Tue, Apr 21, 2009 at 11:09:18AM -0700, da...@lang.hm wrote: On Tue, 21 Apr 2009, Greg Smith wrote: On Mon, 20 Apr 2009, da...@lang.hm wrote: while I fully understand the 'benchmark your situation' need, this isn't that simple. in this case we are trying to decide what API/interface to use in a infrastructure tool that will be distributed in common distros (it's now the default syslog package of debian and fedora), there are so many variables in hardware, software, and load that trying to benchmark it becomes effectivly impossible. From your later comments, you're wandering a bit outside of what you were asking about here. Benchmarking the *query* side of things can be extremely complicated. You have to worry about memory allocation, cold vs. warm cache, scale of database relative to RAM, etc. You were asking specifically about *insert* performance, which isn't nearly as complicated. There are basically three setups: 1) Disk/controller has a proper write cache. Writes and fsync will be fast. You can insert a few thousand individual transactions per second. 2) Disk/controller has a lying write cache. Writes and fsync will be fast, but it's not safe for database use. But since (1) is expensive and this one you can get for free jut by using a regular SATA drive with its write cache enabled, you can use this case as a proxy for approximately how (1) would act. You'll still get a few thousand transactions per second, sustained writes may slow down relative to (1) if you insert enough that you hit a checkpoint (triggering lots of random I/O). 3) All write caches have been disabled because they were not battery-backed. This is the case if you have a regular SATA drive and you disable its write cache because you care about write durability. You'll get a bit less than RPM/60 writes/second, so 120 inserts/second with a typical 7200RPM drive. Here batching multiple INSERTs together is critical to get any sort of reasonable performance. in case #1 would you expect to get significant gains from batching? doesn't it suffer from problems similar to #2 when checkpoints hit? Even with a disk controller with a proper write cache, the latency for single-insert-at-a-time will keep the number of updates to the low thousands per second (on the controllers I have used). If you can batch them, it would not be unreasonable to increase performance by an order of magnitude or more. At the high end, other issues like CPU usage can restrict performance. Ken In (3), I'd expect that trivia like INSERT vs. COPY and COPY BINARY vs. COPY TEXT would be overwhelmed by the overhead of the commit itself. Therefore you probably want to test with case (2) instead, as it doesn't require any additional hardware but has similar performance to a production-worthy (1). All of the other things you're worried about really don't matter here; you can get an approximate measure of what the performance of the various INSERT/COPY schemes are that is somewhat platform dependant, but the results should be good enough to give you some rule of thumb suggestions for whether optimizations are significant enough to justify the coding effort to implement them or not. I'll see about setting up a test in the next day or so. should I be able to script this through psql? or do I need to write a C program to test this? I'm not sure whether you're familiar with all the fsync trivia here. In normal syslog use, there's an fsync call after every write. 
You can disable that by placing a - before the file name in /etc/syslog.conf The thing that is going to make database-based writes very different is that syslog's fsync'd writes are unlikely to leave you in a bad state if the drive lies about them, while database writes can. So someone using syslog on a standard SATA drive isn't getting the write guarantee they think they are, but the downside on a crash is minimal. If you've got a high-volume syslog environment (100 lines/second), you can't support those as individual database writes unless you've got a battery-backed write controller. A regular disk just can't process genuine fsync calls any faster than that. A serious syslog deployment that turns fsync on and expects it to really do its thing is already exposed to this issue though. I think it may be a the case that a lot of people think they have durable writes in their configuration but really don't. rsyslog is a little different, instead of just input - disk it does input - queue - output (where output can be many things, including disk or database) it's default is to use memory-based queues (and no fsync), but has config options to do disk based queues with a fsync after each update David Lang -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance -- Sent
Re: [PERFORM] postgreSQL performance 8.2.6 vs 8.3.3
On Fri, Feb 20, 2009 at 04:34:23PM -0500, Battle Mage wrote: I have a server box that has 4GB of RAM, Quad core CPU AMD Opteron 200.152 Mhz (1024 KB cache size each) with plenty of hard drive space. I installed both postgresql 8.2.6 and 8.3.3 on it. I've created a basic test db and used pgbench -i -s 1 -U test -h localhost test to create a sample test db. Then, to benchmark the postgreSQLs, I executed this separately on each of them: pg_bench -h localhost -d test -t 2000 -c 50 -s 50 -U test (2000 transactions per client, 50 clients, scalability factor of 50) Using the above, I get on postgreSQL 8.2.6: Load average: Between 3.4 and 4.3 tps = 589 (including connections establishing) tps = 590 (excluding connections establishing) I get on postgreSQL 8.3.3 Load: Between 4.5 and 5.6 tps = 949 (including connections establishing) tps = 951 (excluding connections establishing) The amount of tps almost doubled, which is good, but i'm worried about the load. For my application, a load increase is bad and I'd like to keep it just like in 8.2.6 (a load average between 3.4 and 4.3). What parameters should I work with to decrease the resulting load average at the expense of tps? Down below is my 8.3.3 configuration file. I removed everything that is commented since if it's commented, it's default value. I also removed from the sample below parameters related to logging. Please evaluate your load on the 8.3.3 box at 590 tps. If the load is proportional to the tps than the scaled load will be: 2.8 to 3.5 for an equivalent tps. There is no free lunch but 8.3 performs much better than 8.2 and I suspect that this trend will continue. :) Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [HACKERS] [PERFORM] GIST versus GIN indexes for intarrays
On Fri, Feb 13, 2009 at 04:12:53PM +0300, Teodor Sigaev wrote:

  The short-term workaround for Rusty is probably to create his GIN index using the intarray-provided gin__int_ops opclass.

Right.

  But it seems to me that we ought to get rid of intarray's @> and <@ operators and have the module depend on the core anyarray operators, just as we have already done for = and <>. Comments?

Agree, will do. Although built-in anyarray operators have ~N^2 behaviour while intarray's version - only N*log(N)

Is there a way to have the built-in anyarray operators be N*log(N)? Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] strange index performance?
On Mon, Jan 26, 2009 at 10:10:13AM +0100, Thomas Finneid wrote: Scott Marlowe wrote: I'm guessing that you just had more data in the table or something by the time you tested that, or some cron job was running in the background, or some other issue, not the index. It starts from scratch and builds up. Every insert has constant time from the first to the last row, i.e. row 1 to row 1.2 billion. There are no background jobs or other disturbances. Quite a similar machine. write back cache with battery backed controller on the controller? A really old Areca like an 11xx series or a newer one 12xx, 16xx? It's an Areca 1220. Write back is enabled but it does not have a BBU, because it's a development machine and not a production machine. 0.12 seconds per insert is pretty slow. 10 inserts would take a second. I'm inserting 10,000 rows in about 2 seconds. Each insert is definitely in the 0.12 millisecond range. I see the confusion. I use COPY (JDBC), not INSERT, so one transaction contains 20,000 rows, which is copy inserted in 300 ms, so that gives a per-row insert time of 0.015 ms. So I actually have pretty decent write performance. If I remove the index, the copy insert only takes about 125 ms. So the index update time amounts to half the total update time. This still leaves me with the question of why the smaller index (id1,3,4) takes longer to update than the larger index (id1,2,3,4). Updating an index like id1,2,3 should take a shorter time (I have to test it first to verify), so a similar index, id1,3,4, should take approximately the same time. Could it have something to do with the smaller index being more complicated to fill in? Could the placing of the id2 field in the table have anything to say about it?

It may be that the smaller index has update contention for the same blocks that the larger index does not. Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] strange index performance?
On Mon, Jan 26, 2009 at 03:49:00PM +0100, Thomas Finneid wrote: Kenneth Marshall wrote: It may be that the smaller index has update contention for the same blocks that the larger index does not. Is that an assumption based on both indexes existing? if so I might agree, but if you are talking about only one index existing at a time then could you explain what the basis for you conclusion is? regards thomas The small index blocks would look like: |abcd|efgh|ijkl|... and the large index: |axxx|...|bxxx|...|cxxx|... and so on. Now, if you try to update a-k, the small index will be trying to update and possibly rearrange/split/... items on the same disk blocks while the larger index would be updating without contention. It may not even be block level contention, the same argument applies to cachelines with in a block. Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Question about clustering indexes and restores
On Thu, Jan 22, 2009 at 02:52:12PM -0500, Harold A. Giménez Ch. wrote: Hi list, Clustering my indexes dramatically improves the query performance of many of my queries. Also, the actual clustering takes a very long time for big databases, roughly 20 hours. I have two questions about how to improve this: 1. I've tweaked maintenance_mem_max and effective_cache_size to a point where the cluster operation uses a good chunk of my physical RAM, and the OS does not start swapping. Is there any other parameter I should look at? 2. Reading the documentation for cluster at http://www.postgresql.org/docs/8.3/interactive/sql-cluster.html, I see that all clustering does is reorder the data on disk to 'match' the order of the clustered index. My question is, if I dump a clustered database using pg_dump in custom format, is it necessary to cluster after restoring it? Or does a dump/restore not guarantee that the order of the data restored is the same as the original dumped database? 3. Somewhat related to #2, what is the best way to move data from a staging database on one server, to the production environment on a different server? I've been using pg_dump/pg_restore, but there must be a better way... Thanks for any pointers, -Harold

Harold, There have been discussions on the hackers list about the pessimal cluster performance. Here is a pointer to the discussion; it seems that a faster way is to build a new table with the desired order with CREATE TABLE AS ... ORDER BY ...: http://www.mail-archive.com/pgsql-hack...@postgresql.org/msg121205.html Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
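For completeness, the alternative discussed in that thread looks roughly like this (a sketch; table and column names are placeholders, and indexes, constraints and permissions have to be recreated on the new table before it takes over):

    BEGIN;
    CREATE TABLE big_table_sorted AS
        SELECT * FROM big_table ORDER BY the_cluster_column;
    -- recreate indexes/constraints on big_table_sorted here
    DROP TABLE big_table;
    ALTER TABLE big_table_sorted RENAME TO big_table;
    COMMIT;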
Re: [PERFORM] caching indexes and pages?
On Thu, Jan 22, 2009 at 10:58:25PM +0100, Thomas Finneid wrote: Thomas Markus wrote: try to reorganize your data with CLUSTER and create appropriate indixes (dont forget to check statistics). One question. Assume I have clustered and new data has been added after that, according to the docs that data is added outside of the clustered data. What happens when I run cluster again? I would assume its smart and to only clusteres the new data, i.e. adding it to the already created clusters, as apporpriate, so the execution time would be a lot lower, right? or would it run through and recluster everything from scratch again? thomas It reclusters again from scratch. You do get better performance on the reads from the data that is already clustered. Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Need help with 8.4 Performance Testing
Josh, Since a number of these performance patches use our hash function, would it make sense to apply the last patch to upgrade the hash function mix() to the two function mix()/final()? Since the additional changes increases the performance of the hash function by another 50% or so. My two cents. Regards, Ken On Sun, Dec 07, 2008 at 11:38:01AM -0800, Josh Berkus wrote: Database performance geeks, We have a number of patches pending for 8.4 designed to improve database performance in a variety of circumstances. We need as many users as possible to build test versions of PostgreSQL with these patches, and test how well they perform, and report back in some detail. Particularly, users with unusual hardware architectures (16 or more cores, ARM, Power, SSD, NFS-mounted data) or operating systems (Solaris, OSX, Windows-64) are really helpful. Testers need to be familiar with building PostgreSQL from source and patching it, as well as basic PostgreSQL Tuning (except for the Wizard Patch) and have some kind of performance test available, ideally something based on your own application use. If you are going to use pgbench to test, *please* read Greg Smith's notes first: http://www.westnet.com/~gsmith/gregsmith/content/postgresql/pgbench-scaling.htm The Wiki (http://wiki.postgresql.org/wiki/CommitFest_2008-11) has a full list of patches, but below are the ones in particular we could use help with. You *do* need to read the entire mail threads which I link to below to understand the patches. Thanks for your help! Proposal of PITR performance improvement (Koichi Suzuki): http://archives.postgresql.org/message-id/[EMAIL PROTECTED] http://archives.postgresql.org/message-id/[EMAIL PROTECTED] Simple postgresql.conf wizard http://archives.postgresql.org/message-id/[EMAIL PROTECTED] http://archives.postgresql.org/message-id/[EMAIL PROTECTED] Improve Performance of Multi-Batch Hash Join for Skewed Data Sets http://archives.postgresql.org/message-id/[EMAIL PROTECTED] http://archives.postgresql.org/message-id/[EMAIL PROTECTED] http://archives.postgresql.org/message-id/[EMAIL PROTECTED] Window Functions http://archives.postgresql.org/message-id/[EMAIL PROTECTED] http://archives.postgresql.org/message-id/[EMAIL PROTECTED] http://archives.postgresql.org/message-id/[EMAIL PROTECTED] parallel restore (especially need to test on 16+ cores) http://archives.postgresql.org/message-id/[EMAIL PROTECTED] B-Tree emulation for GIN http://archives.postgresql.org/message-id/[EMAIL PROTECTED] http://archives.postgresql.org/message-id/[EMAIL PROTECTED] Also, the following patches currently still have bugs, but when the bugs are fixed I'll be looking for performance testers, so please either watch the wiki or watch this space: -- Block-level CRC checks (Alvaro Herrera) -- Auto Partitioning Patch (Nikhil Sontakke) -- posix_fadvise (Gregory Stark) -- Hash Join-Filter Pruning using Bloom Filters -- On-disk bitmap indexes Please report your results, with the patchname in the subject line, on this mailing list or on -hackers. Thank you, and your help will get a better 8.4 out sooner. -- Josh Berkus PostgreSQL San Francisco -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] 8.3.1 vs 8.2.X on HP-UX PA-RISC 11.11/11.23
Are you using the same locales for both? Ken On Wed, Jun 11, 2008 at 09:40:20PM -0400, Josh Rovero wrote: We run GCC-compiled postgresql on a number of HP-UX and Linux boxes. Our measurements to date show 8.3.1 performance to be about 30% *worse* than 8.2 on HP-UX for the same "drink the firehose" insert/update/delete benchmarks. Linux performance is fine. Tweaking the new 8.3.1 synchronous_commit and bg writer delays that *should* speed things up actually makes them a bit worse, again only on HP-UX PA-RISC 11.11 and 11.23. Right now it's 32 bit, both for 8.2 and 8.3. Any hints? P. J. Rovero -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] shared_buffers in 8.3 w/ lots of RAM on dedicated PG machine
On Fri, Feb 15, 2008 at 01:35:29PM +0100, Peter Schuller wrote: Hello, my impression has been that in the past, there has been a general semi-consensus that upping shared_buffers to use the majority of RAM has not generally been recommended, with reliance on the buffer cache instead being the recommendation. Given the changes that have gone into 8.3, in particular with regards to minimizing the impact of large sequential scans, would it be correct to say that given that - enough memory is left for other PG bits (sort mems and whatnot else) - only PG is running on the machine - you're on 64 bit so do not run into address space issues - the database working set is larger than RAM it would be generally advisable to pump up shared_buffers pretty much as far as possible instead of relying on the buffer cache? -- / Peter Schuller PGP userID: 0xE9758B7D or 'Peter Schuller [EMAIL PROTECTED]' Key retrieval: Send an E-Mail to [EMAIL PROTECTED] E-Mail: [EMAIL PROTECTED] Web: http://www.scode.org Peter, PostgreSQL still depends on the OS for file access and caching. I think that the current recommendation is to have up to 25% of your RAM in the shared buffer cache. Ken ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
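To make the 25% guideline concrete, a starting point for a hypothetical dedicated 64-bit server with 16GB of RAM might look like the lines below; the values are illustrative only, not a recommendation for Peter's workload:

    # postgresql.conf (8.3) -- illustrative starting point, 16GB dedicated box
    shared_buffers = 4GB              # roughly 25% of RAM
    effective_cache_size = 12GB       # what the planner may assume the OS cache adds
    work_mem = 32MB                   # per sort/hash, per backend, so keep it modest
    maintenance_work_mem = 512MB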
Re: [PERFORM] Anyone using a SAN?
On Wed, Feb 13, 2008 at 10:56:54AM -0600, Peter Koczan wrote: Hi all, We're considering setting up a SAN where I work. Is there anyone using a SAN, for postgres or other purposes? If so I have a few questions for you. - Are there any vendors to avoid or ones that are particularly good? - What performance or reliability implications exist when using SANs? - Are there any killer features with SANs compared to local storage? Any other comments are certainly welcome. Peter Peter, The key is to understand your usage patterns, both I/O and query. SANs can be easily bandwidth limited which can tank your database performance. There have been several threads in the mailing list about performance problems caused by the use of a SAN for storage. Cheers, Ken ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Benchmark Data requested --- pgloader CE design ideas
On Thu, Feb 07, 2008 at 12:06:42PM -0500, Greg Smith wrote:
> On Thu, 7 Feb 2008, Dimitri Fontaine wrote:
>> I was thinking of not even reading the file content from the controller
>> thread, just decide splitting points in bytes (0..ST_SIZE/4 -
>> ST_SIZE/4+1..2*ST_SIZE/4 etc) and let the reading thread fine-tune by
>> beginning to process input after having read first newline, etc.
>
> The problem I was pointing out is that if chunk#2 moved forward a few
> bytes before it started reading in search of a newline, how will chunk#1
> know that it's supposed to read up to that further point? You have to
> stop #1 from reading further when it catches up with where #2 started.
> Since the start of #2 is fuzzy until some reading is done, what you're
> describing will need #2 to send some feedback to #1 after they've both
> started, and that sounds bad to me. I like designs where the boundaries
> between threads are clearly defined before any of them start and none of
> them ever talk to the others.

As long as both processes understand the start condition, there is not a
problem. p1 starts at beginning and processes through chunk2 offset until it
reaches the start condition. p2 starts loading from chunk2 offset plus the
amount needed to reach the start condition, ...

DBfile   |---------x|--------x---|------x-----|...
          chunk1---^
                    chunk2---^
                              chunk3----^ ...

As long as both pieces use the same test, they will each process
non-overlapping segments of the file and still process 100% of the file.

Ken

>> In both cases, maybe it would also be needed for pgloader to be able to
>> have a separate thread for COPYing the buffer to the server, allowing it
>> to continue preparing next buffer in the meantime?
>
> That sounds like a V2.0 design to me. I'd only chase after that level of
> complexity if profiling suggests that's where the bottleneck really is.
>
> --
> * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?
       http://www.postgresql.org/docs/faq
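A minimal sketch of the boundary rule Ken spells out, written as a standalone C program rather than pgloader's actual code: every chunk's nominal start and end offsets are advanced to the next line boundary by the same test, so the chunks cover the file exactly once with no coordination between workers.

    #include <stdio.h>

    /* Return the first line boundary at or after 'start': offset 0, or the
     * first byte following a '\n' found at or after start - 1.  Both ends of
     * every chunk use this same test. */
    static long align_to_newline(FILE *f, long start)
    {
        int c;

        if (start == 0)
            return 0;                   /* chunk 1 always begins the file */
        fseek(f, start - 1, SEEK_SET);
        while ((c = fgetc(f)) != EOF && c != '\n')
            ;                           /* scan forward to the next newline */
        return ftell(f);                /* at EOF, end of file is the boundary */
    }

    int main(int argc, char **argv)
    {
        FILE *f;
        long  size;
        int   i, nchunks = 4;

        if (argc < 2 || (f = fopen(argv[1], "rb")) == NULL)
            return 1;
        fseek(f, 0, SEEK_END);
        size = ftell(f);

        for (i = 0; i < nchunks; i++)
        {
            long lo = align_to_newline(f, i * size / nchunks);
            long hi = align_to_newline(f, (i + 1) * size / nchunks);
            printf("chunk %d: bytes %ld .. %ld\n", i + 1, lo, hi);
        }
        fclose(f);
        return 0;
    }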
Re: [PERFORM] io storm on checkpoints, postgresql 8.2.4, linux
On Wed, Aug 22, 2007 at 07:33:35PM +0400, Dmitry Potapov wrote: Hello! We run a large (~66Gb) web-backend database on Postgresql 8.2.4 on Linux. The hardware is Dual Xeon 5130 with 16Gb ram, LSI Megaraid U320-2x scsi controller w/512Mb writeback cache and a BBU. Storage setup contains 3 raid10 arrays (data, xlog, indexes, each on different array), 12 HDDs total. Frontend application uses jdbc driver, connection pooling and threads. We've run into an issue of IO storms on checkpoints. Once in 20min (which is checkpoint_interval) the database becomes unresponsive for about 4-8 seconds. Query processing is suspended, server does nothing but writing a large amount of data to disks. Because of the db server being stalled, some of the web clients get timeout and disconnect, which is unacceptable. Even worse, as the new requests come at a pretty constant rate, by the time this storm comes to an end there is a huge amount of sleeping app. threads waiting for their queries to complete. After the db server comes back to life again, these threads wake up and flood it with queries, so performance suffer even more, for some minutes after the checkpoint. It seemed strange to me that our 70%-read db generates so much dirty pages that writing them out takes 4-8 seconds and grabs the full bandwidth. First, I started to tune bgwriter to a more aggressive settings, but this was of no help, nearly no performance changes at all. Digging into the issue further, I discovered that linux page cache was the reason. Dirty parameter in /proc/meminfo (which shows the amount of ready-to-write dirty data currently sitting in page cache) grows between checkpoints from 0 to about 100Mb. When checkpoint comes, all the 100mb got flushed out to disk, effectively causing a IO storm. I found this (http://www.westnet.com/~gsmith/content/linux-pdflush.htm http://www.westnet.com/%7Egsmith/content/linux-pdflush.htm) document and peeked into mm/page-writeback.c in linux kernel source tree. I'm not sure that I understand pdflush writeout semantics correctly, but looks like when the amount of dirty data is less than dirty_background_ratio*RAM/100, pdflush only writes pages in background, waking up every dirty_writeback_centisecs and writing no more than 1024 pages (MAX_WRITEBACK_PAGES constant). When we hit dirty_background_ratio, pdflush starts to write out more agressively. So, looks like the following scenario takes place: postgresql constantly writes something to database and xlog files, dirty data gets to the page cache, and then slowly written out by pdflush. When postgres generates more dirty pages than pdflush writes out, the amount of dirty data in the pagecache is growing. When we're at checkpoint, postgres does fsync() on the database files, and sleeps until the whole page cache is written out. By default, dirty_background_ratio is 2%, which is about 328Mb of 16Gb total. Following the curring pdflush logic, nearly this amount of data we face to write out on checkpoint effective stalling everything else, so even 1% of 16Gb is too much. My setup experience 4-8 sec pause in operation even on ~100Mb dirty pagecache... I temporaly solved this problem by setting dirty_background_ratio to 0%. This causes the dirty data to be written out immediately. It is ok for our setup (mostly because of large controller cache), but it doesn't looks to me as an elegant solution. Is there some other way to fix this issue without disabling pagecache and the IO smoothing it was designed to perform? 
-- Regards, Dmitry Dmitry, You are working at the correct level. The bgwriter performs the I/O smoothing function at the database level. Obviously, the OS level smoothing function needed to be tuned and you have done that within the parameters of the OS. You may want to bring this up on the Linux kernel lists and see if they have any ideas. Good luck, Ken ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
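For reference, the thresholds Dmitry is describing are the kernel's vm.dirty_* sysctls; the snippet below is only an illustration of his workaround (sysctl.conf syntax), not a recommendation for other setups:

    # Limit how much dirty page cache can accumulate between checkpoints, so
    # the fsync() at checkpoint time has far less to flush in one burst.
    vm.dirty_background_ratio = 0    # Dmitry's setting: pdflush writes dirty data out almost at once
    vm.dirty_ratio = 10              # companion knob: hard limit before writers are throttled
    # apply with: sysctl -p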
Re: [PERFORM] Update table performance
Mark, You are not alone in the fact that when you post your system specifications, CPU and memory are always listed while the disk I/O subsystem invariably is not. This is a very disk intensive operation and I suspect that your disk system is maxed-out. If you want it faster, you will need more I/O capacity. Regards, Ken On Tue, Aug 07, 2007 at 05:58:35AM -0700, Mark Makarowsky wrote: I have a table with 4,889,820 records in it. The table also has 47 fields. I'm having problems with update performance. Just as a test, I issued the following update: update valley set test='this is a test' This took 905641 ms. Isn't that kind of slow? There aren't any indexes, triggers, constraints or anything on this table. The version of Postgres is PostgreSQL 8.2.4 on i686-pc-mingw32, compiled by GCC gcc.exe (GCC) 3.4.2 (mingw-special). The operating environment is Windows 2003 Standard Edition w/service pack 2. It is 2.20 Ghz with 1.0 GB of RAM. Here is the results from Explain: Seq Scan on valley (cost=0.00..1034083.57 rows=4897257 width=601) Here are the settings in the postgresql.conf. Any ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
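Some rough arithmetic backs up Ken's diagnosis (assuming the planner's 601-byte row width is close to the on-disk row size): 4,889,820 rows times roughly 601 bytes is about 2.9 GB of new row versions, and with WAL on top of that the update pushes well past 3 GB through the disk, so 905 seconds works out to only a few MB/s of sustained write throughput -- about what a single desktop-class drive in a 1 GB Windows machine can be expected to deliver under load.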
Re: [PERFORM] upgraded to pgsql 8.2.4, getting worse performance then 7.4.x
On Sat, Jun 02, 2007 at 09:13:32AM -0400, Douglas J Hunley wrote:
> Hello great gurus of performance:
>
> Our 'esteemed' Engr group recently informed a customer that in their
> testing, upgrading to 8.2.x improved the performance of our J2EE
> application approximately 20%, so of course, the customer then tasked me
> with upgrading them. We dumped their db, removed pgsql, installed the
> 8.2.4 rpms from postgresql.org, did an initdb, and then pg_restored their
> data. It's been about a week now, and the customer is complaining that in
> their testing, they are seeing a 30% /decrease/ in general performance.
> Of course, our Engr group is being less than responsive, and I have a
> feeling all they're doing is googling for answers, so I'm turning to this
> group for actual assistance :)
>
> I'd like to start by examining the postgresql.conf file. Under 7.4.x, we
> had spent the better part of their 2 years as a customer tuning and
> tweaking settings. I've attached the file that was in place at the time
> of upgrade. I did some cursory googling of my own, and quickly realized
> that enough has changed in v8 that I'm not comfortable making the exact
> same modifications to their new config file as some options are new, some
> have gone away, etc. I've attached the existing v8 conf file as well.
>
> I'd really like it if someone could assist me in determining which of the
> v8 options need adjusting to be 'functionally equivalent' to the v7 file.
> Right now, my goal is to get the customer back to the previous level of
> performance, and only then pursue further optimization. I can provide any
> and all information needed, but didn't know what to include initially, so
> I've opted to include the minimal :)
>
> The DB server in question does nothing else, is running CentOS 4.5,
> kernel 2.6.9-55.ELsmp. Hyperthreading is disabled in the BIOS and there
> are 2 Xeon 3.4Ghz cpus. There is 8Gb of RAM in the machine, and another
> 8Gb of swap.
>
> Thank you in advance for any and all assistance you can provide.
> --
> Douglas J Hunley (doug at hunley.homeip.net) - Linux User #174778
> http://doug.hunley.homeip.net

Douglas,

If these are the current config files, it is no wonder that the performance
is worse. Here are the things that need to be changed right from the start.
The old 7.x value is on the left and the 8.2 value is on the right. Make them
the same to start and see how it looks then.

setting                       7.x       current 8.2
---------------------------------------------------
shared_buffers             = 25000   /  32MB  (=3906)
sort_mem/work_mem          = 15000   /  1MB   (=122)
vacuum_mem/maint_work_mem  = 10      /  16MB  (=1950)
effective_cache            = 196608  /  128MB (=15600)   should start between 200k-500k

These changes alone should get you back to the performance point you are
expecting. It would also be worth re-evaluating whether or not you should be
disabling enable_mergehashjoin in general, and not just for specific problem
queries. I would also tend to start with an effective_cache at the higher end
on a dedicated DB server. Good luck with your tuning. If the 8.2 config file
you posted is the one that has been in use, these few changes will restore
your performance and then some.

Ken

---(end of broadcast)---
TIP 4: Have you searched our list archives?
       http://archives.postgresql.org
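Spelled out in 8.2 configuration syntax, the 7.x column above translates roughly to the following (illustrative; the vacuum_mem figure in the table is ambiguous, so it is omitted):

    shared_buffers = 200MB            # 7.x: 25000 x 8kB buffers
    work_mem = 15MB                   # 7.x: sort_mem = 15000 (kB)
    effective_cache_size = 1536MB     # 7.x: 196608 x 8kB pages; 200k-500k pages is roughly 1.6-4GB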
Re: [PERFORM] PostgreSQL and Xeon MP
On Thu, Mar 16, 2006 at 11:45:12AM +0100, Guillaume Smet wrote: Hello, We are experiencing performances problem with a quad Xeon MP and PostgreSQL 7.4 for a year now. Our context switch rate is not so high but the load of the server is blocked to 4 even on very high load and we have 60% cpu idle even in this case. Our database fits in RAM and we don't have any IO problem. I saw this post from Tom Lane http://archives.postgresql.org/pgsql-performance/2004-04/msg00249.php and several other references to problem with Xeon MP and I suspect our problems are related to this. We tried to put our production load on a dual standard Xeon on monday and it performs far better with the same configuration parameters. I know that work has been done by Tom for PostgreSQL 8.1 on multiprocessor support but I didn't find any information on if it solves the problem with Xeon MP or not. My question is should we expect a resolution of our problem by switching to 8.1 or will we still have problems and should we consider a hardware change? We will try to upgrade next tuesday so we will have the real answer soon but if anyone has any experience or information on this, he will be very welcome. Thanks for your help. -- Guillaume Guillaume, We had a similar problem with poor performance on a Xeon DP and PostgreSQL 7.4.x. 8.0 came out in time for preliminary testing but it did not solve the problem and our production systems went live using a different database product. We are currently testing against 8.1.x and the seemingly bizarre lack of performance is gone. I would suspect that a quad-processor box would have the same issue. I would definitely recommend giving 8.1 a try. Ken ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Extremely irregular query performance
On Thu, Jan 12, 2006 at 09:48:41AM +, Simon Riggs wrote:
> On Wed, 2006-01-11 at 22:23 -0500, Tom Lane wrote:
>> Jean-Philippe Côté [EMAIL PROTECTED] writes:
>>> Thanks a lot for this info, I was indeed exceeding the genetic
>>> optimizer's threshold. Now that it is turned off, I get a very stable
>>> response time of 435ms (more or less 5ms) for the same query. It is
>>> about three times slower than the best I got with the genetic optimizer
>>> on, but the overall average is much lower.
>>
>> Hmm. It would be interesting to use EXPLAIN ANALYZE to confirm that the
>> plan found this way is the same as the best plan found by GEQO, and the
>> extra couple hundred msec is the price you pay for the exhaustive plan
>> search. If GEQO is managing to find a plan better than the regular
>> planner then we need to look into why ...
>
> It seems worth noting in the EXPLAIN whether GEQO has been used to find
> the plan, possibly along with other factors influencing the plan such as
> enable_* settings.

Is it the plan that is different in the fastest case with GEQO or is it the
time needed to plan that is causing the GEQO to beat the exhaustive search?

Ken

---(end of broadcast)---
TIP 4: Have you searched our list archives?
       http://archives.postgresql.org
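The comparison Tom asks for can be made per-session with standard commands (the SELECT below is only a placeholder for the actual many-way join):

    SHOW geqo_threshold;          -- joins at or above this count use the genetic optimizer
    SET geqo = off;               -- force the exhaustive (standard) planner
    EXPLAIN ANALYZE SELECT ...;
    SET geqo = on;                -- genetic optimizer
    EXPLAIN ANALYZE SELECT ...;   -- compare both the plan shape and the total runtime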
Re: [PERFORM] Extremely irregular query performance
On Thu, Jan 12, 2006 at 03:23:14PM -0500, Jean-Philippe Cote wrote:
> Can I actually know whether a given plan is executed with GEQO on? In
> other words, if I launch 'explain query', I'll get a given plan, but if I
> re-launch the query (without the 'explain' keyword), could I get a
> different plan given that GEQO induces some randomness?
>
>> Is it the plan that is different in the fastest case with GEQO or is it
>> the time needed to plan that is causing the GEQO to beat the exhaustive
>> search?

GEQO will be used if the number of joins is over the GEQO limit in the
configuration file. The GEQO process is an iterative random process to find a
query plan. The EXPLAIN results are the plan for that query, but not
necessarily for subsequent runs. GEQO's advantage is a much faster plan time
than the exhaustive search method normally used. If the resulting plan time
is low enough, then for short queries the GEQO run can finish more quickly
overall than the exhaustive search would. Of course, if you PREPARE the query
the plan time drops out.

Ken

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?
       http://www.postgresql.org/docs/faq
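A short illustration of Ken's closing point (statement name and parameter are hypothetical): with a prepared statement, whichever planner ran -- GEQO or exhaustive -- only runs once.

    PREPARE big_report(int) AS
        SELECT ...;               -- the expensive many-way join is planned here

    EXECUTE big_report(2005);     -- later executions reuse the stored plan,
    EXECUTE big_report(2006);     -- so planning time drops out of each run

    DEALLOCATE big_report;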
Re: [PERFORM] large table vs multiple smal tables
Nicolas,

These sizes would not be considered large. I would leave them as single
tables.

Ken

On Wed, Jul 13, 2005 at 12:08:54PM +0200, Nicolas Beaume wrote:
> Hello
>
> I have a large database with 4 large tables (each containing at least
> 200,000 rows, perhaps even 1 or 2 million) and I ask myself if it's better
> to split them into small tables (e.g. tables of 2000 rows) to speed up the
> access and the update of those tables (considering that I will have few
> updates but a lot of reading). Do you think it would be efficient?
>
> Nicolas, wondering if he hadn't been too greedy
>
> --
> "Be what you would like to seem to be" -- Lewis Carroll

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?
       http://www.postgresql.org/docs/faq
Re: [PERFORM] two queries and dual cpu (perplexed)
On Thu, Apr 21, 2005 at 08:24:15AM -0400, Jeff wrote: On Apr 21, 2005, at 7:49 AM, Shoaib Burq (VPAC) wrote: Now I have not touch the $PGDATA/postgresql.conf (As I know very little about memory tuning) Have run VACCUM ANALYZE. You should really, really bump up shared_buffers and given you have 8GB of ram this query would likely benefit from more work_mem. and the time taken is *twice* that for the original. The modification was minor. The queries do make use of both CPUs: Is this an IO intensive query? If running both in parellel results in 2x the run time and you have sufficient cpus it would (to me) indicate you don't have enough IO bandwidth to satisfy the query. I would add to Jeff's comments, that the default configuration parameters are fairly-to-very conservative which tends to produce plans with more I/O. Bumping your shared_buffers, work_mem, and effective_cache_size should allow the planner to favor plans that utilize more memory but require less I/O. Also, with small amounts of work_mem, hash joins cannot be used and the planner will resort to nested loops. Ken ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
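Ken's last point about work_mem and join methods is easy to check in a single session; the table names below are hypothetical and the values are only examples:

    SET work_mem = 1024;          -- kB; with this little memory the planner tends to
                                  -- avoid hash joins and fall back to nested loops
    EXPLAIN SELECT count(*) FROM big_a JOIN big_b USING (id);

    SET work_mem = 262144;        -- 256MB, for this session only
    EXPLAIN SELECT count(*) FROM big_a JOIN big_b USING (id);
                                  -- with room for the hash table the plan usually flips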
Re: [PERFORM] What about utility to calculate planner cost constants?
On Tue, Mar 22, 2005 at 08:09:40AM -0500, Christopher Browne wrote: Martha Stewart called it a Good Thing when [EMAIL PROTECTED] (Greg Stark) wrote: I don't think it would be very hard at all actually. It's just a linear algebra problem with a bunch of independent variables and a system of equations. Solving for values for all of them is a straightforward problem. Of course in reality these variables aren't actually independent because the costing model isn't perfect. But that wouldn't be a problem, it would just reduce the accuracy of the results. Are you certain it's a linear system? I'm not. If it was a matter of minimizing a linear expression subject to some set of linear equations, then we could model this as a Linear Program for which there are some perfectly good solvers available. (Few with BSD-style licenses, but we could probably get some insight out of running for a while with something that's there...) I think there's good reason to consider it to be distinctly NON-linear, which makes it way more challenging to solve the problem. Non-linear optimization works very well in many cases. Issues such as local minima can be addressed. In a sense, the planner output can be treated as a blackbox function and the goodness of the solution is how well it approximates the actual query times. In this case, it will be imperative to constrain some of the values to prevent crazy configurations. Ken ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
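In the black-box framing Ken describes, the quantity being minimized would look something like this (the notation is ours, not from the thread):

    error(c) = sum over sample queries i of ( k * est_cost_i(c) - runtime_i )^2

    c = (random_page_cost, cpu_tuple_cost, cpu_index_tuple_cost, cpu_operator_cost, ...)

where k converts planner cost units into milliseconds and est_cost_i(c) is the cost of whichever plan the planner picks for query i under the settings c. For a fixed plan that cost is linear in c, but the planner takes the minimum over many candidate plans, so est_cost_i(c) is only piecewise-linear and the overall fit is neither linear nor convex -- which is where the local minima Ken mentions come from, and why the variables need to be constrained to sensible ranges.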
Re: [PERFORM] Wierd context-switching issue on Xeon
On Wed, Apr 21, 2004 at 02:51:31PM -0400, Tom Lane wrote:
> The context swap storm is happening because of contention at the next
> level up (LWLocks rather than spinlocks). It could be an independent
> issue that just happens to be triggered by the same sort of access
> pattern. I put forward a hypothesis that the cache miss storm caused by
> the test-and-set ops induces the context swap storm by making the code
> more likely to be executing in certain places at certain times ... but
> it's only a hypothesis.

If the context swap storm derives from LWLock contention, maybe using a
random order to assign buffer locks in buf_init.c would prevent simple
adjacency of buffer allocation from causing the storm. Just offsetting the
assignment by the cache line size should work.

I notice that when initializing the buffers in shared memory, both the
buf->meta_data_lock and the buf->cntx_lock are immediately adjacent in
memory. I am not familiar enough with the flow through postgres to see if
there could be fighting for those two locks. If so, offsetting those by the
cache line size would also stop the context swap storm.

--Ken

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings
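A sketch of what Ken's suggestion amounts to in C (illustrative only: slock_t is a stand-in for the platform spinlock type and this is not PostgreSQL's real buffer header layout): pad the per-buffer locks so that meta_data_lock and cntx_lock -- and the locks of neighbouring buffers -- never share a cache line, removing the false sharing that turns spinlock traffic into a cache-miss storm.

    #include <stdio.h>

    typedef volatile int slock_t;    /* stand-in for the real spinlock type */
    #define CACHE_LINE_SIZE 64       /* common line size; adjust per platform */

    typedef struct
    {
        slock_t meta_data_lock;
        char    pad1[CACHE_LINE_SIZE - sizeof(slock_t)];
        slock_t cntx_lock;
        char    pad2[CACHE_LINE_SIZE - sizeof(slock_t)];
        /* ... remaining per-buffer fields would follow ... */
    } PaddedBufferHeader;

    int main(void)
    {
        /* Provided the shared array itself starts on a cache-line boundary,
         * each lock in every element now sits on its own cache line. */
        printf("padded header size: %d bytes\n", (int) sizeof(PaddedBufferHeader));
        return 0;
    }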