Re: [PERFORM] Query Performance SQL Server vs. Postgresql

2010-11-23 Thread Humair Mohammed

I did some further analysis and here are the results:
work_mem;response_time
1MB;62 seconds
2MB;2 seconds
4MB;700 milliseconds
8MB;550 milliseconds
In all cases shared_buffers was set to the default value of 32MB. As you can
see, the jump from 1 MB to 2 MB of work_mem does wonders. I probably don't need
this to be any higher than 8 or 16 MB. Thanks to all for the help!
Humair
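
For reference, here is a minimal psql sketch of the kind of per-session
experiment described above; the table and column names are placeholders, not
the actual tables from this thread:

    -- time the same join at several per-session work_mem settings
    \timing on

    SET work_mem = '1MB';
    SELECT count(*) FROM t1 JOIN t2 USING (id);

    SET work_mem = '8MB';
    SELECT count(*) FROM t1 JOIN t2 USING (id);

    -- fall back to the value from postgresql.conf
    RESET work_mem;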
 Date: Mon, 22 Nov 2010 12:00:15 +0100
 Subject: Re: [PERFORM] Query Performance SQL Server vs. Postgresql
 From: t...@fuzzy.cz
 To: huma...@hotmail.com
 CC: pgsql-performance@postgresql.org
 
 
 
  Correct, the optimizer did not pick up the settings with the pg_ctl reload
  command. I did a pg_ctl restart and work_mem now shows the updated value. I
  had to bump it all the way up to 2047 MB, which is the maximum value work_mem
  accepts (anything higher results in a FATAL error), to get the response
  below; with work_mem at 1024 MB I see a 7 second response time.
 
 Hm, can you post the explain plan for the work_mem=1024MB case? I guess the
 difference is due to caching - according to the explain analyze there are
 just cache hits, no reads.
 
 Anyway the hash join uses only about 40MB of memory, so 1024MB should be
 perfectly fine and the explain plan should be exactly the same as with
 work_mem=2047MB. And the row estimates seem quite precise, so I don't
 think there's any severe overestimation.
 
 Tomas
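
A hedged sketch of how the two points above - cache hits vs. reads, and the
size of the hash - can be checked (PostgreSQL 9.0 or later; the tables are
placeholders):

    -- EXPLAIN (ANALYZE, BUFFERS) reports "shared hit=... read=..." per node,
    -- so pure cache hits are visible, and the Hash node reports
    -- "Buckets: ... Batches: ... Memory Usage: ..."; Batches: 1 means the
    -- whole hash table fit within work_mem.
    SET work_mem = '1024MB';
    EXPLAIN (ANALYZE, BUFFERS)
    SELECT count(*) FROM t1 JOIN t2 USING (id);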
 
  

Re: [PERFORM] Query Performance SQL Server vs. Postgresql

2010-11-23 Thread Merlin Moncure
On Mon, Nov 22, 2010 at 7:12 PM, Humair Mohammed huma...@hotmail.com wrote:
 I did some further analysis and here are the results:
 work_mem;response_time
 1MB;62 seconds
 2MB;2 seconds
 4MB;700 milliseconds
 8MB;550 milliseconds
 In all cases shared_buffers was set to the default value of 32MB. As you
 can see, the jump from 1 MB to 2 MB of work_mem does wonders. I probably
 don't need this to be any higher than 8 or 16 MB. Thanks to all for the help!
 Humair

work_mem directly affects how the query is planned, because certain
types of plans (hash joins and large sorts) require memory. Raising or
lowering shared_buffers, OTOH, is very subtle and is not something you
tune to improve the execution of a single query...

merlin
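
To illustrate the difference in scope, a small sketch; the second statement is
expected to fail, because shared_buffers can only be set at server start:

    -- takes effect immediately in this session and influences planning
    SET work_mem = '8MB';

    -- rejected: shared_buffers must be changed in postgresql.conf and
    -- requires a server restart to take effect
    SET shared_buffers = '256MB';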



Re: [PERFORM] Performance under contention

2010-11-23 Thread Craig Ringer

On 11/22/2010 11:38 PM, Ivan Voras wrote:

On 11/22/10 16:26, Kevin Grittner wrote:

Ivan Voras ivo...@freebsd.org wrote:

On 11/22/10 02:47, Kevin Grittner wrote:

Ivan Voras wrote:


After 16 clients (which is still good since there are only 12
real cores in the system), the performance drops sharply


Yet another data point to confirm the importance of connection
pooling. :-)


I agree, connection pooling will get rid of the symptom, but not
the underlying problem. I'm not saying that having 1000s of
connections to the database is a particularly good design, only
that there shouldn't be a sharp decline in performance when it
does happen. Ideally, the performance should remain the same as it
was at its peak.


Well, I suggested that we add an admission control[1] mechanism,


It looks like a hack (and one which is already implemented by connection
pool software); the underlying problem should be addressed.


My (poor) understanding is that addressing the underlying problem would 
require a massive restructure of postgresql to separate connection and 
session state from executor and backend. Idle connections wouldn't 
require a backend to sit around unused but participating in all-backends 
synchronization and signalling. Active connections over a configured 
maximum concurrency limit would queue for access to a backend rather 
than fighting it out for resources at the OS level.


The trouble is that this would be an *enormous* rewrite of the codebase, 
and would still only solve part of the problem. See the prior discussion 
on in-server connection pooling and admission control.


Personally I think the current approach is clearly difficult for many 
admins to understand and it's unfortunate that it requires external 
software to be effective. OTOH, I'm not sure what the answer is.


--
Craig Ringer
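
For completeness, a minimal sketch of the external-pooler workaround discussed
above, using PgBouncer; the database name, paths, and limits are placeholder
values, and transaction pooling is only one of several modes:

    ; pgbouncer.ini (illustrative values only)
    [databases]
    appdb = host=127.0.0.1 port=5432 dbname=appdb

    [pgbouncer]
    listen_addr = 127.0.0.1
    listen_port = 6432
    auth_type = md5
    auth_file = /etc/pgbouncer/userlist.txt
    ; release the server connection back to the pool at end of transaction
    pool_mode = transaction
    ; accept many client connections...
    max_client_conn = 1000
    ; ...but keep the number of real PostgreSQL backends per db/user pair small
    default_pool_size = 20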




Re: [PERFORM] Performance under contention

2010-11-23 Thread Ivan Voras
On 24 November 2010 01:11, Craig Ringer cr...@postnewspapers.com.au wrote:
 On 11/22/2010 11:38 PM, Ivan Voras wrote:

 It looks like a hack (and one which is already implemented by connection
 pool software); the underlying problem should be addressed.

 My (poor) understanding is that addressing the underlying problem would
 require a massive restructure of postgresql to separate connection and
 session state from executor and backend. Idle connections wouldn't
 require a backend to sit around unused but participating in all-backends
 synchronization and signalling. Active connections over a configured maximum
 concurrency limit would queue for access to a backend rather than fighting
 it out for resources at the OS level.

 The trouble is that this would be an *enormous* rewrite of the codebase, and
 would still only solve part of the problem. See the prior discussion on
 in-server connection pooling and admission control.

I'm (also) not a PostgreSQL developer so I'm hoping that someone who
is will join the thread, but speaking generally, there is no reason
why this couldn't be a simpler problem which just requires
finer-grained locking or smarter semaphore usage.

I'm not talking about forcing performance out of a situation where there
are no more CPU cycles to take, but about degrading gracefully in
those circumstances and not taking an 80%+ drop because of spinning
around in semaphore syscalls.



Re: [PERFORM] postmaster consuming /lots/ of memory with hash aggregate. why?

2010-11-23 Thread Robert Haas
On Fri, Nov 12, 2010 at 11:12 AM, Pavel Stehule pavel.steh...@gmail.com wrote:
 if I remember well, you can set the number of groups with ALTER TABLE ...
 ALTER COLUMN ... SET n_distinct = ..

 maybe you can use it.

I'm not sure where the number 40,000 is coming from either, but I
think Pavel's suggestion is a good one. If you're grouping on a
column with N distinct values, then it stands to reason there will be
N groups, and the planner is known to underestimate n_distinct on large
tables, even with very high statistics targets, which is why 9.0
allows a manual override.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
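
A hedged sketch of the override being discussed (PostgreSQL 9.0 or later; the
table and column names are placeholders):

    -- tell the planner how many distinct values to assume for the grouping
    -- column; the new value is used from the next ANALYZE onwards
    ALTER TABLE orders ALTER COLUMN customer_id SET (n_distinct = 40000);
    ANALYZE orders;

    -- a negative value is a fraction of the row count, e.g. -0.05 means
    -- roughly one distinct value per 20 rows:
    -- ALTER TABLE orders ALTER COLUMN customer_id SET (n_distinct = -0.05);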
