On Thursday 07 August 2003 17:30, Yaroslav Mazurak wrote:
> Hi All!
>
>
>       First, thanks for answers!
>
> Richard Huxton wrote:
> > On Thursday 07 August 2003 09:24, Yaroslav Mazurak wrote:
> >>>IIRC there is a limit on filesystem cache on freeBSD. 300MB by default.
> >>>If that is the case, you might have to raise it to make
> >>>effective_cache_size really effective..
> >>
> >>    "Try various sysctls" tells me nothing. I want to use *all available
> >>RAM* (minus, of course, what the OS needs) for PostgreSQL.
> >
> > PG will be using the OS' disk caching.
>
>       I think all applications use OS disk caching. ;)
>       Or do you mean that PostgreSQL is tuned to use an OS-specific cache
> implementation?
>       Do you know a method for examining the real size of the OS filesystem
> cache? If I understood correctly, PostgreSQL dynamically uses all available
> RAM minus shared_buffers minus k * sort_mem minus effective_cache_size?
>       I want to configure PostgreSQL to use the _maximum_ available RAM.

PG's memory use can be split into four areas (note - I'm not a developer so 
this could be wrong).
1. Shared memory - vital so that different connections can communicate with 
each other. Shouldn't be too large, otherwise PG spends too long managing its 
shared memory rather than working on your queries.
2. Sort memory - If a query has to sort results it will use up to the amount 
you define in sort_mem, then spill to disk if it needs any more. This limit 
applies to each individual sort, so a complex query can use several multiples 
of sort_mem.
3. Results memory - If you're returning 8000 rows then PG will assemble these 
and send them to the client which also needs space to store the 8000 rows.
4. Working memory - to actually run the queries - stack and heap space to keep 
track of its calculations etc.
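As a rough map from those areas onto postgresql.conf - the numbers below are
purely illustrative, not recommendations, and only the first two areas have
direct knobs:

```
shared_buffers = 8192           # 1. shared memory: 8192 x 8KB pages = 64MB
sort_mem = 16384                # 2. sort memory: 16MB in KB, per sort, per backend
effective_cache_size = 65536    # planner hint only: assume ~512MB of OS disk
                                # cache (in 8KB pages); allocates nothing itself
```

Note that effective_cache_size doesn't reserve any memory at all - it just
tells the planner how much filesystem caching it can expect from the OS.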

Your best bet is to start off with smallish, reasonable values and step them 
up gradually until you don't see any further improvement. What is vital is 
that the OS can cache enough disk data to keep all your commonly used tables 
and indexes in memory - if it can't, you'll see performance drop rapidly as 
PG has to keep going back to disk.

For the moment, I'd leave the settings roughly where they are while we look at 
the query, then once that's out of the way we can fine-tune the settings.

[snip suggestion to break the query down]
>       Yes, you're right. I've tested a few statements and obtained
> interesting results.
>       SELECT * FROM v_file02wide WHERE... executes in about 34 seconds.
>       SELECT showcalc(...); executes in anywhere from 0.7 seconds (without
> recursion) up to 6.3 seconds if recursion is used! :(
>       This means the approximate execution time for the full SELECT over
> about 8K rows is... about 13 hours! :-O

Hmm - not good.

>       Hence, the problem is in my function showcalc:

That's certainly the place to start, although we might be able to do something 
with v_file02wide later.

> CREATE OR REPLACE FUNCTION showcalc(VARCHAR(10), VARCHAR(2), VARCHAR(4),
> NUMERIC(16)) RETURNS NUMERIC(16)
> LANGUAGE SQL STABLE AS '
> -- Parameters: code, dd, r020, t071
>       SELECT COALESCE(
>               (SELECT sc.koef * $4
>                       FROM showing AS s NATURAL JOIN showcomp AS sc
>                       WHERE s.kod = $1
>                               AND NOT SUBSTR(acc_mask, 1, 1) = ''[''
>                               AND SUBSTR(acc_mask, 1, 4) = $3
>                               AND SUBSTR(acc_mask, 5, 1) = SUBSTR($2, 1, 1)),
>               (SELECT SUM(sc.koef * COALESCE(showcalc(SUBSTR(acc_mask, 2,
> LENGTH(acc_mask) - 2), $2, $3, $4), 0))
>                       FROM showing AS s NATURAL JOIN showcomp AS sc
>                       WHERE s.kod = $1
>                               AND SUBSTR(acc_mask, 1, 1) = ''[''),
>               0) AS showing;
> ';
>
>       BTW, a cross join ("," plus a WHERE clause) doesn't improve
> performance relative to NATURAL JOIN.
>       Additionally, with a user-defined function beginchar (SUBSTR(..., 1,
> 1)) used for indexing, showcalc executes in about 16 seconds. With the
> built-in SUBSTR the same showcalc executes in 6 seconds.

Fair enough - substr should be fairly efficient.
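If the substr() tests stay as they are, a functional index might help the
planner find matching masks directly. A sketch - I'm assuming acc_mask lives
in showcomp:

```
-- Index the leading four characters of the mask, matching the
-- SUBSTR(acc_mask, 1, 4) = $3 test inside showcalc
CREATE INDEX idx_showcomp_mask4 ON showcomp (SUBSTR(acc_mask, 1, 4));
```

Run EXPLAIN afterwards to check the planner actually picks it up.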

[snip explanation of table structures and usage]

I'm not going to claim I understood everything in your explanation, but there 
are a couple of things I can suggest. However, before you go and do any of 
that, can I ask you to post an EXPLAIN ANALYSE of two calls to your 
showcalc() function (once for a simple account, once for one with recursion)? 
You'll need to cut and paste the query as standard SQL since the explain 
won't look inside the function body.
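For the simple (non-recursive) branch that means something like the
following, with made-up values standing in for the four parameters (the
recursive branch can be unrolled the same way, though EXPLAIN still won't
follow the nested showcalc() calls):

```
-- $1 = code, $2 = dd, $3 = r020, $4 = t071; the values here are invented
EXPLAIN ANALYZE
SELECT sc.koef * 1000
FROM showing AS s NATURAL JOIN showcomp AS sc
WHERE s.kod = 'DOC1'
  AND NOT SUBSTR(acc_mask, 1, 1) = '['
  AND SUBSTR(acc_mask, 1, 4) = '0001'
  AND SUBSTR(acc_mask, 5, 1) = SUBSTR('01', 1, 1);
```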

OK - bear in mind that these suggestions are made without the benefit of the 
explain analyse:

1. You could try splitting out the various tags of your mask into different 
fields - that will instantly eliminate all the substr() calls and might make 
a difference. If you want to keep the mask for display purposes, we could 
build a trigger to keep it in sync with the separate flags.
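Something along these lines - all the new column names are mine, so pick
whatever fits your schema:

```
-- Split the encoded acc_mask into separate, directly indexable columns
ALTER TABLE showcomp ADD COLUMN mask_is_group BOOLEAN;
ALTER TABLE showcomp ADD COLUMN mask_r020 VARCHAR(4);
ALTER TABLE showcomp ADD COLUMN mask_dd CHAR(1);

UPDATE showcomp SET
    mask_is_group = (SUBSTR(acc_mask, 1, 1) = '['),
    mask_r020     = SUBSTR(acc_mask, 1, 4),
    mask_dd       = SUBSTR(acc_mask, 5, 1);

-- A plain index then replaces the substr() comparisons entirely
CREATE INDEX idx_showcomp_mask_r020 ON showcomp (mask_r020);
```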

2. Use a "calculations" table and build your results step by step. So - 
calculate all the simple accounts, then calculate the ones that contain the 
simple accounts.
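A sketch of what I mean - the temp table and its columns are made up, and
1000 stands in for the amount parameter:

```
CREATE TEMP TABLE calc_results (kod VARCHAR(10), amount NUMERIC(16));

-- Pass 1: the simple accounts (mask doesn't start with '[')
INSERT INTO calc_results
SELECT s.kod, SUM(sc.koef * 1000)
FROM showing AS s NATURAL JOIN showcomp AS sc
WHERE NOT SUBSTR(acc_mask, 1, 1) = '['
GROUP BY s.kod;

-- Pass 2: composite accounts, summing pass-1 rows instead of recursing
-- (the exact join depends on how a mask names the accounts it contains)
```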

3. You could keep a separate "account_contains" table that might look like:
  acc_id | contains
  A001   | A001
  A002   | A002
  A003   | A003
  A003   | A001
  A004   | A004
  A004   | A003
  A004   | A001

So here A001/A002 are simple accounts but A003 contains A001 too. A004 
contains A003 and A001. The table can be kept up to date automatically using 
some triggers.
This should make it simple to pick up all the accounts contained within the 
target account and might mean you can eliminate the recursion.
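The recursive branch then turns into a single indexed lookup plus a join,
e.g.:

```
-- Every account contained in the target, one indexed scan deep:
SELECT contains FROM account_contains WHERE acc_id = 'A004';
-- per the table above: A004, A003, A001 - no per-level function calls
```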

>       Now I'm thinking about changing the showcalc function and/or these
> data structures... :)

Post the EXPLAIN ANALYSE first - maybe someone smarter than me will have an 
idea.

>       Anyway, 600MB is too low for PostgreSQL to execute my query - the
> DBMS raised an error after 11.5 hours (of an estimated 13?). :(

I think the problem is the 13 hours, not the 600MB. Once we've got the query 
running in a reasonable length of time (seconds) then the memory requirements 
will go down, I'm sure.

-- 
  Richard Huxton
  Archonet Ltd
