Re: [PERFORM] Set of related slow queries

2011-06-08 Thread John Williams
Hi All,

Let me clarify this a bit.

The memory values are ridiculous; you're completely correct.  I've
since fixed that (it had no effect, sadly).

I've adjusted the shared buffers to about 400MB, per the tuning
guide's advice to set that to around 1/4 of your system memory (the
AWS instance in question has 1.7GB).  I didn't have the shared buffers
set correctly to start because, truthfully, I had no idea how to
increase shmmax and I had to look that up.

The work_mem is very, very high for the system it's running on,
admittedly.  I'm OK with leaving that, though, because currently I'm
the only one on the machine at all (this isn't a production setup,
it's a testing setup).  Realistically it's only that high because
someone suggested trying a much higher value (I had already personally
set it to 50MB, as that was about 30% larger than the largest sort I
found) to see if that improved the situation (it didn't).

Several of the implications of my current set of data make things look
a little wrong, so let me clarify the issue a bit.  The table is
composed of data coming from a game's combat log.  Each log represents
about 400k entries.  Since I only really care to look at data from the
perspective of each log, the log_id is in fact going to be the most
selective portion of the query.  Right now the table only has two logs
in it, making this hard to see.  But it does mean the situation
shouldn't get worse over time: I will basically never be looking at
more than a 400-500k record portion of my entries table at a time.

This stuff gets really painful because I can't very well predict the
queries, so I can't precalculate, and the data isn't like a system
log; I could be accepting uploads of hundreds of such logs per day.
The actual queries that are run are a function of what the user wants
to see.  There are roughly 5 or so different data views, each of which
takes 15-25 separate queries to calculate all the various graphs and
aggregates.  Frequently I won't be looking at the "overall" entire log
(composed of 400k entries); instead I'll be looking at smaller slices
of the data, adding: WHERE seconds_since_start >= 1500 AND
seconds_since_start <= 4000 or some such, with very arbitrary min and max.

Now I should say I've seen almost this exact same work done before for
a different game.  So I can't help but feel I must be missing
something really important, either in how I'm setting up my data or in
how I'm processing it.

Thanks,
John

---

John Williams
42nd Design
Email: jwilli...@42nddesign.com
Skype: druidjaidan
Phone: (520) 440-7239




Re: [PERFORM] Set of related slow queries

2011-06-08 Thread Shaun Thomas

On 06/08/2011 06:30 AM, t...@fuzzy.cz wrote:


  shared_buffers   | 16MB
  work_mem | 250MB


This seems a bit suspicious. Are you sure you want to keep the
shared_buffers so small and work_mem so large at the same time? There
probably are workloads where this is the right thing to do, but I doubt
this is the case. Why have you set it like this?


I must concur in this case.  I can't imagine any scenario where this 
makes sense. work_mem is allocated on a per-sort basis, not just per 
session or transaction, so a large query could allocate several of these, 
run your system out of memory, and cause the OOM killer to start 
causing trouble.



I don't have much experience with running Pg on AWS, but I'd try to
increase the shared buffers to say 512MB and decrease the work_mem to
16MB (or something like that).


Easily good minimums. But it looks like your AWS only has 1GB of RAM 
(based on your effective_cache_size), so you may only want to increase 
it to 256MB. That said, reduce your work_mem to 8MB to start, and 
increase it in 4MB increments if it's still too low.
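Concretely, those suggestions might look something like the following
(the values are illustrative starting points, not prescriptions; on
Linux, kernel.shmmax usually also has to be raised before PostgreSQL
8.4 will start with a larger shared_buffers):

```
# postgresql.conf -- illustrative starting points for a ~1.7GB instance
shared_buffers = 256MB   # roughly 1/4 of usable RAM on this box
work_mem = 8MB           # per-sort allocation; raise in small increments

# /etc/sysctl.conf -- allow ~512MB of SysV shared memory
kernel.shmmax = 536870912
```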


With a setting of 16MB, it has to load data in and out of memory 
constantly. Even if the host OS has cached every single block you'll 
ever use, that's only the raw table contents. Processing hundreds of 
thousands of rows still takes time; you've just saved yourself the 
effort of fetching them from disk. shared_buffers is still necessary 
to do actual work.


Now... I have some issues with your queries, which are likely the fault 
of the Django ORM, but still consider your analyze:


> http://explain.depesz.com/s/vEx

Your bitmap index scan on logparser is hilarious. The estimates are 
fine. 237k rows in 47ms when it expected 217k. If your table really does 
have 815k rows in it, that's not very selective at all. Then it adds a 
heap scan for the remaining where conditions, and you end up with 100k 
rows it then has to sort. That's never going to be fast. 600ms actually 
isn't terrible for this many rows, and it also explains your high CPU.


Then your next one:

> http://explain.depesz.com/s/Rhb

700ms, mostly because of the HashAggregate caused by grouping by 
round(seconds_since_start / 42). You're aggregating by a calculation 
on 100k rows. Again, this will never be "fast", and 700ms is not terrible 
considering all the extra work the engine's doing. Again, your index 
scan returning everything and the kitchen sink is the root cause. Which 
also is evidenced here:


> http://explain.depesz.com/s/JUo

And here:

http://explain.depesz.com/s/VZA

Everything is being caused because it's always using the 
logparser_entry_event_type_like index to fetch the initial 200k rows. The 
only way to make this faster is to restrict the rows coming back. For 
instance, since you know these values are coming in every day, why 
search through all of history every time?


Why not get your timestamp column involved? Maybe you only need to look 
at Attack, DoT Tick, and Critical Attack event types for the last day, 
or week, or even month. That alone should drastically reduce your row 
counts and give the engine a much smaller data set to aggregate and sort.
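To sketch what that restriction could look like (the index name and the
seven-day window are purely illustrative; the column names are taken from
the schema shown elsewhere in this thread):

```sql
-- Illustrative composite index so the planner can narrow by log and time
-- instead of fetching every 'Attack' row in history:
CREATE INDEX logparser_entry_log_time_idx
    ON logparser_entry (log_id, "timestamp");

-- Each query then carries a time window, e.g. the last week:
SELECT count(*)
FROM   logparser_entry
WHERE  log_id = 2
  AND  "timestamp" >= now() - interval '7 days'
  AND  event_type IN (E'Attack', E'DoT Tick', E'Critical Attack');
```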


The thing is, the way your queries are currently written, as you get 
more data, this is just going to get worse and worse. Grabbing a quarter 
of a table that just gets bigger every day and then getting aggregates 
(group by, etc) is going to get slower every day unless you can restrict 
the result set with more where clauses. If you need reports on a lot of 
this data on a regular basis, consider running a nightly or hourly batch 
to insert them into a reporting table you can check later.
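A minimal sketch of such a reporting table, pre-aggregating the
per-interval sums the dashboards keep recomputing (the table name and
refresh strategy are hypothetical):

```sql
-- Build the reporting table once from the existing data:
CREATE TABLE logparser_report AS
SELECT log_id,
       event_type,
       ROUND(seconds_since_start / 42) AS "interval",
       SUM(effective_value)            AS effective_value__sum
FROM   logparser_entry
GROUP  BY 1, 2, 3;

-- A nightly or hourly batch job then refreshes it:
TRUNCATE logparser_report;
INSERT INTO logparser_report
SELECT log_id, event_type,
       ROUND(seconds_since_start / 42),
       SUM(effective_value)
FROM   logparser_entry
GROUP  BY 1, 2, 3;
```

Queries against a table like this touch a few thousand pre-summed rows
instead of re-aggregating hundreds of thousands on every page view.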


There's a lot you can do here.

--
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 800 | Chicago IL, 60604
312-676-8870
stho...@peak6.com

__

See  http://www.peak6.com/email_disclaimer.php
for terms and conditions related to this email

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Set of related slow queries

2011-06-08 Thread Craig Ringer

On 06/08/2011 07:08 PM, t...@fuzzy.cz wrote:

Thanks for including explain analyze output.

Is there any chance you can pop the full explains (not just excerpts) in
here:

http://explain.depesz.com/

?


I believe he already did that - there's a link below each query.


Gah, I'm blind. Thanks.

--
Craig Ringer



Re: [PERFORM] Set of related slow queries

2011-06-08 Thread tv
>  shared_buffers   | 16MB
>  work_mem | 250MB

This seems a bit suspicious. Are you sure you want to keep the
shared_buffers so small and work_mem so large at the same time? There
probably are workloads where this is the right thing to do, but I doubt
this is the case. Why have you set it like this?

I don't have much experience with running Pg on AWS, but I'd try to
increase the shared buffers to say 512MB and decrease the work_mem to 16MB
(or something like that).

Undersized shared_buffers might actually be part of the problem - to
access a row, the page needs to be loaded into shared_buffers. Even though
the I/O is very fast (or the page is already in the filesystem page
cache), there's some locking etc. that needs to be done. When the cache is
small (e.g. 16MB) then the pages need to be removed and read again
frequently. This might be one of the reasons why the CPU is 100% utilized.
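One rough way to gauge whether an undersized shared_buffers is hurting is
the buffer-cache hit ratio from the statistics collector (an indicator
only, not a diagnosis):

```sql
-- Fraction of block requests served from shared_buffers per database;
-- values well below ~0.99 on a hot working set suggest the cache is small.
SELECT datname,
       blks_hit::float / NULLIF(blks_hit + blks_read, 0) AS cache_hit_ratio
FROM   pg_stat_database;
```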

> SELECT   "logparser_entry"."id"   ,
>  "logparser_entry"."log_id"   ,
>  "logparser_entry"."encounter_id" ,
>  "logparser_entry"."entry_order"  ,
>  "logparser_entry"."timestamp",
>  "logparser_entry"."seconds_since_start"  ,
>  "logparser_entry"."event_type"   ,
>  "logparser_entry"."actor_id" ,
>  "logparser_entry"."actor_relation"   ,
>  "logparser_entry"."target_id",
>  "logparser_entry"."target_relation"  ,
>  "logparser_entry"."pet_owner_id" ,
>  "logparser_entry"."pet_owner_relation"   ,
>  "logparser_entry"."pet_target_owner_id"  ,
>  "logparser_entry"."pet_target_owner_relation",
>  "logparser_entry"."ability_id"   ,
>  "logparser_entry"."effective_value"  ,
>  "logparser_entry"."blocked"  ,
>  "logparser_entry"."absorbed" ,
>  "logparser_entry"."overkill" ,
>  "logparser_entry"."overheal" ,
>  "logparser_entry"."total_value"
> FROM "logparser_entry"
> WHERE (
>        "logparser_entry"."log_id" = 2
>    AND NOT (
>            "logparser_entry"."actor_relation" IN (E'Other', E'N/A')
>        AND "logparser_entry"."actor_relation" IS NOT NULL
>    )
>    AND "logparser_entry"."event_type" IN (E'Attack', E'DoT Tick', E'Critical Attack')
> )
> ORDER BY "logparser_entry"."entry_order" ASC
> LIMIT1
> http://explain.depesz.com/s/vEx

Well, the problem with this is that it needs to evaluate the whole result
set, sort it by "entry_order" and then get the 1st row. And there's no
index on entry_order, so it has to evaluate the whole result set and then
perform a traditional sort.

Try to create an index on the "entry_order" column - that might push it
towards index scan (to be honest I don't know if PostgreSQL knows it can
do it this way, so maybe it won't work).
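Sketched out, that suggestion might look like the following (the index
name is hypothetical). A composite index on (log_id, entry_order) could
let the planner walk one log's entries already in sorted order and stop
after the first row for the LIMIT 1, though whether 8.4 actually picks
that plan is not guaranteed:

```sql
CREATE INDEX logparser_entry_log_order_idx
    ON logparser_entry (log_id, entry_order);
```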

> SELECT   (ROUND(logparser_entry.seconds_since_start / 42)) AS "interval",
>  SUM("logparser_entry"."effective_value")  AS
> "effective_value__sum"
> FROM "logparser_entry"
> WHERE (
>        "logparser_entry"."log_id" = 2
>    AND NOT (
>            "logparser_entry"."actor_relation" IN (E'Other', E'N/A')
>        AND "logparser_entry"."actor_relation" IS NOT NULL
>    )
>    AND "logparser_entry"."event_type" IN (E'Attack', E'DoT Tick', E'Critical Attack')
> )
> GROUP BY (ROUND(logparser_entry.seconds_since_start / 42)),
>  ROUND(logparser_entry.seconds_since_start  / 42)
> ORDER BY "interval" ASC
> http://explain.depesz.com/s/Rhb

Hm, this is probably the best plan possible - not sure how to make it
faster. I'd expect a better performance with larger shared_buffers.

> http://explain.depesz.com/s/JUo

Same as above. Good plan, maybe increase shared_buffers?

> http://explain.depesz.com/s/VZA

Same as above. Good plan, maybe increase shared_buffers.

regards
Tomas




Re: [PERFORM] Set of related slow queries

2011-06-08 Thread tv
> Thanks for including explain analyze output.
>
> Is there any chance you can pop the full explains (not just excerpts) in
> here:
>
> http://explain.depesz.com/
>
> ?

I believe he already did that - there's a link below each query.

Tomas




Re: [PERFORM] Set of related slow queries

2011-06-08 Thread Craig Ringer

On 8/06/2011 10:58 AM, John Williams wrote:


->  Bitmap Heap Scan on logparser_entry  (cost=4119.06..21520.55 rows=68787 width=8) (actual time=107.032..444.864 rows=16168 loops=1)
      Recheck Cond: ((event_type)::text = ANY ('{Attack,"DoT Tick","Critical Attack"}'::text[]))
      Filter: ((((target_relation)::text <> ALL ('{Other,N/A}'::text[])) OR (NOT (target_relation IS NOT NULL))) AND (log_id = 2))
      ->  Bitmap Index Scan on logparser_entry_event_type_like  (cost=0.00..4101.86 rows=217733 width=0) (actual time=46.392..46.392 rows=237151 loops=1)
            Index Cond: ((event_type)::text = ANY ('{Attack,"DoT Tick","Critical Attack"}'::text[]))
->  Hash  (cost=196.49..196.49 rows=9749 width=23) (actual time=19.606..19.606 rows=9749 loops=1)


Thanks for including explain analyze output.

Is there any chance you can pop the full explains (not just excerpts) in 
here:


http://explain.depesz.com/

?

Big query plans tend to get mangled into unreadable garbage by mail 
clients, unfortunately.


--
Craig Ringer

Tech-related writing at http://soapyfrogs.blogspot.com/



[PERFORM] Set of related slow queries

2011-06-08 Thread John Williams
Hi All,
I'm having issues with a set of fairly related queries in my
application.  EXPLAIN ANALYZE is showing them all to be getting stuck
performing roughly the same operation:
   ->  Bitmap Heap Scan on logparser_entry  (cost=4119.06..21520.55 rows=68787 width=8) (actual time=107.032..444.864 rows=16168 loops=1)
         Recheck Cond: ((event_type)::text = ANY ('{Attack,"DoT Tick","Critical Attack"}'::text[]))
         Filter: ((((target_relation)::text <> ALL ('{Other,N/A}'::text[])) OR (NOT (target_relation IS NOT NULL))) AND (log_id = 2))
         ->  Bitmap Index Scan on logparser_entry_event_type_like  (cost=0.00..4101.86 rows=217733 width=0) (actual time=46.392..46.392 rows=237151 loops=1)
               Index Cond: ((event_type)::text = ANY ('{Attack,"DoT Tick","Critical Attack"}'::text[]))
   ->  Hash  (cost=196.49..196.49 rows=9749 width=23) (actual time=19.606..19.606 rows=9749 loops=1)

All the queries are being generated by the Django ORM, so they are not
particularly well optimized or pretty.  I'd prefer to stay with the ORM,
as a lot of the queries are highly variable depending on the request
parameters, and so unless there are huge gains to be had by falling
back to raw SQL it will save me a lot of development time to stay with
the ORM.

The table in question (logparser_entry) currently has 815000 records
(but that represents only a very, very small amount compared to what
the production server would have to handle, as this covers only 2
log objects when I would expect easily 100 or more logs to be uploaded
per day).

Nulls should be rare in the fields.

This was being run on an AWS High CPU medium instance.  Obviously not
enough for a production system, but I would hope it would be more than
adequate for testing when I'm the only one using the app.  I opted for
High CPU because the system doesn't seem to be IO bound even on a
micro instance (nearly 0 wait time according to top) and barely
touches the RAM even when tuned to be aggressive with memory usage.
At the same time it's running at 100% CPU usage.

My server config:
Server Config
 name |
  current_setting
--+---
 version  | PostgreSQL 8.4.8 on i686-pc-linux-gnu,
compiled by GCC gcc-4.4.real (Ubuntu/Linaro 4.4.4-14ubuntu5) 4.4.5,
32-bit
 checkpoint_completion_target | 0.9
 effective_cache_size | 1044MB
 external_pid_file| /var/run/postgresql/8.4-main.pid
 fsync| on
 lc_collate   | en_US.UTF-8
 lc_ctype | en_US.UTF-8
 listen_addresses | *
 log_line_prefix  | %t
 log_min_duration_statement   | 250ms
 max_connections  | 25
 max_stack_depth  | 2MB
 port | 5432
 random_page_cost | 4
 server_encoding  | UTF8
 shared_buffers   | 16MB
 synchronous_commit   | off
 TimeZone | UTC
 unix_socket_directory| /var/run/postgresql
 work_mem | 250MB
(20 rows)

To try to make reading the queries easier I've attached a text file
with the queries and links to EXPLAIN ANALYZE outputs, as well as
copied them below.  I've tried a lot to tune these queries, but
nothing seems to work.  The queries always spend a large amount of
time in the same place.  Is there something I'm missing that could
improve these, or even a way to rework my schema to speed things up?

Thanks,
John


SELECT   "logparser_entry"."id"   ,
 "logparser_entry"."log_id"   ,
 "logparser_entry"."encounter_id" ,
 "logparser_entry"."entry_order"  ,
 "logparser_entry"."timestamp",
 "logparser_entry"."seconds_since_start"  ,
 "logparser_entry"."event_type"   ,
 "logparser_entry"."actor_id" ,
 "logparser_entry"."actor_relation"   ,
 "logparser_entry"."target_id",
 "logparser_entry"."target_relation"  ,
 "logparser_entry"."pet_owner_id" ,
 "logparser_entry"."pet_owner_relation"   ,
 "logparser_entry"."pet_target_owner_id"  ,
 "logparser_entry"."pet_target_owner_relation",
 "logparser_entry"."ability_id"   ,
 "logparser_entry"."effective_value"  ,
 "logparser_entry"."blocked"  ,
 "logparser_entry"."absorbed" ,
 "logparser_entry"."overkill" ,
 "logparser_entry"."overheal" ,
 "logparser_entry"."total_value"
FROM "logparser_entry"
WHERE(
  "logparser_entry"."log_id" = 2
 AND