Re: [PERFORM] bad performance

2016-12-25 Thread Jeremy Harris
On 17/12/16 23:04, Tom Lane wrote:
> so that you don't need to use
> SELECT DISTINCT?  The sort/unique steps needed to do DISTINCT are
> eating a large part of the runtime,

Does a hash join result in a set of buckets that are then read out
in order?  It might, unless the sort method takes advantage of
partially-sorted input, be cheaper (by log(num-buckets)) to sort/uniq
each bucket separately (and it would parallelize, too).
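
For comparison, the sort can sometimes be avoided entirely when the
planner implements DISTINCT as a hash aggregate. A quick way to check,
with a hypothetical table standing in for the real query:

  SET enable_hashagg = on;
  EXPLAIN SELECT DISTINCT user_id FROM event;
  -- a HashAggregate node in place of Sort/Unique means no sort step ran
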
-- 
Cheers,
  Jeremy





Re: [PERFORM] bad performance

2016-12-17 Thread Tom Lane
Gabliver Faluker  writes:
> It runs for ~5 seconds .

I'm a little skeptical that a 12-way join producing 340K rows
and executing in 5 seconds should be considered "bad performance".

It looks like it'd help some if you increased work_mem enough to let
both sorts happen in-memory rather than externally.  But really, this
is going to take awhile no matter what.  Do you really need all 340K
rows of the result?  Can you improve your data representation so that
you don't need to join quite so many tables to get the answer, and
(probably even more importantly) so that you don't need to use
SELECT DISTINCT?  The sort/unique steps needed to do DISTINCT are
eating a large part of the runtime, and they also form an optimization
fence IIRC.
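
For illustration only (table and column names here are invented, not the
original poster's schema), the two suggestions might look like:

  -- let the sorts run in memory for this session only
  SET work_mem = '256MB';

  -- and instead of de-duplicating join output with DISTINCT,
  -- filter with EXISTS so duplicates are never produced at all:
  SELECT o.id, o.total
  FROM orders o
  WHERE EXISTS (SELECT 1 FROM order_items i WHERE i.order_id = o.id);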

regards, tom lane




Re: [PERFORM] Bad performance of SELECT ... where id IN (...)

2009-10-09 Thread Xia Qingran
On Mon, Oct 5, 2009 at 9:58 AM, Omar Kilani omar.kil...@gmail.com wrote:
 Hi Xia,

 Try this patch:

 http://treehou.se/~omar/postgresql-8.4.1-array_sel_hack.patch

 It's a hack, but it works for us. I think you're probably spending
 most of your query time planning, and this patch helps speed things up
 10x over here.

Thanks!
I am trying it.

Regards,

Xia Qingran


 Regards,
 Omar

 On Sun, Sep 27, 2009 at 5:13 PM, Xia Qingran qingran@gmail.com wrote:
 On Sat, Sep 26, 2009 at 10:59 PM, Craig James
 craig_ja...@emolecules.com wrote:

 If your user_id is always in a narrow range like this, or even in any range
 that is a small fraction of the total, then add a range condition, like
 this:

 select * from event where user_id <= 500 and user_id >= 0 and user_id in
 (...)

 I did this exact same thing in my application and it worked well.

 Craig


 It is a good idea. But in my application, most of the queries' user_id
 are random and difficult to range.
 Thanks anyway.








-- 
夏清然
Xia Qingran
qingran@gmail.com
Sent from Beijing, 11, China
Stephen Leacock  - I detest life-insurance agents: they always argue
that I shall some day die, which is not so. -
http://www.brainyquote.com/quotes/authors/s/stephen_leacock.html



Re: [PERFORM] Bad performance of SELECT ... where id IN (...)

2009-10-09 Thread Kenneth Marshall
On Fri, Oct 09, 2009 at 08:31:54PM +0800, Xia Qingran wrote:
 On Mon, Oct 5, 2009 at 9:58 AM, Omar Kilani omar.kil...@gmail.com wrote:
  Hi Xia,
 
  Try this patch:
 
  http://treehou.se/~omar/postgresql-8.4.1-array_sel_hack.patch
 
  It's a hack, but it works for us. I think you're probably spending
  most of your query time planning, and this patch helps speed things up
  10x over here.
 
 Thanks!
 I am trying it.
 
 Regards,
 
 Xia Qingran
 

We have a similar situation when using DSPAM with a PostgreSQL
backend. In that case we used a function like the following to
speed up the lookups. I do not know if it would be useful in
your situation, but I thought I would post it for the group:

The original query was of the form:

SELECT uid, token, spam_hits, innocent_hits FROM dspam_token_data
WHERE uid = 'xxx' AND token IN (...);

The faster version of the query in the current code is:

SELECT * FROM lookup_tokens(%d, '{...}');

where lookup_tokens is defined as follows:

create function lookup_tokens(integer,bigint[])
  returns setof dspam_token_data
  language plpgsql stable
  as '
declare
  v_rec record;
begin
  for v_rec in select * from dspam_token_data
where uid=$1
  and token in (select $2[i]
  from generate_series(array_lower($2,1),
   array_upper($2,1)) s(i))
  loop
return next v_rec;
  end loop;
  return;
end;';

Anyway, you may want to try a similar approach instead of the
posted code change.
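
(On more recent releases the generate_series() trick inside the function
can be written more directly with = ANY, which is equivalent here; just a
sketch against the same table:

  select * from dspam_token_data
   where uid = $1
     and token = any ($2);
)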

Regards,
Ken

 



Re: [PERFORM] Bad performance of SELECT ... where id IN (...)

2009-10-05 Thread Robert Haas
On Sun, Oct 4, 2009 at 9:58 PM, Omar Kilani omar.kil...@gmail.com wrote:
 Hi Xia,

 Try this patch:

 http://treehou.se/~omar/postgresql-8.4.1-array_sel_hack.patch

 It's a hack, but it works for us. I think you're probably spending
 most of your query time planning, and this patch helps speed things up
 10x over here.

Woof.  I can see that helping in some situations, but what a foot-gun!

...Robert



Re: [PERFORM] Bad performance of SELECT ... where id IN (...)

2009-10-05 Thread Omar Kilani
Robert,

On Mon, Oct 5, 2009 at 11:01 PM, Robert Haas robertmh...@gmail.com wrote:
 On Sun, Oct 4, 2009 at 9:58 PM, Omar Kilani omar.kil...@gmail.com wrote:
 Hi Xia,

 Try this patch:

 http://treehou.se/~omar/postgresql-8.4.1-array_sel_hack.patch

 It's a hack, but it works for us. I think you're probably spending
 most of your query time planning, and this patch helps speed things up
 10x over here.

 Woof.  I can see that helping in some situations, but what a foot-gun!

We've run that patch for about 4 years (originally coded for us by
Neil Conway for 8.2, I think), and have never seen any negatives from
it.

I'm not really sure what the alternatives are -- it never really makes
sense to get the selectivity for thousands of items in the IN clause.
I've never seen a different plan for the same query against a DB with
that patch vs without -- it just takes a huge amount of time longer to
run without it. :)

But yeah, definitely a hack, and should only be used if needed --
hopefully there's some sort of official solution on the horizon. :)

 ...Robert

Regards,
Omar



Re: [PERFORM] Bad performance of SELECT ... where id IN (...)

2009-10-05 Thread Grzegorz Jaśkiewicz
On Mon, Oct 5, 2009 at 1:24 PM, Omar Kilani omar.kil...@gmail.com wrote:



 I'm not really sure what the alternatives are -- it never really makes
 sense to get the selectivity for thousands of items in the IN clause.
 I've never seen a different plan for the same query against a DB with
 that patch vs without -- it just takes a huge amount of time longer to
 run without it. :)

 But yeah, definitely a hack, and should only be used if needed --
 hopefully there's some sort of official solution on the horizon. :)


Start using temporary tables, transactions, and joins.
It depends on the source of the data: if the source is another query, just
combine it into one query with a join; otherwise create a temp table, fill it
with the data, and run the query with a join.
If you do all that in a transaction, it will be very fast.
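
A minimal sketch of the temp-table variant, assuming the event table from
this thread and an id list arriving from the application:

  BEGIN;
  CREATE TEMP TABLE wanted_ids (user_id int8) ON COMMIT DROP;
  COPY wanted_ids FROM STDIN;   -- bulk-load one id per line
  ANALYZE wanted_ids;           -- give the planner row counts
  SELECT e.* FROM event e JOIN wanted_ids w USING (user_id);
  COMMIT;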

-- 
GJ


Re: [PERFORM] Bad performance of SELECT ... where id IN (...)

2009-10-04 Thread Omar Kilani
Hi Xia,

Try this patch:

http://treehou.se/~omar/postgresql-8.4.1-array_sel_hack.patch

It's a hack, but it works for us. I think you're probably spending
most of your query time planning, and this patch helps speed things up
10x over here.

Regards,
Omar

On Sun, Sep 27, 2009 at 5:13 PM, Xia Qingran qingran@gmail.com wrote:
 On Sat, Sep 26, 2009 at 10:59 PM, Craig James
 craig_ja...@emolecules.com wrote:

 If your user_id is always in a narrow range like this, or even in any range
 that is a small fraction of the total, then add a range condition, like
 this:

 select * from event where user_id <= 500 and user_id >= 0 and user_id in
 (...)

 I did this exact same thing in my application and it worked well.

 Craig


 It is a good idea. But in my application, most of the queries' user_id
 are random and difficult to range.
 Thanks anyway.







Re: [PERFORM] Bad performance of SELECT ... where id IN (...)

2009-09-30 Thread Ivan Voras

Xia Qingran wrote:

On Sun, Sep 27, 2009 at 1:03 AM, Tom Lane t...@sss.pgh.pa.us wrote:

Xia Qingran qingran@gmail.com writes:

I have a big performance problem in my SQL select query:
select * from event where user_id in
(500,499,498, ... ,1,0);
The above SELECT always spends 1200ms.

Your EXPLAIN ANALYZE shows that the actual runtime is only about 240ms.
So either the planning time is about 1000ms, or transmitting and
displaying the 134K rows produced by the query takes that long, or some
combination of the two.  I wouldn't be too surprised if it's the data
display that's slow; but if it's the planning time that you're unhappy
about, updating to a more recent PG release might possibly help.  What
version is this anyway?

   regards, tom lane


Oh, It is a problem.


I don't see where the Total runtime information is in your first message.

Also, did you run VACUUM FULL ANALYZE lately?


Forgot to talk about my platform. I am running PostgreSQL 8.4.0 on
FreeBSD 7.2-amd64 box, which has dual Xeon 5410 CPUs, 8GB memory and 2
SATA disks.

And my postgresql.conf is listed as follow:
---

listen_addresses = '*'  # what IP address(es) to listen on;
port = 5432 # (change requires restart)
max_connections = 88# (change requires restart)
superuser_reserved_connections = 3
ssl = off   # (change requires restart)
tcp_keepalives_idle = 0 # TCP_KEEPIDLE, in seconds;
tcp_keepalives_interval = 0 # TCP_KEEPINTVL, in seconds;
tcp_keepalives_count = 0# TCP_KEEPCNT;
shared_buffers = 2048MB # min 128kB or max_connections*16kB


For a start, I think you will need to make shared_buffers larger than your 
index to get decent performance - try setting it to 4096 MB and see if 
it helps.



temp_buffers = 32MB # min 800kB
max_prepared_transactions = 150 # can be 0 or more, 0 to shutdown the
prepared transactions.
work_mem = 8MB  # min 64kB


Depending on the type of your workload (how many clients are connected 
and how complex the queries are) you might also want to increase work_mem. 
Try 16 MB - 32 MB or more and see if it helps.
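
(A per-session sketch - no postgresql.conf edit or restart needed:

  SET work_mem = '32MB';
  EXPLAIN ANALYZE SELECT ... ;  -- the "Sort Method" line should switch
                                -- from "external merge" to "quicksort"
)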



fsync = off # turns forced synchronization on or off
synchronous_commit = off# immediate fsync at commit


Offtopic - you probably know what you are doing by disabling these, right?




Re: [PERFORM] Bad performance of SELECT ... where id IN (...)

2009-09-27 Thread Xia Qingran
On Sun, Sep 27, 2009 at 1:03 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Xia Qingran qingran@gmail.com writes:
 I have a big performance problem in my SQL select query:
 select * from event where user_id in
 (500,499,498, ... ,1,0);
 The above SELECT always spends 1200ms.

 Your EXPLAIN ANALYZE shows that the actual runtime is only about 240ms.
 So either the planning time is about 1000ms, or transmitting and
 displaying the 134K rows produced by the query takes that long, or some
 combination of the two.  I wouldn't be too surprised if it's the data
 display that's slow; but if it's the planning time that you're unhappy
 about, updating to a more recent PG release might possibly help.  What
 version is this anyway?

                        regards, tom lane

Oh, It is a problem.

Forgot to talk about my platform. I am running PostgreSQL 8.4.0 on
FreeBSD 7.2-amd64 box, which has dual Xeon 5410 CPUs, 8GB memory and 2
SATA disks.

And my postgresql.conf is listed as follow:
---

listen_addresses = '*'  # what IP address(es) to listen on;
port = 5432 # (change requires restart)
max_connections = 88# (change requires restart)
superuser_reserved_connections = 3
ssl = off   # (change requires restart)
tcp_keepalives_idle = 0 # TCP_KEEPIDLE, in seconds;
tcp_keepalives_interval = 0 # TCP_KEEPINTVL, in seconds;
tcp_keepalives_count = 0# TCP_KEEPCNT;
shared_buffers = 2048MB # min 128kB or max_connections*16kB
temp_buffers = 32MB # min 800kB
max_prepared_transactions = 150 # can be 0 or more, 0 to shutdown the
prepared transactions.
work_mem = 8MB  # min 64kB
maintenance_work_mem = 1024MB   # min 1MB
max_stack_depth = 8MB   # min 100kB
max_files_per_process = 16384   # min 25
vacuum_cost_delay = 100 # 0-1000 milliseconds
vacuum_cost_page_hit = 1 # 0-10000 credits
vacuum_cost_page_miss = 10 # 0-10000 credits
vacuum_cost_page_dirty = 20 # 0-10000 credits
vacuum_cost_limit = 500 # 1-10000 credits
bgwriter_delay = 500ms # 10-10000ms between rounds
bgwriter_lru_maxpages = 100 # 0-1000 max buffers written/round
bgwriter_lru_multiplier = 2.0 # 0-10.0 multiplier on buffers scanned/round
fsync = off # turns forced synchronization on or off
synchronous_commit = off # immediate fsync at commit
wal_sync_method = fsync # the default is the first option
full_page_writes = off # recover from partial page writes
wal_buffers = 2MB # min 32kB
wal_writer_delay = 200ms # 1-10000 milliseconds
commit_delay = 50 # range 0-100000, in microseconds
commit_siblings = 5 # range 1-1000
checkpoint_segments = 32# in logfile segments, min 1, 16MB each
checkpoint_timeout = 6min   # range 30s-1h
checkpoint_completion_target = 0.5  # checkpoint target duration, 0.0 - 1.0
checkpoint_warning = 30s# 0 is off
enable_bitmapscan = on
enable_hashagg = on
enable_hashjoin = on
enable_indexscan = on
enable_mergejoin = on
enable_nestloop = on
enable_seqscan = on
enable_sort = on
enable_tidscan = on
seq_page_cost = 1.8 # measured on an arbitrary scale
random_page_cost = 2# same scale as above
cpu_tuple_cost = 0.15   # same scale as above
cpu_index_tuple_cost = 0.07 # same scale as above
cpu_operator_cost = 0.03# same scale as above
effective_cache_size = 3072MB
geqo = on
geqo_threshold = 20
geqo_effort =  7# range 1-10
geqo_pool_size = 0  # selects default based on effort
geqo_generations = 0# selects default based on effort
geqo_selection_bias = 2.0   # range 1.5-2.0
default_statistics_target = 500 # range 1-1000
constraint_exclusion = partition
from_collapse_limit = 20
join_collapse_limit = 20# 1 disables collapsing of explicit
log_destination = 'syslog'
syslog_facility = 'LOCAL2'
syslog_ident = 'postgres'
client_min_messages = notice# values in order of decreasing detail:
log_min_messages = error# values in order of decreasing detail:
log_error_verbosity = terse # terse, default, or verbose messages
log_min_error_statement = panic # values in order of decreasing detail:
log_min_duration_statement = -1 # -1 is disabled, 0 logs all statements
silent_mode = on
debug_print_parse = off
debug_print_rewritten = off
debug_print_plan = off
debug_pretty_print = off
log_checkpoints = off
log_connections = off
log_disconnections = off
log_duration = on

Re: [PERFORM] Bad performance of SELECT ... where id IN (...)

2009-09-27 Thread Xia Qingran
On Sat, Sep 26, 2009 at 10:59 PM, Craig James
craig_ja...@emolecules.com wrote:

 If your user_id is always in a narrow range like this, or even in any range
 that is a small fraction of the total, then add a range condition, like
 this:

 select * from event where user_id <= 500 and user_id >= 0 and user_id in
 (...)

 I did this exact same thing in my application and it worked well.

 Craig


It is a good idea. But in my application, most of the queries' user_id
are random and difficult to range.
Thanks anyway.



-- 
夏清然
Xia Qingran
qingran@gmail.com
Sent from Beijing, 11, China
Charles de Gaulle  - The better I get to know men, the more I find
myself loving dogs. -
http://www.brainyquote.com/quotes/authors/c/charles_de_gaulle.html



Re: [PERFORM] Bad performance of SELECT ... where id IN (...)

2009-09-26 Thread Claus Guttesen
 I have a big performance problem in my SQL select query:

 
 select * from event where user_id in
 (500,499,498,497,496,495,494,493,492,491,490,489,488,487,486,485,484,483,482,481,480,479,478,477,476,475,474,473,472,471,470,469,468,467,466,465,464,463,462,461,460,459,458,457,456,455,454,453,452,451,450,449,448,447,446,445,444,443,442,441,440,439,438,437,436,435,434,433,432,431,430,429,428,427,426,425,424,423,422,421,420,419,418,417,416,415,414,413,412,411,410,409,408,407,406,405,404,403,402,401,
 400,399,398,397,396,395,394,393,392,391,390,389,388,387,386,385,384,383,382,381,380,379,378,377,376,375,374,373,372,371,370,369,368,367,366,365,364,363,362,361,360,359,358,357,356,355,354,353,352,351,350,349,348,347,346,345,344,343,342,341,340,339,338,337,336,335,334,333,332,331,330,329,328,327,326,325,324,323,322,321,320,319,318,317,316,315,314,313,312,311,310,309,308,307,306,305,304,303,302,301,
 300,299,298,297,296,295,294,293,292,291,290,289,288,287,286,285,284,283,282,281,280,279,278,277,276,275,274,273,272,271,270,269,268,267,266,265,264,263,262,261,260,259,258,257,256,255,254,253,252,251,250,249,248,247,246,245,244,243,242,241,240,239,238,237,236,235,234,233,232,231,230,229,228,227,226,225,224,223,222,221,220,219,218,217,216,215,214,213,212,211,210,209,208,207,206,205,204,203,202,201,
 200,199,198,197,196,195,194,193,192,191,190,189,188,187,186,185,184,183,182,181,180,179,178,177,176,175,174,173,172,171,170,169,168,167,166,165,164,163,162,161,160,159,158,157,156,155,154,153,152,151,150,149,148,147,146,145,144,143,142,141,140,139,138,137,136,135,134,133,132,131,130,129,128,127,126,125,124,123,122,121,120,119,118,117,116,115,114,113,112,111,110,109,108,107,106,105,104,103,102,101,
 100,99,98,97,96,95,94,93,92,91,90,89,88,87,86,85,84,83,82,81,80,79,78,77,76,75,74,73,72,71,70,69,68,67,66,65,64,63,62,61,60,59,58,57,56,55,54,53,52,51,50,49,48,47,46,45,44,43,42,41,40,39,38,37,36,35,34,33,32,31,30,29,28,27,26,25,24,23,22,21,20,19,18,17,16,15,14,13,12,11,10,9,8,7,6,5,4,3,2,1,0);
 

What happens if you change the query to

select * from event where user_id >= 0 and user_id <= 500;

? :-)

-- 
regards
Claus

When lenity and cruelty play for a kingdom,
the gentler gamester is the soonest winner.

Shakespeare



Re: [PERFORM] Bad performance of SELECT ... where id IN (...)

2009-09-26 Thread Tom Lane
Xia Qingran qingran@gmail.com writes:
 I have a big performance problem in my SQL select query:
 select * from event where user_id in
 (500,499,498, ... ,1,0);
 The above SELECT always spends 1200ms.

Your EXPLAIN ANALYZE shows that the actual runtime is only about 240ms.
So either the planning time is about 1000ms, or transmitting and
displaying the 134K rows produced by the query takes that long, or some
combination of the two.  I wouldn't be too surprised if it's the data
display that's slow; but if it's the planning time that you're unhappy
about, updating to a more recent PG release might possibly help.  What
version is this anyway?
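
One way to separate those costs (the query is the one from this thread):
EXPLAIN ANALYZE returns no data rows, so under psql's \timing the gap
between the client-side time and the reported runtime is roughly the
planning overhead:

  \timing on
  EXPLAIN ANALYZE SELECT * FROM event WHERE user_id IN (500,499, ... ,1,0);
  -- "Total runtime" = server-side execution; the rest is mostly planning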

regards, tom lane



Re: [PERFORM] Bad performance of SELECT ... where id IN (...)

2009-09-26 Thread Grzegorz Jaśkiewicz
If you reuse that set a lot, how about storing it in a table and doing the
join on the db side? If it is large, it sometimes makes sense to create a temp
table just for a single query (I use that sort of stuff for comparing with a
few M records).
But temp tables in that case have to be short-lived, as they can't reuse
space (no FSM in the temporary-table world I'm afraid; I hope it will be
fixed at some stage though).


Re: [PERFORM] Bad performance of SELECT ... where id IN (...)

2009-09-26 Thread Craig James

Xia Qingran wrote:

Hi,
I have a big performance problem in my SQL select query:


select * from event where user_id in
(500,499,498,497,496,495,494,493,492,491,490,489,488,487,486,485,484,483,482,481,480,479,478,477,476,475,474,473,472,471,470,469,468,467,466,465,464,463,462,461,460,459,458,457,456,455,454,453,452,451,450,449,448,447,446,445,444,443,442,441,440,439,438,437,436,435,434,433,432,431,430,429,428,427,426,425,424,423,422,421,420,419,418,417,416,415,414,413,412,411,410,409,408,407,406,405,404,403,402,401,
400,399,398,397,396,395,394,393,392,391,390,389,388,387,386,385,384,383,382,381,380,379,378,377,376,375,374,373,372,371,370,369,368,367,366,365,364,363,362,361,360,359,358,357,356,355,354,353,352,351,350,349,348,347,346,345,344,343,342,341,340,339,338,337,336,335,334,333,332,331,330,329,328,327,326,325,324,323,322,321,320,319,318,317,316,315,314,313,312,311,310,309,308,307,306,305,304,303,302,301,
300,299,298,297,296,295,294,293,292,291,290,289,288,287,286,285,284,283,282,281,280,279,278,277,276,275,274,273,272,271,270,269,268,267,266,265,264,263,262,261,260,259,258,257,256,255,254,253,252,251,250,249,248,247,246,245,244,243,242,241,240,239,238,237,236,235,234,233,232,231,230,229,228,227,226,225,224,223,222,221,220,219,218,217,216,215,214,213,212,211,210,209,208,207,206,205,204,203,202,201,
200,199,198,197,196,195,194,193,192,191,190,189,188,187,186,185,184,183,182,181,180,179,178,177,176,175,174,173,172,171,170,169,168,167,166,165,164,163,162,161,160,159,158,157,156,155,154,153,152,151,150,149,148,147,146,145,144,143,142,141,140,139,138,137,136,135,134,133,132,131,130,129,128,127,126,125,124,123,122,121,120,119,118,117,116,115,114,113,112,111,110,109,108,107,106,105,104,103,102,101,
100,99,98,97,96,95,94,93,92,91,90,89,88,87,86,85,84,83,82,81,80,79,78,77,76,75,74,73,72,71,70,69,68,67,66,65,64,63,62,61,60,59,58,57,56,55,54,53,52,51,50,49,48,47,46,45,44,43,42,41,40,39,38,37,36,35,34,33,32,31,30,29,28,27,26,25,24,23,22,21,20,19,18,17,16,15,14,13,12,11,10,9,8,7,6,5,4,3,2,1,0);


The above SELECT always spends 1200ms.


If your user_id is always in a narrow range like this, or even in any range 
that is a small fraction of the total, then add a range condition, like this:

select * from event where user_id <= 500 and user_id >= 0 and user_id in (...)

I did this exact same thing in my application and it worked well.

Craig



Re: [PERFORM] Bad performance of SELECT ... where id IN (...)

2009-09-26 Thread Paul Ooi


On 26-Sep-2009, at 10:16 PM, Claus Guttesen wrote:


I have a big performance problem in my SQL select query:


select * from event where user_id in
(500,499,498,497,496,495,494,493,492,491,490,489,488,487,486,485,484,483,482,481,480,479,478,477,476,475,474,473,472,471,470,469,468,467,466,465,464,463,462,461,460,459,458,457,456,455,454,453,452,451,450,449,448,447,446,445,444,443,442,441,440,439,438,437,436,435,434,433,432,431,430,429,428,427,426,425,424,423,422,421,420,419,418,417,416,415,414,413,412,411,410,409,408,407,406,405,404,403,402,401,
400,399,398,397,396,395,394,393,392,391,390,389,388,387,386,385,384,383,382,381,380,379,378,377,376,375,374,373,372,371,370,369,368,367,366,365,364,363,362,361,360,359,358,357,356,355,354,353,352,351,350,349,348,347,346,345,344,343,342,341,340,339,338,337,336,335,334,333,332,331,330,329,328,327,326,325,324,323,322,321,320,319,318,317,316,315,314,313,312,311,310,309,308,307,306,305,304,303,302,301,
300,299,298,297,296,295,294,293,292,291,290,289,288,287,286,285,284,283,282,281,280,279,278,277,276,275,274,273,272,271,270,269,268,267,266,265,264,263,262,261,260,259,258,257,256,255,254,253,252,251,250,249,248,247,246,245,244,243,242,241,240,239,238,237,236,235,234,233,232,231,230,229,228,227,226,225,224,223,222,221,220,219,218,217,216,215,214,213,212,211,210,209,208,207,206,205,204,203,202,201,
200,199,198,197,196,195,194,193,192,191,190,189,188,187,186,185,184,183,182,181,180,179,178,177,176,175,174,173,172,171,170,169,168,167,166,165,164,163,162,161,160,159,158,157,156,155,154,153,152,151,150,149,148,147,146,145,144,143,142,141,140,139,138,137,136,135,134,133,132,131,130,129,128,127,126,125,124,123,122,121,120,119,118,117,116,115,114,113,112,111,110,109,108,107,106,105,104,103,102,101,
100,99,98,97,96,95,94,93,92,91,90,89,88,87,86,85,84,83,82,81,80,79,78,77,76,75,74,73,72,71,70,69,68,67,66,65,64,63,62,61,60,59,58,57,56,55,54,53,52,51,50,49,48,47,46,45,44,43,42,41,40,39,38,37,36,35,34,33,32,31,30,29,28,27,26,25,24,23,22,21,20,19,18,17,16,15,14,13,12,11,10,9,8,7,6,5,4,3,2,1,0);



What happens if you change the query to

 select * from event where user_id >= 0 and user_id <= 500;


or select * from event where user_id <= 500; :)

Besides, your index seems quite huge (2G), and it usually takes some
time to process the result, even though it's already indexed with btree.





? :-)

--
regards
Claus

When lenity and cruelty play for a kingdom,
the gentler gamester is the soonest winner.

Shakespeare






Re: [PERFORM] Bad performance on simple query

2008-11-17 Thread Alan Hodgson
On Monday 17 November 2008, Dimi Paun [EMAIL PROTECTED] wrote:
 It takes 0.091s (!):
 perpedes_db=# EXPLAIN ANALYZE SELECT * FROM triphistory WHERE ownerId =
 10015 ORDER BY accessTS DESC LIMIT 5;
 QUERY PLAN
 -----------------------------------------------------------------------------
  Limit  (cost=0.00..7.99 rows=5 width=106) (actual time=0.024..0.042 rows=5 loops=1)
    ->  Index Scan Backward using idx_trip_history_owner_access_ts on
        triphistory  (cost=0.00..142.20 rows=89 width=106) (actual
        time=0.021..0.034 rows=5 loops=1)
          Index Cond: (ownerid = 10015)
  Total runtime: 0.091 ms

That's 0.091 milliseconds (0.000091 seconds).


-- 
Corporations will ingest natural resources and defecate garbage until all 
resources are depleted, debt can no longer be repaid and our money becomes 
worthless - Jay Hanson



Re: [PERFORM] Bad performance on simple query

2008-11-17 Thread Scott Marlowe
On Mon, Nov 17, 2008 at 8:53 AM, Dimi Paun [EMAIL PROTECTED] wrote:
 Hi folks,

 I have a simple table that keeps track of a user's access history.
 It has a a few fields, but the important ones are:
  - ownerId: the user's ID, a int8
  - accessTS: the timestamp of the record

 The table right now is small, only 1942 records.
 The user I test with (10015) has only 89 entries.

 What I want is to get the last 5 accesses of a user:
   SELECT * FROM triphistory WHERE ownerId = 10015 ORDER BY accessTS DESC 
 LIMIT 5

 If I create a composite index *and* analyze:
   create index IDX_TRIP_HISTORY_OWNER_ACCESS_TS on tripHistory (ownerId, 
 accessTS);
   ANALYZE triphistory;

 It takes 0.091s (!):
 perpedes_db=# EXPLAIN ANALYZE SELECT * FROM triphistory WHERE ownerId = 10015 
 ORDER BY accessTS DESC LIMIT 5;

 QUERY PLAN
 --
  Limit  (cost=0.00..7.99 rows=5 width=106) (actual time=0.024..0.042 rows=5 
 loops=1)
    ->  Index Scan Backward using idx_trip_history_owner_access_ts on 
 triphistory  (cost=0.00..142.20 rows=89 width=106) (actual time=0.021..0.034 
 rows=5 loops=1)
 Index Cond: (ownerid = 10015)
  Total runtime: 0.091 ms
 (4 rows)


 BTW, this is after several runs of the query, shouldn't all this stuff be in 
 memory?

Are you saying it's excessive that you need the compound query?  Cause
that's running in 91 microseconds, as pointed out by Alan.



Re: [PERFORM] Bad performance on simple query

2008-11-17 Thread Dimi Paun

On Mon, 2008-11-17 at 09:53 -0700, Scott Marlowe wrote:
 
 Are you saying it's excessive that you need the compound query?  Cause
 that's running in 91 microseconds, as pointed out by Alan.

Of course, my bad. I read that as 91ms (<blush/>).

Confusion came from the fact that pgadminIII reports the query
taking 20-40ms, so I read the 0.091 as seconds not ms.

-- 
Dimi Paun [EMAIL PROTECTED]
Lattica, Inc.




Re: [PERFORM] Bad performance on simple query

2008-11-17 Thread Scott Marlowe
On Mon, Nov 17, 2008 at 10:07 AM, Dimi Paun [EMAIL PROTECTED] wrote:

 On Mon, 2008-11-17 at 09:53 -0700, Scott Marlowe wrote:

 Are you saying it's excessive that you need the compound query?  Cause
 that's running in 91 microseconds, as pointed out by Alan.

 Of course, my bad. I read that as 91ms (blush/).

 Confusion came from the fact that pgadminIII reports the query
 taking 20-40ms, so I read the 0.091 as seconds not ms.

Ahhh.  Keep in mind that if you just run the query, pgadminIII will
tell you how long it took to run AND return all the data across the
network, so it will definitely take longer then.  But most of that's
network io wait so it's not a real issue unless you're saturating your
network.



Re: [PERFORM] Bad performance on simple query

2008-11-17 Thread Dimi Paun

On Mon, 2008-11-17 at 10:16 -0700, Scott Marlowe wrote:
 Ahhh.  Keep in mind that if you just run the query, pgadminIII will
 tell you how long it took to run AND return all the data across the
 network, so it will definitely take longer then.  But most of that's
 network io wait so it's not a real issue unless you're saturating your
 network.

But that is brutal -- there's no way it can take 20ms for a request
across an unloaded network.

Moreover, I got something like this:

           pgadminIII | pgsql
w/o index:       45ms | 0.620ms
w/ index:        20ms | 0.091ms

Now when I try to replicate, I get 45ms in both cases. This is
very misleading...

-- 
Dimi Paun [EMAIL PROTECTED]
Lattica, Inc.




Re: [PERFORM] Bad performance on simple query

2008-11-17 Thread Scott Marlowe
On Mon, Nov 17, 2008 at 10:31 AM, Dimi Paun [EMAIL PROTECTED] wrote:

 On Mon, 2008-11-17 at 10:16 -0700, Scott Marlowe wrote:
 Ahhh.  Keep in mind that if you just run the query, pgadminIII will
 tell you how long it took to run AND return all the data across the
 network, so it will definitely take longer then.  But most of that's
 network io wait so it's not a real issue unless you're saturating your
 network.

 But that is brutal -- there's no way it can take 20ms for a request
 across an unloaded network.

 Moreover, I got something like this:

pgadminIII | pgsql
 w/o index: 45ms  0.620ms
 w/ index   20ms  0.091ms

 Now when I try to replicate, I get 45ms in both cases. This is
 very misleading...

I'm guessing a fair bit of that time is pgadminIII prettifying the
output for you, etc.  I.e. it's not all transfer time.  Hard to say
without hooking some kind of profiler in pgadminIII.  Is psql running
local and pgadminIII remotely?  Or are they both remote?  If both psql
and pgadminIII are remote (i.e. same basic circumstances) then it's
got to be a difference in the client causing the extra time.  OR is
this output of explain analyze?



Re: [PERFORM] Bad performance on simple query

2008-11-17 Thread ries van Twisk


On Nov 17, 2008, at 12:40 PM, Scott Marlowe wrote:


On Mon, Nov 17, 2008 at 10:31 AM, Dimi Paun [EMAIL PROTECTED] wrote:


On Mon, 2008-11-17 at 10:16 -0700, Scott Marlowe wrote:

Ahhh.  Keep in mind that if you just run the query, pgadminIII will
tell you how long it took to run AND return all the data across the
network, so it will definitely take longer then.  But most of that's
network io wait so it's not a real issue unless you're saturating  
your

network.


But that is brutal -- there's no way it can take 20ms for a request
across an unloaded network.

Moreover, I got something like this:

  pgadminIII | pgsql
w/o index: 45ms  0.620ms
w/ index   20ms  0.091ms

Now when I try to replicate, I get 45ms in both cases. This is
very misleading...


I'm guessing a fair bit of that time is pgadminIII prettifying the
output for you, etc.  I.e. it's not all transfer time.  Hard to say
without hooking some kind of profiler in pgadminIII.  Is psql running
local and pgadminIII remotely?  Or are they both remote?  If both psql
and pgadminIII are remote (i.e. same basic circumstances) then it's
got to be a difference in the client causing the extra time.  OR is
this output of explain analyze?



Side note: I haven't seen pgAdminIII ever show a time below 20ms (the
time in the bottom right corner).

When I do a query like this: select 1; it takes, according to
pgAdminIII, around 20ms (whatever that time is).

What I normally do to find my real query time is put an EXPLAIN
ANALYZE in front of my query.
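
For example, with the query from earlier in this thread:

  EXPLAIN ANALYZE SELECT * FROM triphistory
   WHERE ownerId = 10015 ORDER BY accessTS DESC LIMIT 5;

The Total runtime it reports is measured on the server, so it excludes
client rendering and network transfer time.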


Ries









Re: [PERFORM] Bad performance on simple query

2008-11-17 Thread Dimi Paun

On Mon, 2008-11-17 at 10:40 -0700, Scott Marlowe wrote:
 I'm guessing a fair bit of that time is pgadminIII prettifying the
 output for you, etc.  I.e. it's not all transfer time.  Hard to say
 without hooking some kind of profiler in pgadminIII.  Is psql running
 local and pgadminIII remotely?  Or are they both remote?  If both psql
 and pgadminIII are remote (i.e. same basic circumstances) then it's
 got to be a difference in the client causing the extra time.  OR is
 this output of explain analyze?

With \timing on I get basically the same output (local vs remote)
in psql (0.668ms vs. 0.760ms). More like it.


WTH is pgadminIII reporting?!?

-- 
Dimi Paun [EMAIL PROTECTED]
Lattica, Inc.




Re: [PERFORM] Bad performance on simple query

2008-11-17 Thread Dave Page
On Mon, Nov 17, 2008 at 6:14 PM, Dimi Paun [EMAIL PROTECTED] wrote:

 On Mon, 2008-11-17 at 10:40 -0700, Scott Marlowe wrote:
 I'm guessing a fair bit of that time is pgadminIII prettifying the
 output for you, etc.  I.e. it's not all transfer time.  Hard to say
 without hooking some kind of profiler in pgadminIII.  Is psql running
 local and pgadminIII remotely?  Or are they both remote?  If both psql
 and pgadminIII are remote (i.e. same basic circumstances) then it's
 got to be a difference in the client causing the extra time.  OR is
 this output of explain analyze?

 With \timing on I get basically the same output (local vs remote)
 in psql (0.668ms vs. 0.760ms). More like it.


 WTH is pgadminIII reporting?!?

Exactly what it's supposed to be; however, it's using libpq's
asynchronous query interface and has to pass the query result through
the wxWidgets event handling system, both of which seem to add a few
milliseconds to the overall query time from the quick testing I've
just done. In a GUI app like pgAdmin, we need to use this kind of
architecture to allow the UI to continue processing events (such as
button clicks, redraws etc), and to allow multiple windows to work
independently without one query locking up the whole app.

Note that the rendering time that Tom mentioned the other day, which
used to confuse things, has not been an issue for a couple of years -
it was dependent on resultset size and could lead to much bigger
variations. That was fixed by having libpq act as a virtual data store
for the UI instead of transferring data from the PGresult to the data
grid's own data store.

I think the bottom line is that you cannot compare psql and pgAdmin's
timings because the architectures of the two apps are very different.
Further, pgAdmin isn't the best choice for micro-optimisation of
extremely fast queries.

-- 
Dave Page
EnterpriseDB UK:   http://www.enterprisedb.com



Re: [PERFORM] bad performance on Solaris 10

2006-04-14 Thread Jignesh K. Shah

Hi Bruce,


I saw on this alias too that people assumed that the default
wal_sync_method was fsync on Solaris.

I would select fsync or fdsync as the default on Solaris. (I prefer
fsync, as it is already highlighted as the default in postgresql.conf.)


Another thing to improve the defaults on Solaris will be to increase the 
defaults of

wal_buffers
and
checkpoint_segments

(I think in 8.1 checkpoint_segments has already been raised to a
default of 8 from the previous 3, and that may already be some help
for performance out there.)


These three changes improve out-of-box performance of PostgreSQL quite a 
bit on Solaris (SPARC as well as x64 platforms).


Then you will suddenly see a decrease in the number of people in the
PostgreSQL community complaining about Solaris 10, as well as in the
Solaris community complaining about PostgreSQL. (The benefits are mutual.)


Don't get me wrong. As Luke mentioned, it took a while to get the
potential of PostgreSQL on Solaris, and people like me started doing other
complex workarounds in Solaris like forcedirectio, etc. (Yeah, I did a
test: if you force fsync as wal_sync_method while on Solaris, then
you may not even need to do forcedirectio of your $PGDATA/pg_xlogs to
get back the lost performance.)


If we had realized that the fsync/odatasync difference was the culprit, we
could have saved a couple of months of effort.


Yes, I agree that putting OS-specific things in PostgreSQL hurts the
community, and sticking to POSIX standards helps.


Just my two cents.

Regards,
Jignesh


Bruce Momjian wrote:


Jignesh K. Shah wrote:
 


Bruce,

Hard to answer that... People like me who know and love PostgreSQL and  
Solaris finds this as an opportunity to make their favorite database 
work best on their favorite operating system.


Many times PostgreSQL has many things based on the assumption that it will
run on Linux, and it is left to Solaris to emulate that behavior. That
said, there are ways to improve performance even on UFS on Solaris; it
just requires more tweaks.


Hopefully this will lead to few Solaris friendly default values  like 
fsync/odatasync :-)
   



Yes, if someone wants to give us a clear answer on which wal_sync method
is best on all versions of Solaris, we can easily make that change.

 





Re: [PERFORM] bad performance on Solaris 10

2006-04-14 Thread Josh Berkus
Jignesh,

 Don't get me wrong. As Luke mentioned it took a while to get the
 potential of PostgreSQL on Solaris and people like me start doing other
 complex  workarounds in Solaris like forcedirectio, etc. (Yeah I did a
 test, if you force  fsync as wal_sync_method  while on  Solaris, then
 you may not even require to do forcedirectio of your $PGDATA/pg_xlogs to
 get back the lost performance)

I didn't see these later test results.   Can you link?   

Also, I presume this was on DW, and not on OLTP, yes?

-- 
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco



Re: [PERFORM] bad performance on Solaris 10

2006-04-13 Thread Bruce Momjian
Jignesh K. Shah wrote:
 
 Bruce,
 
 Hard to answer that... People like me who know and love PostgreSQL and  
 Solaris finds this as an opportunity to make their favorite database 
 work best on their favorite operating system.
 
 Many times PostgreSQL has many things based on the assumption that it will
 run on Linux, and it is left to Solaris to emulate that behavior. That
 said, there are ways to improve performance even on UFS on Solaris; it
 just requires more tweaks.
 
 Hopefully this will lead to few Solaris friendly default values  like 
 fsync/odatasync :-)

Yes, if someone wants to give us a clear answer on which wal_sync method
is best on all versions of Solaris, we can easily make that change.

-- 
  Bruce Momjian   http://candle.pha.pa.us
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [PERFORM] bad performance on Solaris 10

2006-04-13 Thread Robert Lor

Bruce Momjian wrote on 04/13/06 01:39 AM:
 
 Yes, if someone wants to give us a clear answer on which wal_sync method
 is best on all versions of Solaris, we can easily make that change.
 

We're doing tests to see how various parameters in postgresql.conf
affect performance on Solaris and will share the results shortly.

Regards,
-Robert




Re: [PERFORM] bad performance on Solaris 10

2006-04-13 Thread Merlin Moncure
On 4/12/06, Josh Berkus josh@agliodbs.com wrote:
 People,

  Lately I find people are not so receptive to VxFS, and Sun is promoting
  ZFS, and we don't have a reasonable near term option for Raw IO in
  Postgres, so we need to work to find a reasonable path for Solaris users
  IMO. The long delays in ZFS production haven't helped us there, as the
  problems with UFS are severe.

I just recently worked with Sun Solaris 10 and found it to be
reasonably performant without much tuning.  This was on a dual-SPARC
Sunblade workstation which I felt was very well engineered.  I was
able (with zero Solaris experience) to get PostgreSQL up and crunching
away at some really data-intensive tasks while running an application
compiled with their very excellent Fortran compiler.

In the enterprise world I am finding that the only Linux distributions
supported are Red Hat and SUSE, meaning if you have a problem with your
SAN running against your Gentoo box you have a serious problem.
Solaris OTOH, is generally very well supported (especially on sun
hardware) and is free.  So I give sun great credit for providing a
free if not necessarily completely open platform for developing open
source applications in an enterprise environment.

Merlin



Re: [PERFORM] bad performance on Solaris 10

2006-04-12 Thread Bruce Momjian
Luke Lonergan wrote:
 Alvaro,
 
 On 4/5/06 2:48 PM, Alvaro Herrera [EMAIL PROTECTED] wrote:
 
  This essentially means stopping all bgwriter activity, thereby deferring
  all I/O until checkpoint.  Was this considered?  With
  checkpoint_segments to 128, it wouldn't surprise me that there wasn't
  any checkpoint executed at all during the whole test ...
 
 Yes, many things about the Solaris UFS filesystem caused a great deal of
 pain over the 10 months of experiments we ran with Sun MDE.  Ultimately, the
 conclusion was that ZFS is going to make all of the pain go away.
 
 In the meantime, all you can do is tweak up UFS and avoid I/O as much as
 possible.

It is hard to imagine why people spend so much time modifying Sun
machines to run with acceptable performance when non-Sun operating systems
work fine without such hurdles.

-- 
  Bruce Momjian   http://candle.pha.pa.us
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [PERFORM] bad performance on Solaris 10

2006-04-12 Thread Luke Lonergan
Bruce,

On 4/12/06 12:56 PM, Bruce Momjian pgman@candle.pha.pa.us wrote:

 It is hard to imagine why people spend so much time modifying Sun
 machines run with acceptable performance when non-Sun operating systems
 work fine without such hurtles.

There are a lot of Solaris customers that we support and that we'd like to
support.  To many of them, Solaris has many advantages other than speed,
though they expect a reasonably comparable performance, perhaps within a
factor of 2 of other options.

Oracle has spent a great deal of time (a decade!) optimizing their software
for Solaris, and it shows.  There are also some typical strategies that
Solaris people used to use to make Solaris perform better, like using VxFS
(Veritas Filesystem), or Oracle Raw IO to make their systems perform better.

Lately I find people are not so receptive to VxFS, and Sun is promoting ZFS,
and we don't have a reasonable near term option for Raw IO in Postgres, so
we need to work to find a reasonable path for Solaris users IMO.  The long
delays in ZFS production haven't helped us there, as the problems with UFS
are severe.

We at Greenplum have worked hard over the last year to find options for
Postgres on Solaris and have the best configuration setup that we think is
possible now on UFS, and our customers benefit from that.  However, Linux on
XFS or even ext3 is definitely the performance leader.

- Luke 





Re: [PERFORM] bad performance on Solaris 10

2006-04-12 Thread Josh Berkus
People,

 Lately I find people are not so receptive to VxFS, and Sun is promoting
 ZFS, and we don't have a reasonable near term option for Raw IO in
 Postgres, so we need to work to find a reasonable path for Solaris users
 IMO.  The long delays in ZFS production haven't helped us there, as the
 problems with UFS are severe.

FWIW, I'm testing on ZFS now.  But it's not stable yet.  People are welcome 
to join the Solaris 11 beta program.

In the near term, there are fixes to be made both in PostgreSQL 
configuration and in Solaris configuration.  Also, some of the work being 
done for 8.2 ... the external sort work done by Simon and sponsored by 
GreenPlum, and the internal sort work which Jonah and others are doing ... 
will improve things on Solaris as our sort issues hit Solaris harder than 
other OSes.

Expect lots more info on performance config for Solaris from me & Robert in
the next few weeks.

-- 
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco



Re: [PERFORM] bad performance on Solaris 10

2006-04-12 Thread Jignesh K. Shah


Bruce,

Hard to answer that... People like me who know and love PostgreSQL and  
Solaris finds this as an opportunity to make their favorite database 
work best on their favorite operating system.


Many times PostgreSQL has many things based on the assumption that it will
run on Linux, and it is left to Solaris to emulate that behavior. That
said, there are ways to improve performance even on UFS on Solaris; it
just requires more tweaks.


Hopefully this will lead to few Solaris friendly default values  like 
fsync/odatasync :-)


Regards,
Jignesh


Bruce Momjian wrote:



It is hard to imagine why people spend so much time modifying Sun
machines to run with acceptable performance when non-Sun operating systems
work fine without such hurdles.
 





Re: [PERFORM] bad performance on Solaris 10

2006-04-12 Thread Tom Lane
Jignesh K. Shah [EMAIL PROTECTED] writes:
 Many times PostgreSQL has many things based on assumption that it will 
 run on  Linux and it is left to Solaris to emulate that behavior.

Au contraire --- PG tries its best to be OS-agnostic.  I've personally
resisted people trying to optimize it by putting in Linux-specific
behavior.  The above sounds to me like making excuses for a poor OS.

(And yes, I will equally much resist any requests to put in Solaris-
specific behavior...)

regards, tom lane



Re: [PERFORM] bad performance on Solaris 10

2006-04-10 Thread Chris Mair

  Chris,
  Just to make sure the x4100 config is similar to your Linux system, can 
  you verify the default setting for disk write cache and make sure they 
  are both enabled or disabled. Here's how to check in Solaris.
  As root, run format -e -> pick a disk -> cache -> write_cache -> display
  
  Not sure how to do it on Linux though!
  
  Regards,
  -Robert
 
 I don't have access to the machine for the next few days due to eh...
 let's call it firewall accident ;), but it might very well be that it
 was off on the x4100 (I know it's on the smaller Linux box).
 
 That together with the bad default sync method can definitely explain
 the strangely slow out of box performance I got.
 
 So thanks again for explaining this to me :)
 
 Bye, Chris.

Just for completeness:
I checked now using the above commands and can confirm the write cache
was disabled on the x4100 and was on on Linux. 

Bye, Chris.








Re: [PERFORM] bad performance on Solaris 10

2006-04-07 Thread Chris Mair

 Ok, so I did a few runs for each of the sync methods, keeping all the
 rest constant and got this:
 
 open_datasync  0.7
 fdatasync  4.6
 fsync  4.5
 fsync_writethrough not supported
 open_sync  0.6
 
 in arbitrary units - higher is faster.
 
 Quite impressive!
 
 
   
 
 Chris,
 Just to make sure the x4100 config is similar to your Linux system, can 
 you verify the default setting for disk write cache and make sure they 
 are both enabled or disabled. Here's how to check in Solaris.
 As root, run format -e -> pick a disk -> cache -> write_cache -> display
 
 Not sure how to do it on Linux though!
 
 Regards,
 -Robert

I don't have access to the machine for the next few days due to eh...
let's call it firewall accident ;), but it might very well be that it
was off on the x4100 (I know it's on the smaller Linux box).

That together with the bad default sync method can definitely explain
the strangely slow out of box performance I got.

So thanks again for explaining this to me :)

Bye, Chris.






Re: [PERFORM] bad performance on Solaris 10

2006-04-06 Thread Chris Mair

  Yeah - looks good! (is the default open_datasync still?). Might be worth
  trying out the fdatasync method too (ISTR this being quite good... again
  on Solaris 8, so things might have changed)!
 
 I was just talking to a member of the Solaris-UFS team who recommended that 
 we 
 test fdatasync.

Ok, so I did a few runs for each of the sync methods, keeping all the
rest constant and got this:

open_datasync  0.7
fdatasync  4.6
fsync  4.5
fsync_writethrough not supported
open_sync  0.6

in arbitrary units - higher is faster.

Quite impressive!

Bye, Chris.






Re: [PERFORM] bad performance on Solaris 10

2006-04-06 Thread Luke Lonergan
Chris,

On 4/5/06 2:31 PM, Chris Mair [EMAIL PROTECTED] wrote:

 Doing what http://blogs.sun.com/roller/page/jkshah suggests:
   wal_sync_method = fsync (unchanged)
   wal_buffers = 128 (was 8)
   checkpoint_segments = 128 (was 3)
   bgwriter_all_percent = 0 (was 0.333)
   bgwriter_all_maxpages = 0 (was 5)
 and leaving everything else default (solarispackages from pgfoundry)
 increased performance ~ 7 times!

In the recent past, Jignesh Shah of Sun MDE discovered that changing the
bgwriter_* parameters to zero had a dramatic positive impact on performance.

There are also some critical UFS kernel tuning parameters to set, you should
find those in his blog.

We found and fixed some libpq issues with Solaris that were also critical -
they should be in 8.1.3 I think.

- Luke





Re: [PERFORM] bad performance on Solaris 10

2006-04-06 Thread Josh Berkus
Chris,

 Remounting the fs where $PGDATA lives with forcedirectio
 (together with logging, that is default) did not help
 (if not harm...) performance.

Not all of PG.  JUST pg_xlog.  forcedirectio is only a good idea for the xlog.

 Quickly playing around with wal_buffers on Linux and Mac OS X
 I see it influences the performance of my test a bit, maybe in the
 10-20% range (I'm really doing quick tests, nothing systematic),
 but nowhere near as spectacularly as on Solaris.

 I'm happy so far, but I find it very surprising that this single
 parameter has such an impact (only on) Solaris 10.

That *is* interesting.  I hadn't tested this previously specifically on 
Solaris.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco



Re: [PERFORM] bad performance on Solaris 10

2006-04-06 Thread Josh Berkus
Mark, Chris,

 Yeah - looks good! (is the default open_datasync still?). Might be worth
 trying out the fdatasync method too (ISTR this being quite good... again
 on Solaris 8, so things might have changed)!

I was just talking to a member of the Solaris-UFS team who recommended that we 
test fdatasync.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] bad performance on Solaris 10

2006-04-06 Thread Robert Lor

Chris Mair wrote:


Ok, so I did a few runs for each of the sync methods, keeping all the
rest constant and got this:

open_datasync  0.7
fdatasync  4.6
fsync  4.5
fsync_writethrough not supported
open_sync  0.6

in arbitrary units - higher is faster.

Quite impressive!


Chris,
Just to make sure the x4100 config is similar to your Linux system, can 
you verify the default setting for disk write cache and make sure they 
are both enabled or disabled. Here's how to check in Solaris.

As root, run format -e -> pick a disk -> cache -> write_cache -> display

Not sure how to do it on Linux though!
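A guess at the Linux side (untested here; it depends on how the disks
are attached, and the device names below are made up):

    # ATA/SATA disks:
    hdparm -W /dev/sda         # query; hdparm -W0 /dev/sda would disable it
    # SCSI/SAS disks:
    sdparm --get=WCE /dev/sda  # WCE=1 means the write cache is enabled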

Regards,
-Robert

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] bad performance on Solaris 10

2006-04-05 Thread Alvaro Herrera
Luke Lonergan wrote:
 Chris,
 
 On 4/5/06 2:31 PM, Chris Mair [EMAIL PROTECTED] wrote:
 
  Doing what http://blogs.sun.com/roller/page/jkshah suggests:
wal_sync_method = fsync (unchanged)
wal_buffers = 128 (was 8)
checkpoint_segments = 128 (was 3)
bgwriter_all_percent = 0 (was 0.333)
bgwriter_all_maxpages = 0 (was 5)
  and leaving everything else default (solarispackages from pgfoundry)
  increased performance ~ 7 times!
 
 In the recent past, Jignesh Shah of Sun MDE discovered that changing the
 bgwriter_* parameters to zero had a dramatic positive impact on performance.

This essentially means stopping all bgwriter activity, thereby deferring
all I/O until checkpoint.  Was this considered?  With
checkpoint_segments set to 128 - i.e. a checkpoint only after roughly
128 x 16MB = 2GB of WAL - it wouldn't surprise me if there wasn't
any checkpoint executed at all during the whole test ...

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] bad performance on Solaris 10

2006-04-05 Thread Chris Mair

   Doing what http://blogs.sun.com/roller/page/jkshah suggests:
 wal_sync_method = fsync (unchanged)
 wal_buffers = 128 (was 8)
 checkpoint_segments = 128 (was 3)
 bgwriter_all_percent = 0 (was 0.333)
 bgwriter_all_maxpages = 0 (was 5)
   and leaving everything else default (solarispackages from pgfoundry)
   increased performance ~ 7 times!

Ok, so I couldn't quite believe my own benchmarks and I decided
to do a fresh initdb and retry everything.

At first it looked like I couldn't reproduce the speed-up I just saw.

Then I realized it was the 
wal_sync_method = fsync
line that makes all the difference!

Normally parameters that are commented are default values, but for
wal_sync_method it actually says (note the comment):

wal_sync_method = fsync  # the default is the first option
 # supported by the operating system:
 #   open_datasync
 #   fdatasync
 #   fsync
 #   fsync_writethrough
 #   open_sync

So in my last mail I drew the wrong conclusion, because I didn't comment
out wal_sync_method to double check.

To the point: the default wal_sync_method chosen on Solaris 10 appears
to be a very bad one - for me, picking fsync increases performance ~7
times, all other parameters unchanged!
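
A quick way to see which method a given build actually ended up with is
to ask a running server directly - nothing Solaris-specific about this:

    psql -c "SHOW wal_sync_method;"
    grep wal_sync_method $PGDATA/postgresql.conf   # compare with the config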

Would it be a good idea to change this in the default install?

Bye, Chris.

PS: yes I did a fresh initdb again to double check ;)


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] bad performance on Solaris 10

2006-04-05 Thread Mark Kirkwood

Chris Mair wrote:

Hi,

thanks for all replys.

I've done a few tests.

Remounting the fs where $PGDATA lives with forcedirectio
(together with logging, that is default) did not help
(if not harm...) performance.




Sure - forcedirectio on the entire $PGDATA is a definite loss, you only 
want it on $PGDATA/pg_xlog. The usual way this is accomplished is by 
making a separate filesystem for pg_xlog and symlinking from $PGDATA.
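
In practice that looks something like this - a sketch only, assuming a
spare UFS slice set aside for the log; the device name and mount point
are made up:

    # as root, with the postmaster stopped:
    mount -F ufs -o forcedirectio /dev/dsk/c0t1d0s0 /pgxlog
    mv $PGDATA/pg_xlog /pgxlog/pg_xlog
    ln -s /pgxlog/pg_xlog $PGDATA/pg_xlog
    chown -R postgres /pgxlog/pg_xlog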


Did you try the other option of remounting the fs for $PGDATA without 
logging or forcedirectio?
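
On Solaris that remount would be something along these lines (device
and mount point are placeholders):

    # as root, with the postmaster stopped:
    umount /pgdata
    mount -F ufs -o nologging /dev/dsk/c0t0d0s5 /pgdata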


Cheers

Mark



---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] bad performance on Solaris 10

2006-04-05 Thread Chris Mair
appears this didn't make it to the list... resending to the list
directly...
---

   Doing what http://blogs.sun.com/roller/page/jkshah suggests:
 wal_sync_method = fsync (unchanged)
 wal_buffers = 128 (was 8)
 checkpoint_segments = 128 (was 3)
 bgwriter_all_percent = 0 (was 0.333)
 bgwriter_all_maxpages = 0 (was 5)
   and leaving everything else default (solarispackages from
pgfoundry)
   increased performance ~ 7 times!

Ok, so I couldn't quite believe my own benchmarks and I decided
to do a fresh initdb and retry everything.

At first it looked like I couldn't reproduce the speed-up I just saw.

Then I realized it was the 
wal_sync_method = fsync
line that makes all the difference!

Normally parameters that are commented are default values, but for
wal_sync_method it actually says (note the comment):

wal_sync_method = fsync  # the default is the first option
 # supported by the operating system:
 #   open_datasync
 #   fdatasync
 #   fsync
 #   fsync_writethrough
 #   open_sync

So in my last mail I drew the wrong conclusion, because I didn't comment
out wal_sync_method to double check.

To the point: the default wal_sync_method chosen on Solaris 10 appears
to be a very bad one - for me, picking fsync increases performance ~7
times, all other parameters unchanged!

Would it be a good idea to change this in the default install?

Bye, Chris.

PS: yes I did a fresh initdb again to double check ;)


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] bad performance on Solaris 10

2006-04-05 Thread Luke Lonergan
Alvaro,

On 4/5/06 2:48 PM, Alvaro Herrera [EMAIL PROTECTED] wrote:

 This essentially means stopping all bgwriter activity, thereby deferring
 all I/O until checkpoint.  Was this considered?  With
 checkpoint_segments to 128, it wouldn't surprise me that there wasn't
 any checkpoint executed at all during the whole test ...

Yes, many things about the Solaris UFS filesystem caused a great deal of
pain over the 10 months of experiments we ran with Sun MDE.  Ultimately, the
conclusion was that ZFS is going to make all of the pain go away.

In the meantime, all you can do is tweak up UFS and avoid I/O as much as
possible.

- Luke  



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] bad performance on Solaris 10

2006-04-05 Thread Chris Mair

  I've done a few tests.
  
  Remounting the fs where $PGDATA lives with forcedirectio
  (together with logging, that is default) did not help
  (if not harm...) performance.
  
 
 
 Sure - forcedirectio on the entire $PGDATA is a definite loss, you only 
 want it on $PGDATA/pg_xlog. The usual way this is accomplished is by 
 making a separate filesystem for pg_xlog and symlinking from $PGDATA.
 
 Did you try the other option of remounting the fs for $PGDATA without 
 logging or forcedirectio?

not yet, I'm not on the final disk set yet.

when I get there I'll have two separate filesystems for pg_xlog and base
and will try what you suggest.

(but note the other mail about wal_sync_method = fsync)

bye, chris.


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] bad performance on Solaris 10

2006-04-05 Thread Mark Kirkwood

Chris Mair wrote:



(but note the other mail about wal_sync_method = fsync)



Yeah - looks good! (is the default open_datasync still?). Might be worth 
trying out the fdatasync method too (ISTR this being quite good... again 
on Solaris 8, so things might have changed)!


Cheers

Mark


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] bad performance on Solaris 10

2006-04-05 Thread Chris Mair
Hi,

thanks for all replys.

I've done a few tests.

Remounting the fs where $PGDATA lives with forcedirectio
(together with logging, that is default) did not help
(if not harm...) performance.

Doing what http://blogs.sun.com/roller/page/jkshah suggests:
  wal_sync_method = fsync (unchanged)
  wal_buffers = 128 (was 8)
  checkpoint_segments = 128 (was 3)
  bgwriter_all_percent = 0 (was 0.333)
  bgwriter_all_maxpages = 0 (was 5)
and leaving everything else default (solarispackages from pgfoundry)
increased performance ~ 7 times!

Playing around with these modifications I find that it's
actually the
  wal_buffers = 128
line alone that makes all the difference!

Quickly playing around with wal_buffers on Linux and Mac OS X
I see it influences the performance of my test a bit, maybe in the
10-20% range (I'm really doing quick tests, nothing systematic),
but nowhere near as spectacularly as on Solaris.

I'm happy so far, but I find it very surprising that this single
parameter has such an impact (only on) Solaris 10.

(my test program does bulk inserts using PQputCopyData in large
transactions - all tests were on 8.1.3).
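
The program itself isn't attached, but a load of roughly the same shape
can be approximated from the shell with COPY FROM STDIN (table name and
row count invented):

    psql -c "CREATE TABLE bulktest (a int, b text);"
    seq 1 1000000 | awk '{printf "%d\tsome payload\n", $1}' | \
        /usr/bin/time psql -c "COPY bulktest FROM STDIN;"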

Bye, Chris





---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] bad performance on Solaris 10

2006-04-04 Thread Josh Berkus
Mark,

 I suspect that making a *separate* filesystem for the pg_xlog directory
 and mounting that logging + forcedirectio would be a nice way to also
 get performance while keeping the advantages of logging + file
 buffercache for the *rest* of the postgres components.
 Cheers

Yes, we tested this.  It makes a huge difference in WAL speed.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] bad performance on Solaris 10

2006-04-03 Thread Josh Berkus
Chris,

 Eons ago PCs had those turbo switches (it was never totally clear
 why they put them there in the first place, anyway). I've this bad
 feeling there's a secret turbo switch I can't spot hidden somewhere
 in Solaris :/

Yes.   Check out Jignesh's configuration advice  ach, this is down.   
Hold on, I will get you instructions on how to turn on filesystem caching 
and readahead in Solaris.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] bad performance on Solaris 10

2006-04-03 Thread Luke Lonergan

Jignesh's blog has some of the good stuff in it:
http://blogs.sun.com/roller/page/jkshah

- Luke


On 4/3/06 5:49 PM, Josh Berkus josh@agliodbs.com wrote:

Chris,

 Eons ago PCs had those turbo switches (it was never totally clear
 why they put them there in the first place, anyway). I've this bad
 feeling there's a secret turbo switch I can't spot hidden somewhere
 in Solaris :/

Yes. Check out Jignesh's configuration advice  ach, this is down. 
Hold on, I will get you instructions on how to turn on filesystem caching
and readahead in Solaris.

--
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] bad performance on Solaris 10

2006-04-03 Thread Mark Kirkwood

Chris Mair wrote:

Hi,

I've got a somewhat puzzling performance problem here.

I'm trying to do a few tests with PostgreSQL 8.1.3 under Solaris
(an OS I'm sort of a newbie in).

The machine is a X4100 and the OS is Solaris 10 1/06 fresh install
according to manual. It's got two SAS disks in RAID 1, 4GB of RAM.

Now the problem is: this box is *much* slower than I expect.

I've got a libpg test program that happily inserts data
using PQputCopyData().

It performs an order of magnitude worse than the same thing
on a small Sun (Ultra20) running Linux. Or 4 times slower than
an iBook (sic!) running MacOS X.

So, I've this very bad feeling that there is something basic
I'm missing here.

Following are some stats:

sync; dd; sync show these disks write at 53 MB/s = good.
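
That is, something along these lines - block size and count are guesses,
not the exact invocation used:

    sync; time dd if=/dev/zero of=/pgdata/ddtest bs=1024k count=1000; sync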

iostat 1 while my test is running says:

             tty          sd0           sd1           sd2            sd5        cpu
 tin tout  kps tps serv  kps tps serv  kps tps serv   kps tps serv  us sy wt id
   1   57    0   0    0    0   0    0    0   0    0  1809  23   70   0  1  0 99
   0  235    0   0    0    0   0    0    0   0    0  2186 223   14   1  1  0 99
   0   81    0   0    0    0   0    0    0   0    0  2488 251   13   1  1  0 98
   0   81    0   0    0    0   0    0    0   0    0  2296 232   15   1  0  0 99
   0   81    0   0    0    0   0    0    0   0    0  2416 166    9   1  0  0 98
   0   81    0   0    0    0   0    0    0   0    0  2528 218   14   1  1  0 99
   0   81    0   0    0    0   0    0    0   0    0  2272 223   15   1  0  0 99

If I interpret this correctly the disk writes at not more than 2.5
MB/sec while the Opterons do nothing = this is bad.

I've tried both, a hand compile with gcc and the solarispackages
from pgfoundry.org = same result.

Eons ago PCs had those turbo switches (it was never totally clear
why they put them there in the first place, anyway). I've this bad
feeling there's a secret turbo switch I can't spot hidden somewhere
in Solaris :/



I ran across something like this on a Solaris 8, RAID1 system, and 
switching off logging on the filesystem containing postgres made a huge 
difference!


Now Solaris 8 is ancient history; however, see:

http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6238533

Apparently there can still be issues with logging without forcedirectio 
(which is the default I think).


I suspect that making a *separate* filesystem for the pg_xlog directory 
and mounting that logging + forcedirectio would be a nice way to also 
get performance while keeping the advantages of logging + file 
buffercache for the *rest* of the postgres components.

Cheers

Mark

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] Bad performance with hashjoin

2004-09-11 Thread Tom Lane
Vitaly Belman [EMAIL PROTECTED] writes:
 What am I to do?

Reduce random_page_cost and/or increase effective_cache_size.
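
In postgresql.conf terms that would be something like the following
(the values are illustrative starting points, not recommendations -
tune them to the machine):

    random_page_cost = 2            # default is 4; lower favours index scans
    effective_cache_size = 100000   # in 8kB pages in this era, i.e. ~800MB

followed by a pg_ctl reload and a fresh EXPLAIN ANALYZE of the query.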

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]