[PERFORM] Temporary Table

2005-10-25 Thread Christian Paul B. Cosinas

Does creating temporary tables in a function and NOT dropping them affect
the performance of the database?
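As an aside for readers poking at this question, here is a minimal sketch of the session-scoped nature of temporary tables. It uses SQLite purely as a self-contained stand-in (PostgreSQL differs in details such as system-catalog entries and ON COMMIT options), and the `scratch_*` table names are invented for the demo:

```python
# Temp tables created inside a loop (standing in for repeated function
# calls) and never dropped remain visible for the life of the session.
import sqlite3

con = sqlite3.connect(":memory:")
for i in range(3):
    con.execute(f"CREATE TEMP TABLE scratch_{i} (v INTEGER)")  # never dropped

# All three temp tables still exist in this session.
names = [row[0] for row in con.execute(
    "SELECT name FROM sqlite_temp_master WHERE type = 'table' ORDER BY name")]
print(names)
```

In PostgreSQL, temporary tables that are never dropped persist until the session ends, so long-lived connections that keep creating them can accumulate catalog entries in the meantime.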

I choose Polesoft Lockspam to fight spam, and you?
http://www.polesoft.com/refer.html

FW: [PERFORM] Used Memory

2005-10-25 Thread Christian Paul B. Cosinas

Here are the configuration of our database server:
port = 5432
max_connections = 300
superuser_reserved_connections = 10
authentication_timeout = 60 
shared_buffers = 48000   
sort_mem = 32168
sync = false

Do you think this is enough? Or can you recommend a better configuration for
my server?

The server is also running PHP and Apache, but we're not using them
extensively - for development purposes only.

The database slowdown occurs most of the time (when free memory is low). I
don't think it has anything to do with vacuum; we only run a full server
vacuum once a day.
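A back-of-envelope check on the posted settings (a sketch with the numbers above, not a measurement; it assumes the units PostgreSQL 7.x/8.0 uses, namely kilobytes for sort_mem and 8 kB pages for shared_buffers, and that sort_mem is per sort operation, not per server):

```python
# Rough memory arithmetic for the posted postgresql.conf values.
BLOCK_KB = 8                      # PostgreSQL page size in kB

shared_buffers_pages = 48000
sort_mem_kb = 32168
max_connections = 300

shared_mb = shared_buffers_pages * BLOCK_KB / 1024          # fixed shared pool
per_sort_mb = sort_mem_kb / 1024                            # per in-memory sort
worst_case_sort_gb = max_connections * per_sort_mb / 1024   # if every backend sorts at once

print(f"shared_buffers: {shared_mb:.0f} MB")
print(f"sort_mem per sort: {per_sort_mb:.1f} MB")
print(f"worst-case concurrent sort memory: {worst_case_sort_gb:.1f} GB")
```

With ~31 MB allowed per sort and up to 300 connections, concurrent sorts alone could in the worst case want several gigabytes, which would explain slowdowns that appear exactly when free memory is low.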





---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


[PERFORM] Why Index is not working on date columns.

2005-10-25 Thread Kishore B
Hi All, 

Thank you very much for your help in configuring the database.

Can you guys please take a look at the following query and let me know why the index is not considered in the plan?

Here is the extract of the condition string of the query that is taking the transaction_date in index condition:

where (account.id = search_engine.account_fk) and (account.status = 't' and account.id = '40288a820726362f0107263c55d3') and (search_engine.id =
conversion.search_engine_fk and conversion.event_type = 'daily_spend' and conversion.tactic = 'PPC' and conversion.transaction_date between '2005-01-01 00:00:00' and '2005-10-31 23:59:59') group by
account.id;
Plan:
 ->  Index Scan using conversion_all on conversion (cost=0.00..6.02 rows=1 width=98) Index Cond: (((tactic)::text = 'PPC'::text) AND ((event_type)::text = 'daily_spend'::text) AND (transaction_date >= '2005-01-01 00:00:00'::timestamp without time zone) AND (transaction_date <= '2005-10-31 23:59:59'::timestamp without time zone (..)


Here is the extract of the condition string of the query that is not taking the transaction_date in index condition:

where (account.status = 't' and account.id = search_engine.account_fk and account.id = '40288a820726362f0107263c55d3') and (search_engine.id =
conversion.search_engine_fk and conversion.tactic = 'PPC' and conversion.transaction_date >= '2005-01-01 00:00:00' and conversion.transaction_date <= '2005-10-31 23:59:59') group by account.id
;
Plan:
 ->  Index Scan using conv_evnt_tact_trans_date_sefk on conversion (cost=0.00..6.02 rows=1 width=132) Index Cond: ((("outer".id)::text = (conversion.search_engine_fk)::text) AND ((conversion.tactic)::text = 'PPC'::text))
 Filter: ((transaction_date >= '2005-01-01 00:00:00'::timestamp without time zone) AND (transaction_date <= '2005-10-31 23:59:59'::timestamp without time zone))

I have the following indexes defined on the columns.
conv_evnt_tact_trans_date_sefk : (search_engine_fk, tactic, event_type, transaction_date);
conv_all : (tactic, event_type, transaction_date);

I am really confused when I saw this plan. In both queries, I am using the same columns in the where condition, but the optimizer is taking different indexes in these two cases.
Second, even though I have the transaction_date column specified in the second instance, why is it not taking the constraint as an index condition?
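One commonly cited B-tree behavior is worth noting here as an aside (this is an editorial observation, not something stated in the thread): a column can usually serve as an index *condition* only when all index columns before it are also constrained. The index is on (search_engine_fk, tactic, event_type, transaction_date), and the second query omits event_type, so transaction_date degrades to a filter. A pure-Python sketch of that prefix rule, with invented sample rows:

```python
# Model a composite btree as a sorted list of tuples and show that a range
# on the 4th column is only a contiguous slice (usable as an index
# condition) when the 3rd column is also pinned.
import bisect

index = sorted([
    ("se1", "PPC", "click",       "2005-03-01"),
    ("se1", "PPC", "click",       "2005-12-01"),
    ("se1", "PPC", "daily_spend", "2005-02-01"),
    ("se1", "PPC", "daily_spend", "2005-11-15"),
])

# event_type constrained: the date bounds select one contiguous slice.
lo = bisect.bisect_left(index, ("se1", "PPC", "daily_spend", "2005-01-01"))
hi = bisect.bisect_right(index, ("se1", "PPC", "daily_spend", "2005-10-31"))
with_event_type = index[lo:hi]

# event_type omitted: only the (fk, tactic) prefix bounds the scan;
# matching dates are scattered, so the date check becomes a row filter.
lo = bisect.bisect_left(index, ("se1", "PPC"))
hi = bisect.bisect_right(index, ("se1", "PPC", "\xff"))
prefix_slice = index[lo:hi]
filtered = [r for r in prefix_slice if "2005-01-01" <= r[3] <= "2005-10-31"]

print(with_event_type)
print(len(prefix_slice), len(filtered))
```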

Thanks in advance.

Thank you,
Kishore.


[PERFORM] Strange planner decision on quite simple select

2005-10-25 Thread Markus Wollny
Hello!

I've got a table BOARD_MESSAGE (message_id int8, thread_id int8, ...)
with pk on message_id and and a non_unique not_null index on thread_id.
A count(*) on BOARD_MESSAGE currently yields a total of 1231171 rows,
the planner estimated a total of 1232530 rows in this table. I've got
pg_autovacuum running on the database and run an additional nightly
VACUUM ANALYZE over it every night.

I've got a few queries of the following type:

select  * 
from PUBLIC.BOARD_MESSAGE 
where THREAD_ID = 3354253 
order by MESSAGE_ID asc
limit   20 
offset  0; 


There are currently roughly 4500 rows with this thread_id in
BOARD_MESSAGE. Explain-output is like so:

                                 QUERY PLAN
--------------------------------------------------------------------------
 Limit  (cost=0.00..3927.22 rows=20 width=1148)
   ->  Index Scan using pk_board_message on board_message
         (cost=0.00..1100800.55 rows=5606 width=1148)
         Filter: (thread_id = 3354253)
(3 rows)

I didn't have the patience to actually complete an explain analyze on
that one - I cancelled the query on several attempts after more than 40
minutes runtime. Now I fiddled a little with this statement and tried
nudging the planner in the right direction like so:

explain analyze select * from (select  * 
from PUBLIC.BOARD_MESSAGE 
where THREAD_ID = 3354253 
order by MESSAGE_ID asc ) as foo
limit   20 
offset  0; 
 
                                 QUERY PLAN
--------------------------------------------------------------------------
 Limit  (cost=8083.59..8083.84 rows=20 width=464)
   (actual time=1497.455..1498.466 rows=20 loops=1)
   ->  Subquery Scan foo  (cost=8083.59..8153.67 rows=5606 width=464)
       (actual time=1497.447..1498.408 rows=20 loops=1)
         ->  Sort  (cost=8083.59..8097.61 rows=5606 width=1148)
             (actual time=1497.326..1497.353 rows=20 loops=1)
               Sort Key: message_id
               ->  Index Scan using nidx_bm_thread_id on board_message
                   (cost=0.00..7734.54 rows=5606 width=1148)
                   (actual time=0.283..1431.752 rows=4215 loops=1)
                     Index Cond: (thread_id = 3354253)
 Total runtime: 1502.138 ms

Now this is much more like it. As far as I interpret the explain output,
in the former case the planner decides to just sort the whole table with
its 1.2m rows by its primary key on message_id and then filters out
the few thousand rows matching the requested thread_id. In the latter
case, it selects the few thousand rows with the matching thread_id
_first_ and _then_ sorts them according to their message_id. The former
attempt involves sorting of more than a million rows and then filtering
through the result, the latter just uses the index to retrieve a few
thousand rows and sorts those - which is much more efficient.
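Markus's reading of the two plans can be mimicked with a toy model (an illustrative pure-Python sketch, not PostgreSQL internals; the row counts are invented, with the wanted thread's rows sitting late in message_id order to mimic the correlation between the two columns):

```python
# Rows are (message_id, thread_id); the target thread's 100 messages have
# the highest message_ids, as happens when thread_id tracks message_id.
rows = [(mid, 99) for mid in range(1, 100_001)]
rows += [(mid, 3354253) for mid in range(100_001, 100_101)]

def plan_pk_scan(rows, thread_id, limit=20):
    """Plan 1: walk the pk in message_id order, filter, stop at limit."""
    out, touched = [], 0
    for row in rows:                    # already in message_id order
        touched += 1
        if row[1] == thread_id:
            out.append(row)
            if len(out) == limit:
                break
    return out, touched

def plan_thread_index(rows, thread_id, limit=20):
    """Plan 2: fetch the thread via a thread_id index, then sort the few rows."""
    matches = [r for r in rows if r[1] == thread_id]   # index-lookup stand-in
    return sorted(matches)[:limit], len(matches)

a, touched_a = plan_pk_scan(rows, 3354253)
b, touched_b = plan_thread_index(rows, 3354253)
assert a == b                 # same 20 rows either way
print(touched_a, touched_b)   # rows touched: whole table vs. just the thread
```

Both plans return identical rows, but the pk-order scan touches essentially the whole table before finding its 20 matches, while the thread-index plan touches only the thread's rows.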

What's more puzzling is that the results vary somewhat depending on the
overall load situation. When using the first approach without the
subselect, sometimes the planner chooses exactly the same plan as it
does with the second approach - with equally satisfying results in
regard to total execution time; sometimes it does use the first plan and
does complete with a very acceptable execution time, too. But sometimes
(when overall load is sufficiently high, I presume) it just runs and
runs for minutes on end - I've had this thing running for more than one
hour on several occasions until I made some changes to my app which
limits the maximum execution time for a query to no more than 55
seconds.

With this IMHO quite ugly subselect-workaround, performance is
reproducibly stable and sufficiently good under either load, so I chose
to stick with it for the time being - but I'd still like to know if I
could have done anything to have the planner choose the evidently better
plan for the first query without such a workaround?

Kind regards

   Markus



Re: [PERFORM] Strange planner decision on quite simple select

2005-10-25 Thread Richard Huxton

Markus Wollny wrote:

Hello!

I've got a table BOARD_MESSAGE (message_id int8, thread_id int8, ...)
with pk on message_id and and a non_unique not_null index on thread_id.
A count(*) on BOARD_MESSAGE currently yields a total of 1231171 rows,
the planner estimated a total of 1232530 rows in this table. I've got
pg_autovacuum running on the database and run an additional nightly
VACUUM ANALYZE over it every night.

I've got a few queries of the following type:

select  * 
from PUBLIC.BOARD_MESSAGE 
where THREAD_ID = 3354253 
order byMESSAGE_ID asc 
limit   20 
offset  0; 



There are currently roughly 4500 rows with this thread_id in
BOARD_MESSAGE. Explain-output is like so:

                                 QUERY PLAN
--------------------------------------------------------------------------
 Limit  (cost=0.00..3927.22 rows=20 width=1148)
   ->  Index Scan using pk_board_message on board_message
         (cost=0.00..1100800.55 rows=5606 width=1148)
         Filter: (thread_id = 3354253)
(3 rows)


I didn't have the patience to actually complete an explain analyze on
that one - I cancelled the query on several attempts after more than 40
minutes runtime. Now I fiddled a little with this statement and tried
nudging the planner in the right direction like so:


Hmm - it shouldn't take that long. If I'm reading this right, it's 
expecting to have to fetch 5606 rows to match thread_id=3354253 the 20 
times you've asked for. Now, what it probably doesn't know is that 
thread_id is correlated with message_id quite highly (actually, I don't 
know that, I'm guessing). So - it starts at message_id=1 and works 
along, but I'm figuring that it needs to reach message_id's in the 3-4 
million range to see any of the required thread.


Suggestions:
1. Try ORDER BY thread_id,message_id and see if that nudges things 
your way.
2. Keep #1 and try replacing the index on (thread_id) with 
(thread_id,message_id)
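A note on why suggestion #1 preserves the query's meaning (a small sketch with invented ids): once thread_id is fixed by the WHERE clause, ordering by (thread_id, message_id) and by message_id alone give identical row orders, so the ORDER BY can be made to match a (thread_id, message_id) index exactly without changing results.

```python
# All rows share one thread_id, as they do after WHERE THREAD_ID = <const>.
rows = [(3354253, m) for m in (7, 3, 9, 1)]   # (thread_id, message_id)

by_message = sorted(rows, key=lambda r: r[1])  # ORDER BY message_id
by_composite = sorted(rows)                    # ORDER BY thread_id, message_id

assert by_message == by_composite
print(by_composite)
```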


--
  Richard Huxton
  Archonet Ltd



[PERFORM] Outer join query plans and performance

2005-10-25 Thread Rich Doughty

I tried on pgsql-general but got no reply. re-posting here as it's
probably the best place to ask

I'm having some significant performance problems with left join. Can
anyone give me any pointers as to why the following 2 query plans are so
different?


EXPLAIN SELECT *
FROM
tokens.ta_tokens  t  LEFT JOIN
tokens.ta_tokenhist   h1 ON t.token_id = h1.token_id LEFT JOIN
tokens.ta_tokenhist   h2 ON t.token_id = h2.token_id
WHERE
h1.histdate = 'now';


 Nested Loop Left Join  (cost=0.00..68778.43 rows=2215 width=1402)
   ->  Nested Loop  (cost=0.00..55505.62 rows=2215 width=714)
         ->  Index Scan using idx_tokenhist__histdate on ta_tokenhist h1
             (cost=0.00..22970.70 rows=5752 width=688)
               Index Cond: (histdate = '2005-10-24 13:28:38.411844'::timestamp without time zone)
         ->  Index Scan using ta_tokens_pkey on ta_tokens t
             (cost=0.00..5.64 rows=1 width=26)
               Index Cond: ((t.token_id)::integer = ("outer".token_id)::integer)
   ->  Index Scan using fkx_tokenhist__tokens on ta_tokenhist h2
       (cost=0.00..5.98 rows=1 width=688)
         Index Cond: (("outer".token_id)::integer = (h2.token_id)::integer)


Performance is fine for this one and the plan is pretty much as I'd
expect.

This is where I hit a problem.


EXPLAIN SELECT *
FROM
tokens.ta_tokens  t  LEFT JOIN
tokens.ta_tokenhist   h1 ON t.token_id = h1.token_id LEFT JOIN
tokens.ta_tokenhist   h2 ON t.token_id = h2.token_id
WHERE
h2.histdate = 'now';


 Hash Join  (cost=1249148.59..9000709.22 rows=2215 width=1402)
   Hash Cond: (("outer".token_id)::integer = ("inner".token_id)::integer)
   ->  Hash Left Join  (cost=1225660.51..8181263.40 rows=4045106 width=714)
         Hash Cond: (("outer".token_id)::integer = ("inner".token_id)::integer)
         ->  Seq Scan on ta_tokens t  (cost=0.00..71828.06 rows=4045106 width=26)
         ->  Hash  (cost=281243.21..281243.21 rows=10504921 width=688)
               ->  Seq Scan on ta_tokenhist h1  (cost=0.00..281243.21 rows=10504921 width=688)
   ->  Hash  (cost=22970.70..22970.70 rows=5752 width=688)
         ->  Index Scan using idx_tokenhist__histdate on ta_tokenhist h2
             (cost=0.00..22970.70 rows=5752 width=688)
               Index Cond: (histdate = '2005-10-24 13:34:51.371905'::timestamp without time zone)


I would understand if h2 were joined on h1, but it isn't; it only joins
on t. Can anyone give any tips on improving the performance of the second
query (aside from changing the join order manually)?


select version();
   version
--
 PostgreSQL 8.0.3 on i486-pc-linux-gnu, compiled by GCC cc (GCC) 4.0.2 20050821 
(prerelease) (Debian 4.0.1-6)


Thanks

--

  - Rich Doughty



Re: [PERFORM] Why Index is not working on date columns.

2005-10-25 Thread Tom Lane
Kishore B [EMAIL PROTECTED] writes:
  Can you guys please take a look at the following query and let me know why
 the index is not considered in the plan?

Considered and used are two different things.

The two examples you give have the same estimated cost (within two
decimal places) so the planner sees no particular reason to choose one
over the other.

I surmise that you are testing on toy tables and extrapolating to what
will happen on larger tables.  This is an unjustified assumption.
Create a realistic test data set, ANALYZE it, and then see if the
planner chooses indexes you like.

regards, tom lane



Re: [PERFORM] Strange planner decision on quite simple select

2005-10-25 Thread Markus Wollny
 
Hi!

 -----Original Message-----
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED]] On Behalf Of Richard Huxton
 Sent: Tuesday, 25 October 2005, 12:07
 To: Markus Wollny
 Cc: pgsql-performance@postgresql.org
 Subject: Re: [PERFORM] Strange planner decision on quite simple select
 
 Hmm - it shouldn't take that long. If I'm reading this right, 
 it's expecting to have to fetch 5606 rows to match 
 thread_id=3354253 the 20 times you've asked for. Now, what it 
 probably doesn't know is that thread_id is correlated with 
 message_id quite highly (actually, I don't know that, I'm 
 guessing). So - it starts at message_id=1 and works along, 
 but I'm figuring that it needs to reach message_id's in the 
 3-4 million range to see any of the required thread.

Reading this I tried adding an AND MESSAGE_ID >= THREAD_ID to the
WHERE-clause; as you've guessed quite correctly, both message_id and thread_id
are derived from the same sequence and thread_id equals the lowest message_id
in a thread. This alone did quite a lot to improve things - I got stable
execution times down from an average 12 seconds to a mere 2 seconds - just
about the same as with the subselect.
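The reason the extra predicate is free can be sketched directly from the invariant Markus describes (toy ids invented for the demo): if thread_id is by definition the lowest message_id in its thread, then message_id >= thread_id holds for every row, so adding it cannot change the result set; it merely hands the planner a lower bound on message_id.

```python
# thread_id -> message_ids belonging to that thread; by construction the
# thread_id equals the lowest message_id in the thread.
threads = {
    100: [100, 104, 109],
    200: [200, 201, 350],
}

for thread_id, message_ids in threads.items():
    assert thread_id == min(message_ids)             # the defining invariant
    assert all(m >= thread_id for m in message_ids)  # predicate excludes no rows

print("message_id >= thread_id filters out nothing")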

 Suggestions:
 1. Try ORDER BY thread_id,message_id and see if that nudges 
 things your way.
 2. Keep #1 and try replacing the index on (thread_id) with
 (thread_id,message_id)

Did both (though adding such an index during ordinary workload took some time 
as did the VACUUM ANALYZE afterwards) and that worked like a charm - I've got 
execution times down to as little as a few milliseconds - wow! Thank you very 
much for providing such insightful hints!

Kind regards

   Markus



[PERFORM] insertion of bytea

2005-10-25 Thread Chris Mair
Hi,

I have the following test setup:

* PG 8.0.4 on Linux (Centos 4) compiled from source.

* DB schema: essentially one table with a few int columns and
  one bytea column that stores blobs of 52000 bytes each, a
  primary key on one of the int columns.

* A test client was written in C using libpq to see what rate
  can be reached (inserting records). The client uses a
  prepared statement and bundles n inserts into a single
  transaction (n is variable for testing).

* Hardware: different setups tested, in particular a
  single-opteron box with a built in SATA disk and also an
  array of SATA disks connected via FC.

From the test run it appears that the insert rate here is
essentially CPU bound. I'm getting about 11 MB/s net transfer,
regardless if I use the built in disk or the much faster
array and regardless various settings (like n, shared_mem).

vmstat says that disk bo is about 30MB/s (the array can do much
better, I tried with dd and sync!) while the CPU is maxed out
at about 90% us and 10% sy. The client accounts for just 2% CPU,
most goes into the postmaster.

The client inserts random data. I found out that I can improve
things by 35% if I use random sequences of bytes that are
in the printable range vs. full range.


Question 1:
Am I correct in assuming that even though I'm passing my 52000
bytes as a (char *) to PQexecPrepared(), encoding/decoding is
happening (think 0 -> \000) somewhere in the transfer?
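A rough model of what text-format escaping costs (an illustration only, not the server's actual code path; the exact escape rules and byte classes are simplified): in 8.0's text format, bytes outside the printable range travel as multi-character octal escapes like \\000, so random full-range data expands far more on the wire than printable data, which is consistent with the ~35% speedup seen for printable payloads.

```python
# Estimate the escaped size of a bytea payload under a simplified model:
# printable bytes cost 1 character, backslash doubles to 2, and everything
# else becomes a 5-character octal escape such as \\000.
import string

def escaped_len(data: bytes) -> int:
    printable = set(string.printable.encode()) - {0x5C}
    total = 0
    for b in data:
        if b == 0x5C:
            total += 2        # backslash doubles: \\
        elif b in printable:
            total += 1
        else:
            total += 5        # e.g. 0x00 -> \\000
    return total

payload = bytes(range(256)) * 4        # full-range data, 1024 raw bytes
print(escaped_len(payload), "escaped chars for", len(payload), "raw bytes")
print(escaped_len(b"abc"), "escaped chars for 3 printable bytes")
```

Under this model, full-range data more than triples in size, while printable data passes through nearly 1:1 - passing the parameter in binary format avoids the expansion entirely.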


Question 2:
Is there a better, faster way to do these inserts?
I'm unsure about large objects. I'm planning to use some
custom server side functions to do computations on the bytes
in these records and the large objects API doesn't appear
to be well suited for this.


Sidequestion:
I've tried to profile the server using CFLAGS=-p -DLINUX_PROFILE.
I'm getting profiling output but when I look at it using
gprof bin-somewhere/postgres $PGDATA/gmon.out I'm only seeing
what I think are the calls for the server startup. How can I profile
the (forked) process that actually performs all the work on
my connection?


Sorry for the long post :)
Bye,
Chris.






[PERFORM] Reindex - Is this necessary after a vacuum?

2005-10-25 Thread Markus Benne
We are reindexing frequently, and I'm wondering if
this is really necessary, given that it appears to
take an exclusive lock on the table.

Our table in question is vacuumed every 4 minutes, and
we are reindexing after each one.

I'm not a fan of locking this table that frequently,
even if it is only for 5 - 10 seconds depending on
load.

The vacuum is a standard vacuum.  Nightly we do a
vacuum analyze.

Thanks for any tips,
...Markus



Re: [PERFORM] insertion of bytea

2005-10-25 Thread Michael Stone

On Tue, Oct 25, 2005 at 03:44:36PM +0200, Chris Mair wrote:

Is there a better, faster way to do these inserts?


COPY is generally the fastest way to do bulk inserts (see
PQputCopyData). 


Mike Stone



Re: [PERFORM] Outer join query plans and performance

2005-10-25 Thread Tom Lane
Rich Doughty [EMAIL PROTECTED] writes:
 EXPLAIN SELECT *
 FROM
  tokens.ta_tokens  t  LEFT JOIN
  tokens.ta_tokenhist   h1 ON t.token_id = h1.token_id LEFT JOIN
  tokens.ta_tokenhist   h2 ON t.token_id = h2.token_id
 WHERE
  h1.histdate = 'now';

 EXPLAIN SELECT *
 FROM
  tokens.ta_tokens  t  LEFT JOIN
  tokens.ta_tokenhist   h1 ON t.token_id = h1.token_id LEFT JOIN
  tokens.ta_tokenhist   h2 ON t.token_id = h2.token_id
 WHERE
  h2.histdate = 'now';

The reason these are different is that the second case constrains only
the last-to-be-joined table, so the full cartesian product of t and h1
has to be formed.  If this wasn't what you had in mind, you might be
able to rearrange the order of the LEFT JOINs, but bear in mind that
in general, changing outer-join ordering changes the results.  (This
is why the planner won't fix it for you.)
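The rearrangement Tom mentions can be sketched in a self-contained way (using SQLite purely for illustration, with tiny invented rows; table and column names follow the thread). Moving the WHERE-constrained history join to the front means the unconstrained h1 join is formed last, over far fewer rows. Whether such a rewrite is valid depends on the data and the outer-join semantics, as Tom notes; this particular toy case happens to produce identical results:

```python
# Compare the original join order with h2 joined first.
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE ta_tokens   (token_id INTEGER);
    CREATE TABLE ta_tokenhist(token_id INTEGER, histdate TEXT);
    INSERT INTO ta_tokens    VALUES (1), (2), (3);
    INSERT INTO ta_tokenhist VALUES (1, 'now'), (1, 'old'), (2, 'old');
""")

original = """
    SELECT t.token_id, h1.histdate, h2.histdate
    FROM ta_tokens t
    LEFT JOIN ta_tokenhist h1 ON t.token_id = h1.token_id
    LEFT JOIN ta_tokenhist h2 ON t.token_id = h2.token_id
    WHERE h2.histdate = 'now'
"""
rearranged = """
    SELECT t.token_id, h1.histdate, h2.histdate
    FROM ta_tokens t
    LEFT JOIN ta_tokenhist h2 ON t.token_id = h2.token_id
    LEFT JOIN ta_tokenhist h1 ON t.token_id = h1.token_id
    WHERE h2.histdate = 'now'
"""
# Same rows either way for this data set.
assert sorted(con.execute(original)) == sorted(con.execute(rearranged))
print(sorted(con.execute(original)))
```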

regards, tom lane



Re: [PERFORM] Reindex - Is this necessary after a vacuum?

2005-10-25 Thread Richard Huxton

Markus Benne wrote:

We are reindexing frequently, and I'm wondering if
this is really necessary, given that it appears to
take an exclusive lock on the table.

Our table in question is vacuumed every 4 minutes, and
we are reindexing after each one.

I'm not a fan of locking this table that frequently,
even if it is only for 5 - 10 seconds depending on
load.

The vacuum is a standard vacuum.  Nightly we do a
vacuum analyze.


At most I'd do a nightly reindex. And in fact, I'd probably drop the 
index, full vacuum, recreate index.


But you only need to reindex at all if you have a specific problem with 
the index bloating. Are you seeing this?


--
  Richard Huxton
  Archonet Ltd



Re: [PERFORM] insertion of bytea

2005-10-25 Thread Tom Lane
Chris Mair [EMAIL PROTECTED] writes:
 Am I correct in assuming that even though I'm passing my 52000
 bytes as a (char *) to PQexecPrepared(), encoding/decoding is
 happening (think 0 -> \000) somewhere in the transfer?

Are you specifying it as a text or binary parameter?  Have you looked to
see if the stored data is what you expect?

regards, tom lane



Re: [PERFORM] Reindex - Is this necessary after a vacuum?

2005-10-25 Thread Tom Lane
Markus Benne [EMAIL PROTECTED] writes:
 Our table in question is vacuumed every 4 minutes, and
 we are reindexing after each one.

That's pretty silly.  You might need a reindex once in awhile, but
not every time you vacuum.

The draft 8.1 docs contain some discussion of possible reasons for
periodic reindexing:
http://developer.postgresql.org/docs/postgres/routine-reindex.html
but none of these reasons justify once-per-vacuum reindexes.

regards, tom lane



Re: [PERFORM] Why Index is not working on date columns.

2005-10-25 Thread Kishore B
Hi Tom,

Thank you for your response.

 I surmise that you are testing on toy tables and extrapolating to what
 will happen on larger tables.

These tables contain more than 8 million records as of now, and every day
200K records will be added to them.

Thank you,
Kishore.


Re: [PERFORM] Inefficient escape codes.

2005-10-25 Thread Rodrigo Madera
Ok, thanks for the limits info, but I have that in the manual. Thanks.

But what I really want to know is this:

1) Are all large objects of all tables inside one DATABASE kept in only one table? True or false?

Thanks =o)
Rodrigo

On 10/25/05, Nörder-Tuitje, Marcus [EMAIL PROTECTED] wrote:

 Oh, btw, no harm, but: having 5000 tables only to gain access via city
 name is a major design flaw.

 You might consider putting all into one table, working with a distributed
 index over your table (city, loc_texdt, blobfield), creating a partitioned
 index over city.

 best regards

 -----Original Message-----
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of
 Rodrigo Madera
 Sent: Monday, 24 October 2005 21:12
 To: pgsql-performance@postgresql.org
 Subject: Re: [PERFORM] Inefficient escape codes.

 Now this interests me a lot. Please clarify this:

 I have 5000 tables, one for each city:
 City1_Photos, City2_Photos, ... City5000_Photos.

 Each of these tables are: CREATE TABLE CityN_Photos (location text,
 lo_id largeobectypeiforgot)

 So, what's the limit for these large objects? I heard I could only have
 4 billion records for the whole database (not for each table). Is this
 true? If this isn't true, then would postgres manage to create all the
 large objects I ask him to?

 Also, this would be a performance penalty, wouldn't it?

 Much thanks for the knowledge shared,

 Rodrigo

Re: [PERFORM] impact of stats_command_string

2005-10-25 Thread Michael Fuhr
On Thu, Oct 20, 2005 at 01:33:07PM -0700, [EMAIL PROTECTED] wrote:
 If I turn on stats_command_string, how much impact would it have on
 PostgreSQL server's performance during a period of massive data
 INSERTs?

Do you really need to be doing massive data INSERTs?  Can you use
COPY, which is much more efficient for bulk loads?

http://www.postgresql.org/docs/8.0/interactive/populate.html

-- 
Michael Fuhr



Re: [PERFORM] impact of stats_command_string

2005-10-25 Thread Merlin Moncure
 If I turn on stats_command_string, how much impact would it have on
 PostgreSQL server's performance during a period of massive data
 INSERTs?  I know that the answer to the question I'm asking will
 largely depend upon different factors so I would like to know in which
 situations it would be negligible or would have a signifcant impact.

First of all, we have to assume your writes are buffered in some way or
you are using transactions, or you will likely be i/o bound (or you have
a super fast disk setup).

Assuming that, I can tell you from experience on win32 that
stats_command_string can be fairly expensive for certain types of access
patterns.  What patterns?

1. If your ratio of queries to records involved is low.
2. If you are accessing data in a very quick way, for example via
prepared statements over a LAN
3. Your volume of queries is very high.

In these cases, the cost is high.  stats_command_string can add a
fractional millisecond ( ~.2  in my setup ) to statement latency and as
much as double cpu time in extreme cases...you are warned.  You may want
to turn it off before doing bulk loads or lengthy record iterations.

Merlin



Re: [PERFORM] Why Index is not working on date columns.

2005-10-25 Thread Tom Lane
Kishore B [EMAIL PROTECTED] writes:
 I surmise that you are testing on toy tables and extrapolating to what
 will happen on larger tables.
 
 These tables contain more than 8 million records as of now, and every day
 200K records will be added to them.

In that case, have you ANALYZEd the tables lately?  The planner's cost
estimates correspond to awfully small tables ...

regards, tom lane



Re: [PERFORM] Outer join query plans and performance

2005-10-25 Thread Rich Doughty

Tom Lane wrote:

Rich Doughty [EMAIL PROTECTED] writes:


EXPLAIN SELECT *
FROM
tokens.ta_tokens  t  LEFT JOIN
tokens.ta_tokenhist   h1 ON t.token_id = h1.token_id LEFT JOIN
tokens.ta_tokenhist   h2 ON t.token_id = h2.token_id
WHERE
h1.histdate = 'now';




EXPLAIN SELECT *
FROM
tokens.ta_tokens  t  LEFT JOIN
tokens.ta_tokenhist   h1 ON t.token_id = h1.token_id LEFT JOIN
tokens.ta_tokenhist   h2 ON t.token_id = h2.token_id
WHERE
h2.histdate = 'now';



The reason these are different is that the second case constrains only
the last-to-be-joined table, so the full cartesian product of t and h1
has to be formed.  If this wasn't what you had in mind, you might be
able to rearrange the order of the LEFT JOINs, but bear in mind that
in general, changing outer-join ordering changes the results.  (This
is why the planner won't fix it for you.)


FWIW mysql 4.1 (and i'm no fan at all of mysql) completes both these queries
in approximately 3 seconds. postgres does the first in 6 seconds and the
second in a lot longer (eventually abandoned).


--

  - Rich Doughty



[PERFORM] Why different execution times for different instances for the same query?

2005-10-25 Thread Kishore B
Hi All, 

We are executing a single query that returned very fast on the first
instance. But when I executed the same query multiple times, it is giving
strange results: it is not coming back.

When I checked the processes running on the system, I observed that multiple
instances of postmaster are running and all of them are consuming very high
amounts of memory. I could also observe that they are sharing the memory in a
uniform distribution across them.

Please let me know if anybody has experienced the same and how they
resolved it.

Thank you,
Kishore.


Re: [PERFORM] Why different execution times for different

2005-10-25 Thread Scott Marlowe
On Tue, 2005-10-25 at 14:46, Kishore B wrote:
 Hi All, 
  
 We are executing a single query that returned very fast on the first
 instance. But when I executed the same query for multiple times, it is
 giving strange results. It is not coming back. 
  
 When I checked with the processes running in the system, I observed
 that multiple instances of postmaster are running and all of them are
 consuming very high amounts of memory. I could also observe that they
 are sharing the memory in a uniform distribution across them. 
  
 Please let me know if any body has experienced the same and how do
 they resolved it.

You may or may not have an actual problem.

For one, if they're each using 128 megs, but sharing 120 megs of that
then that's not too bad.  If they're each using 512 meg and sharing 100
meg of that, then you've got a problem.

What is your sort mem set to?  Going too high can cause memory
starvation and other problems.

Also, when you run top, how much memory is being used for cache and
buffer.  If you've still got a fair amount used for cache then you're
probably ok there.

What are your settings in postgresql.conf that aren't default?  How's
the behaviour as you run 1, then 2, then 3, then 4 and so on?  Where's
the knee with this behaviour and what are you running out of, disk IO
or memory or memory bandwidth.

Are you using iostat/vmstat/top/free/ipcs to check resource usage under
load?



Re: [PERFORM] insertion of bytea

2005-10-25 Thread Chris Mair
Is there a better, faster way to do these inserts?

 COPY is generally the fastest way to do bulk inserts (see
 PQputCopyData).

Thanks :)
I'll give that I try and report the results here later.

Bye, Chris.






Re: [PERFORM] insertion of bytea

2005-10-25 Thread Chris Mair
 Am I correct in assuming that even though I'm passing my 52000
 bytes as a (char *) to PQexecPrepared(), encoding/decoding is
 happening (think 0 -> \000) somewhere in the transfer?

 Are you specifying it as a text or binary parameter?  Have you looked to
 see if the stored data is what you expect?

I'm specifying it as binary (i.e. ones in PQexecPrepared's
format parameter). The stored data is correct.

I'll try copy from stdin with binary tomorrow and see what
I get...

Thanks & Bye, Chris.





Re: [PERFORM] Outer join query plans and performance

2005-10-25 Thread Tom Lane
Rich Doughty [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 The reason these are different is that the second case constrains only
 the last-to-be-joined table, so the full cartesian product of t and h1
 has to be formed.  If this wasn't what you had in mind, you might be
 able to rearrange the order of the LEFT JOINs, but bear in mind that
 in general, changing outer-join ordering changes the results.  (This
 is why the planner won't fix it for you.)

 FWIW mysql 4.1 (and i'm no fan at all of mysql) completes both these queries
 in approximately 3 seconds.

Does mysql get the correct answer, though?  It's hard to see how they do
this fast unless they (a) are playing fast and loose with the semantics,
or (b) have very substantially more analysis logic for OUTER JOIN semantics
than we do.  Perhaps mysql 5.x is better about this sort of thing, but
for 4.x I'd definitely find theory (a) more plausible than (b).

The cases that would be interesting are those where rearranging the
outer join order actually does change the correct answer --- it may not
in this particular case, I haven't thought hard about it.  It seems
fairly likely to me that they are rearranging the join order here, and
I'm just wondering whether they have the logic needed to verify that
such a transformation is correct.

regards, tom lane



Re: [PERFORM] Outer join query plans and performance

2005-10-25 Thread Kevin Grittner
In this particular case both outer joins are to the same table, and
the where clause is applied to one or the other, so it's pretty easy
to prove that they should generate identical results.  I'll grant that
this is not generally very useful; but then, simple test cases often
don't look very useful.

We've had mixed results with PostgreSQL and queries with
multiple outer joins when the WHERE clause limits the results
based on columns from the optional tables.  In at least one case
which performs very well, we have enough tables to cause the
genetic optimizer to kick in.  (So I suppose there is a chance
that sometimes it won't perform well, although we haven't seen
that happen yet.)

I can't speak to MySQL, but both Sybase and MaxDB handled
such cases accurately, and chose a plan with very fast
execution.  Sybase, however, spent 5 to 10 seconds in the
optimizer finding the sub-second plan.

-Kevin
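
To make the "WHERE clause on an optional table" point concrete, here is a
sketch using the t/h1 names from earlier in the thread (the column names
are invented for illustration):

```sql
-- A WHERE clause testing a column of the nullable (optional) side
-- discards the NULL-extended rows, so the LEFT JOIN behaves like an
-- INNER JOIN for this query:
SELECT t.id, h1.val
FROM t LEFT JOIN h1 ON t.id = h1.t_id
WHERE h1.val = 42;

-- To keep outer-join semantics, the test must go into the ON clause,
-- or the NULL-extended rows must be allowed explicitly:
SELECT t.id, h1.val
FROM t LEFT JOIN h1 ON t.id = h1.t_id AND h1.val = 42;
```

Whether the planner may commute such a join depends on exactly this kind
of analysis, which is why the ordering matters.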






Re: [PERFORM] Temporary Table

2005-10-25 Thread Christian Paul B. Cosinas
I am creating a temporary table in every function that I execute,
which I think comes to about 100,000 temporary tables a day.

What is the command for vacuuming these 3 tables?

Also I read about the auto vacuum of postgresql.
How can I execute this auto vacuum or the settings in the configuration?
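For 8.1, autovacuum is built in and enabled from postgresql.conf; a sketch
of the relevant settings (earlier releases used the contrib pg_autovacuum
daemon instead, and values here are examples, not recommendations):

```
# postgresql.conf (PostgreSQL 8.1)
autovacuum = on                 # enable the autovacuum daemon
stats_start_collector = on      # autovacuum requires the stats collector
stats_row_level = on            # ...with row-level statistics
autovacuum_naptime = 60         # seconds between autovacuum runs
```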

-Original Message-
From: Alvaro Nunes Melo [mailto:[EMAIL PROTECTED]
Sent: Tuesday, October 25, 2005 10:58 AM
To: Christian Paul B. Cosinas
Subject: Re: [PERFORM] Temporary Table

Hi Christian,

Christian Paul B. Cosinas wrote:

 Does Creating Temporary table in a function and NOT dropping them 
 affects the performance of the database?

I believe it will depend on how many temporary tables you create on a
daily basis. We had a performance problem caused by them, and by not
monitoring the database size properly. The pg_attribute, pg_class and
pg_depend tables grow a lot. When I found out that this was the problem I
saw some messages in the list archive, and now the overall performance is
great.

What I do is run VACUUM FULL and REINDEX daily on these three tables.

Alvaro
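
The maintenance Alvaro describes boils down to the following (run as a
superuser; note that VACUUM FULL takes exclusive locks on the catalogs,
so schedule it for a quiet period):

```sql
VACUUM FULL pg_class;
VACUUM FULL pg_attribute;
VACUUM FULL pg_depend;

REINDEX TABLE pg_class;
REINDEX TABLE pg_attribute;
REINDEX TABLE pg_depend;
```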






[PERFORM] blue prints please

2005-10-25 Thread Sidar López Cruz

Where can I find best practices for tuning PostgreSQL?






[PERFORM] zero performance on query

2005-10-25 Thread Sidar López Cruz

What happened to PostgreSQL 8.1b4 performance on this query?
Please help me!

look at this:
select count(*) from fotos where archivo not in (select archivo from 
archivos)

Aggregate  (cost=4899037992.36..4899037992.37 rows=1 width=0)
  ->  Seq Scan on fotos  (cost=22598.78..4899037338.07 rows=261716 width=0)
        Filter: (NOT (subplan))
        SubPlan
          ->  Materialize  (cost=22598.78..39304.22 rows=805344 width=58)
                ->  Seq Scan on archivos  (cost=0.00..13141.44 rows=805344 width=58)


I WILL DIE WAITING FOR QUERY RESPONSE !!!
--
CREATE TABLE archivos ( archivo varchar(20)) WITHOUT OIDS;
CREATE INDEX archivos_archivo_idx  ON archivos  USING btree(archivo);
~80 rows
--
CREATE TABLE fotos
(
cedula varchar(20),
nombre varchar(100),
apellido1 varchar(100),
apellido2 varchar(100),
archivo varchar(20)
) WITHOUT OIDS;
CREATE INDEX fotos_archivo_idx  ON fotos  USING btree (archivo);
CREATE INDEX fotos_cedula_idx   ON fotos   USING btree (cedula);
~50 rows
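
A plan like the one above (a Materialize subplan re-scanned for every row
of fotos) is usually a sign that the NOT IN couldn't be hashed. An
anti-join rewrite is typically far faster; a sketch against the tables
defined above:

```sql
-- Equivalent to the NOT IN form only if archivos.archivo contains no
-- NULLs (NOT IN yields no rows at all when the subselect produces a NULL):
SELECT count(*)
FROM fotos f
LEFT JOIN archivos a ON a.archivo = f.archivo
WHERE a.archivo IS NULL;

-- Or, keeping the subquery form:
SELECT count(*)
FROM fotos f
WHERE NOT EXISTS (SELECT 1 FROM archivos a WHERE a.archivo = f.archivo);
```

Raising work_mem (sort_mem in older releases) may also let the planner
switch the original query to a hashed subplan.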



