Re: [PERFORM] Queries 15 times slower on 8.1 beta 2 than on 8.0

2005-09-22 Thread John Arbash Meinel
Jean-Pierre Pelletier wrote:
 Hi,
 
 I've got many queries running much slower on 8.1 beta2 than on 8.0.1
 Here is a simplified one that takes 484 ms on 8.1 and 32 ms on 8.0.1.
 
 select
   0
 from
   Content C
 
   left outer join Supplier S
   on  C.SupplierId = S.SupplierId
 
   left outer join Price P
   on C.PriceId = P.PriceId;
 
 Any ideas why it's slower?

You really need to post the results of EXPLAIN ANALYZE, not just EXPLAIN, so
that we can tell what the planner is expecting versus what really happened.

John
=:-

 
 Thanks
 Jean-Pierre Pelletier
 e-djuster
 




Re: [PERFORM] Queries 15 times slower on 8.1 beta 2 than on 8.0

2005-09-22 Thread John Arbash Meinel
Jean-Pierre Pelletier wrote:
 Here are the explain analyze:

What does EXPLAIN ANALYZE show if you first run SET enable_seqscan TO off?

Also, can you post the output of:
\d supplier
\d price
\d content

Mostly I just want to see what the indexes are, in the case that you
don't want to show us your schema.

John
=:-




Re: [PERFORM] extremly low memory usage

2005-08-18 Thread John Arbash Meinel
Jeremiah Jahn wrote:

here's an example standard query. I really have to make the first hit go
faster. The table is also clustered on full_name. 'Smith%'
took 87 seconds on the first hit. I wonder if I set up my array wrong.
I remember seeing something about DMA access versus something else, and
chose DMA access. LVM maybe?
  

It would be nice if you would format your queries to be a little bit
easier to read before posting them.
However, if I am reading it correctly, the index scan
on identity is not your slow point. In fact, as near as I can tell, it
only takes 52ms to complete.

The expensive parts are the 4915 lookups into the litigant_details (each
one takes approx 4ms for a total of ~20s).
And then you do it again on case_data (average 3ms each * 4906 loops =
~15s).

So there is no need for preloading your indexes on the identity table.
It is definitely not the bottleneck.

So a few design bits, which may help your database.
Why is actor_id a text field instead of a number?
You could try creating an index on litigant_details (actor_id,
court_ori) so that it can do just an index lookup, rather than an index
+ filter.

More importantly, though, the planner seems to think the join of
identity to litigant_details will only return 1 row, not 5000.
Do you regularly vacuum analyze your tables?
Just as a test, try running:
set enable_nestloop to off;
And then run EXPLAIN ANALYZE again, just to see if it is faster.

You probably need to increase some statistics targets, so that the
planner can design better plans.
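For example, something along these lines (untested; the column names are
taken from the plan below, and 100 is just a starting value for the
statistics target):

CREATE INDEX lit_actor_court_idx
    ON litigant_details (actor_id, court_ori);

ALTER TABLE identity ALTER COLUMN full_name SET STATISTICS 100;
ALTER TABLE litigant_details ALTER COLUMN actor_id SET STATISTICS 100;
ANALYZE identity;
ANALYZE litigant_details;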

  ->  Nested Loop  (cost=0.00..20411.83 rows=2 width=173) (actual time=12.891..38317.017 rows=4906 loops=1)
        ->  Nested Loop  (cost=0.00..20406.48 rows=1 width=159) (actual time=12.826..23232.106 rows=4906 loops=1)
              ->  Nested Loop  (cost=0.00..20403.18 rows=1 width=138) (actual time=12.751..22885.439 rows=4906 loops=1)
                    Join Filter: (("outer".case_id)::text = ("inner".case_id)::text)
                    ->  Index Scan using name_speed on identity  (cost=0.00..1042.34 rows=4868 width=82) (actual time=0.142..52.538 rows=4915 loops=1)
                          Index Cond: (((full_name)::text >= 'MILLER'::character varying) AND ((full_name)::text < 'MILLES'::character varying))
                          Filter: (((court_ori)::text = 'IL081025J'::text) AND ((full_name)::text ~~ 'MILLER%'::text))
                    ->  Index Scan using lit_actor_speed on litigant_details  (cost=0.00..3.96 rows=1 width=81) (actual time=4.631..4.635 rows=1 loops=4915)
                          Index Cond: (("outer".actor_id)::text = (litigant_details.actor_id)::text)
                          Filter: ('IL081025J'::text = (court_ori)::text)
              ->  Seq Scan on court  (cost=0.00..3.29 rows=1 width=33) (actual time=0.053..0.062 rows=1 loops=4906)
                    Filter: ('IL081025J'::text = (id)::text)
        ->  Index Scan using case_speed on case_data  (cost=0.00..5.29 rows=3 width=53) (actual time=3.049..3.058 rows=1 loops=4906)
              Index Cond: (('IL081025J'::text = (case_data.court_ori)::text) AND ((case_data.case_id)::text = ("outer".case_id)::text))


John
=:-





Re: [PERFORM] How many views is ok?

2005-08-14 Thread John Arbash Meinel
Petr Kavan wrote:

 I have a database of company data, and part of it is a table of
 information about employees. I need each employee to have access only
 to his own row. PostgreSQL cannot do this through its privilege system,
 because privileges can only be granted on whole tables.

 One possibility is to create a view for each employee that selects only
 his data and to give the employee privileges on this view. But I am not
 sure whether such a number of views has performance drawbacks, or even
 whether PostgreSQL can support it (I expect it can). I would need several
 tables protected like this, and it could result in, say, 1000 views at
 most.

 Because access to the DB will go through a PHP information system (IS),
 another possibility to protect the data is to let the IS connect as a
 more privileged user than the real user is, but have it retrieve only
 data for that user.

 The view approach seems far clearer than this, but I am not sure whether
 PostgreSQL can handle it without problems.

We do a similar thing, tying users to per-row permissions. We have 1 view
per table, and it works fine.
I would recommend that you do something similar. Basically, just make
the view:

CREATE VIEW just_me AS
    SELECT * FROM user_table WHERE username = session_user;
REVOKE ALL ON user_table FROM PUBLIC;
GRANT SELECT ON just_me TO PUBLIC;

A view's query runs with the privileges of the view's owner (here, a
superuser), much like a SECURITY DEFINER function, so the view can read
user_table even though the connecting user cannot.
The REVOKE prevents users from querying the user table directly.
'session_user' evaluates to the name of the actually connected user, so
the view only returns that user's rows.
This should allow a user to see only their own row(s) in the database
(whichever rows have username matching the connected name).
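For instance, with a hypothetical login role alice:

-- connected to the database as role 'alice':
SELECT * FROM just_me;
-- returns only the rows of user_table where username = 'alice'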

Now, this only works if the php front end connects specifically as the
given user (our system is setup to do this).

If you don't do it this way, you'll always be stuck with the IS layer
doing the restriction. Even if you create a view per user, if your PHP
layer has the right to look at other tables/views, it doesn't really help.

Good luck,
John
=:-


 Thanks for any reply :-)

 ---
 Petr Kavan
 Database Development









Re: [PERFORM] Performance pb vs SQLServer.

2005-08-14 Thread John Arbash Meinel
Stéphane COEZ wrote:

Hi,

I have a performance issue:

I run PG (8.0.3) and SQLServer2000 on a Windows2000 Server (P4 1.5GHz, 512MB RAM).
I have a table (3.2M rows) and I run this single query:

select cod from mytable group by cod
I have an index on cod (char(4) - 88 different values)

PG = ~ 20 sec.
SQLServer =  8 sec


the explain is :

HashAggregate  (cost=64410.09..64410.09 rows=55 width=8)
  ->  Seq Scan on mytable  (cost=0.00..56325.27 rows=3233927 width=8)


if I switch to enable_hashagg = false (just for a try...)
the planner will choose my index :

Group  (cost=0.00..76514.01 rows=55 width=8)
  ->  Index Scan using myindex on mytable  (cost=0.00..68429.20 rows=3233927 width=8)

but performance will be comparable to previous test.

So with or without using Index I have the same result.
  


My guess is that this is part of a larger query. There isn't really much
you can do. If you want all 3.2M rows, then you have to wait for them to
be pulled in.

What you generally can do for performance, is to restructure things, so
that you *don't* have to touch all 3.2M rows.
If you are just trying to determine what the unique entries are for cod,
you probably are better off doing some normalization, and keeping a
separate table of cod values.
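For example, a rough sketch (how you keep the lookup table current, via a
trigger or a periodic refresh, is up to you):

CREATE TABLE cod_values (
    cod char(4) PRIMARY KEY
);
-- one-time fill:
INSERT INTO cod_values SELECT DISTINCT cod FROM mytable;
-- afterwards the 88-value list is a trivial read:
SELECT cod FROM cod_values;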

I'm guessing the reason your query is faster with SQLServer is because
of how postgres handles MVCC. Basically, even when using an index,
postgres still has to fetch the heap page to determine whether a row
version is visible. SQL Server doesn't do MVCC this way, so it can
answer the query from the index alone.

You might also try a different query, something like:

SELECT DISTINCT cod FROM mytable GROUP BY cod ORDER BY cod;
(You may or may not want the ORDER BY or the GROUP BY; try the different
combinations.)
It might be possible for the planner to realize that all you want is
unique rows; just doing a GROUP BY doesn't tell it that.

John
=:-


Thanks for help.
 
Stéphane COEZ





  







Re: [PERFORM] Performance pb vs SQLServer.

2005-08-14 Thread John Arbash Meinel
Steinar H. Gunderson wrote:

On Sun, Aug 14, 2005 at 07:27:38PM -0500, John Arbash Meinel wrote:
  

My guess is that this is part of a larger query. There isn't really much
you can do. If you want all 3.2M rows, then you have to wait for them to
be pulled in.



To me, it looks like he'll get 88 rows, not 3.2M. Surely we must be able to
do something better than a full sequential scan in this case?

test=# create table foo ( bar char(4) );
CREATE TABLE
test=# insert into foo values ('');
INSERT 24773320 1
test=# insert into foo values ('');
INSERT 24773321 1
test=# insert into foo values ('');
INSERT 24773322 1
test=# select * from foo group by bar;
 bar  
--
 
 
(2 rows)

I considered doing some odd magic with generate_series() and subqueries with
LIMIT 1, but it was a bit too weird in the end :-)

/* Steinar */
  

I think a plain GROUP BY is not smart enough to detect it doesn't need
all rows (since it is generally used because you want to get aggregate
values of other columns).
I think you would want something like SELECT DISTINCT, possibly with an
ORDER BY rather than a GROUP BY (which was my final suggestion).
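For the test table above, that would be something like:

SELECT DISTINCT bar FROM foo ORDER BY bar;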

John
=:-





Re: [PERFORM] Planner incorrectly choosing seq scan over index scan

2005-08-01 Thread John Arbash Meinel
Meetesh Karia wrote:
 Hi all,
 
 We're using 8.0.3 and we're seeing a problem where the planner is
 choosing a seq scan and hash join over an index scan.  If I set
 enable_hashjoin to off, then I get the plan I'm expecting and the query
 runs a lot faster.  I've also tried lowering the random page cost (even
 to 1) but the planner still chooses to use the hash join.
 
 Does anyone have any thoughts/suggestions?  I saw that there was a
 thread recently in which the planner wasn't correctly estimating the
 cost for queries using LIMIT.  Is it possible that something similar is
 happening here (perhaps because of the sort) and that the patch Tom
 proposed would fix it?
 
 Thanks.  Here are the various queries and plans:
 
 Normal settings

...

 QUERY PLAN
 Sort  (cost=13430.57..13439.24 rows=3467 width=48) (actual time=1390.000..1390.000 rows=3467 loops=1)
   Sort Key: c.sourceid, c.targetid
   ->  Merge Join  (cost=9912.07..13226.72 rows=3467 width=48) (actual time=1344.000..1375.000 rows=3467 loops=1)
         Merge Cond: ("outer".user_id = "inner".sourceid)
         ->  Index Scan using lte_user_pkey on lte_user s  (cost=0.00..16837.71 rows=279395 width=16) (actual time=0.000..95.000 rows=50034 loops=1)

This is where the planner is messing up, and mis-estimating the
selectivity. It is expecting to get 280k rows, but only needs to get 50k.
I assume lte_user is the bigger table, and that candidates617004 has
some subset.

Have lte_user and candidates617004 been ANALYZEd recently? All estimates
except the expected number of rows from lte_user seem to be okay.

Is user_id the primary key for lte_user?
I'm trying to figure out how you can get 50k rows by searching a
primary key against 3.5k rows. Is user_id only part of the primary
key for lte_user?

Can you give us the output of:
\d lte_user
\d candidates617004

So that we have the description of the tables, and what indexes you have
defined?

Also, if you could describe the table layouts, that would help.
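And if the tables have not been ANALYZEd recently, it is worth doing that
before anything else:

ANALYZE lte_user;
ANALYZE candidates617004;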

John
=:-


 ->  Sort  (cost=9912.07..9920.73 rows=3467 width=40) (actual time=1156.000..1156.000 rows=3467 loops=1)
       Sort Key: c.sourceid
       ->  Hash Join  (cost=8710.44..9708.21 rows=3467 width=40) (actual time=1125.000..1156.000 rows=3467 loops=1)
             Hash Cond: ("outer".targetid = "inner".user_id)
             ->  Seq Scan on candidates617004 c  (cost=0.00..67.67 rows=3467 width=32) (actual time=0.000..0.000 rows=3467 loops=1)
             ->  Hash  (cost=8011.95..8011.95 rows=279395 width=16) (actual time=1125.000..1125.000 rows=0 loops=1)
                   ->  Seq Scan on lte_user t  (cost=0.00..8011.95 rows=279395 width=16) (actual time=0.000..670.000 rows=279395 loops=1)
 Total runtime: 1406.000 ms
 
 enable_hashjoin disabled
 
 QUERY PLAN
 Sort  (cost=14355.37..14364.03 rows=3467 width=48) (actual time=391.000..391.000 rows=3467 loops=1)
   Sort Key: c.sourceid, c.targetid
   ->  Nested Loop  (cost=271.52..14151.51 rows=3467 width=48) (actual time=203.000..359.000 rows=3467 loops=1)
         ->  Merge Join  (cost=271.52..3490.83 rows=3467 width=40) (actual time=203.000..218.000 rows=3467 loops=1)
               Merge Cond: ("outer".user_id = "inner".sourceid)
               ->  Index Scan using lte_user_pkey on lte_user s  (cost=0.00..16837.71 rows=279395 width=16) (actual time=0.000..126.000 rows=50034 loops=1)
               ->  Sort  (cost=271.52..280.19 rows=3467 width=32) (actual time=15.000..30.000 rows=3467 loops=1)
                     Sort Key: c.sourceid
                     ->  Seq Scan on candidates617004 c  (cost=0.00..67.67 rows=3467 width=32) (actual time=0.000..0.000 rows=3467 loops=1)
         ->  Index Scan using lte_user_pkey on lte_user t  (cost=0.00..3.03 rows=1 width=16) (actual time=0.031..0.036 rows=1 loops=3467)
               Index Cond: ("outer".targetid = t.user_id)
 Total runtime: 406.000 ms
 
 random_page_cost set to 1.5
 --
 QUERY PLAN
 Sort  (cost=12702.62..12711.29 rows=3467 width=48) (actual time=1407.000..1407.000 rows=3467 loops=1)
   Sort Key: c.sourceid, c.targetid
   ->  Merge Join  (cost=9912.07..12498.77 rows=3467 width=48) (actual time=1391.000..1407.000 rows=3467 loops=1)
         Merge Cond: ("outer".user_id = "inner".sourceid)
         ->  Index Scan using lte_user_pkey on lte_user s  (cost=0.00..12807.34 rows=279395 width=16) (actual time=0.000..46.000 rows=50034 loops=1)
         ->  Sort  (cost=9912.07..9920.73 rows=3467 width=40) (actual time=1188.000..1188.000 rows=3467 loops=1)
               Sort Key: c.sourceid
               ->  Hash Join  (cost=8710.44..9708.21 rows=3467 width=40) (actual time=1157.000..1188.000 rows=3467 loops=1)
                     Hash Cond: ("outer".targetid = "inner".user_id)
                     ->  Seq Scan on candidates617004 c  (cost=0.00..67.67 rows=3467 width=32) (actual time=0.000..15.000 rows=3467 loops=1)
  

Re: [PERFORM] Performance problems testing with Spamassassin 3.1.0

2005-07-31 Thread John Arbash Meinel
John Arbash Meinel wrote:

Matthew Schumacher wrote:

  

All it's doing is trying the update before the insert to get around the
problem of not knowing which is needed.  With only 2-3 of the queries
implemented I'm already back to running about the same speed as the
original SA proc that is going to ship with SA 3.1.0.

All of the queries are using indexes so at this point I'm pretty
convinced that the biggest problem is the sheer number of queries
required to run this proc 200 times for each email (once for each token).

I don't see anything that could be done to make this much faster on the
postgres end, it's looking like the solution is going to involve cutting
down the number of queries some how.

One thing that is still very puzzling to me is why this runs so much
slower when I put the data.sql in a transaction.  Obviously transactions
are acting different when you call a proc a zillion times vs an insert
query.
 



Well, I played with adding a COMMIT;BEGIN; statement to your exact test
every 1000 lines. And this is what I got:
  

Just for reference, I also tested this on my old server, which is a dual
Celeron 450 with 256M ram. FC4 and Postgres 8.0.3
Unmodified:
real    54m15.557s
user    0m24.328s
sys     0m14.200s

With Transactions every 1000 selects, and vacuum every 5000:
real    8m36.528s
user    0m16.585s
sys     0m12.569s

With Transactions every 1000 selects, and vacuum every 1:
real    7m50.748s
user    0m16.183s
sys     0m12.489s

On this machine vacuum is more expensive, since it doesn't have as much ram.

Anyway, on this machine, I see approx 7x improvement. Which I think is
probably going to satisfy your spamassassin needs.
John
=:-

PS Looking forward to having a spamassassin that can utilize my
favorite db. Right now, I'm not using a db backend because it wasn't
worth setting up mysql.

Unmodified:
real    17m53.587s
user    0m6.204s
sys     0m3.556s

With BEGIN/COMMIT:
real    1m53.466s
user    0m5.203s
sys     0m3.211s

So I see the potential for improvement almost 10 fold by switching to
transactions. I played with the perl script (and re-implemented it in
python), and for the same data as the perl script, using COPY instead of
INSERT INTO means 5s instead of 33s.

I also played around with adding VACUUM ANALYZE every 10 COMMITS, which
brings the speed to:

real    1m41.258s
user    0m5.394s
sys     0m3.212s

And doing VACUUM ANALYZE every 5 COMMITS makes it:
real    1m46.403s
user    0m5.597s
sys     0m3.244s

I'm assuming the slowdown is because of the extra time spent vacuuming.
Overall performance might still be improving, since you wouldn't
actually be inserting all 100k rows at once.
  

...

This is all run on Ubuntu, with postgres 7.4.7, and a completely
unchanged postgresql.conf. (But the machine is a dual P4 2.4GHz, with
3GB of RAM).

John
=:-
  






Re: [PERFORM] Performance problems testing with Spamassassin 3.1.0

2005-07-30 Thread John Arbash Meinel
Matthew Schumacher wrote:

All it's doing is trying the update before the insert to get around the
problem of not knowing which is needed.  With only 2-3 of the queries
implemented I'm already back to running about the same speed as the
original SA proc that is going to ship with SA 3.1.0.

All of the queries are using indexes so at this point I'm pretty
convinced that the biggest problem is the sheer number of queries
required to run this proc 200 times for each email (once for each token).

I don't see anything that could be done to make this much faster on the
postgres end, it's looking like the solution is going to involve cutting
down the number of queries some how.

One thing that is still very puzzling to me is why this runs so much
slower when I put the data.sql in a transaction.  Obviously transactions
are acting different when you call a proc a zillion times vs an insert
query.
  

Well, I played with adding a COMMIT;BEGIN; statement to your exact test
every 1000 lines. And this is what I got:

Unmodified:
real    17m53.587s
user    0m6.204s
sys     0m3.556s

With BEGIN/COMMIT:
real    1m53.466s
user    0m5.203s
sys     0m3.211s

So I see the potential for improvement almost 10 fold by switching to
transactions. I played with the perl script (and re-implemented it in
python), and for the same data as the perl script, using COPY instead of
INSERT INTO means 5s instead of 33s.
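The difference is roughly the one between these two load styles (the table
and column names here are made up for illustration, not SpamAssassin's
actual temp-table schema):

-- tens of thousands of individual statements:
INSERT INTO bayes_temp (token, spam_count, ham_count) VALUES ('abc', 1, 0);
-- versus one bulk load from a tab-separated file:
COPY bayes_temp (token, spam_count, ham_count) FROM '/tmp/tokens.tsv';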

I also played around with adding VACUUM ANALYZE every 10 COMMITS, which
brings the speed to:

real    1m41.258s
user    0m5.394s
sys     0m3.212s

And doing VACUUM ANALYZE every 5 COMMITS makes it:
real    1m46.403s
user    0m5.597s
sys     0m3.244s

I'm assuming the slowdown is because of the extra time spent vacuuming.
Overall performance might still be improving, since you wouldn't
actually be inserting all 100k rows at once.


Just to complete the reference, the perl version runs as:
10:44:02 -- START
10:44:35 -- AFTER TEMP LOAD : loaded 120596 records
10:44:39 -- AFTER bayes_token INSERT : inserted 49359 new records into
bayes_token
10:44:41 -- AFTER bayes_vars UPDATE : updated 1 records
10:46:42 -- AFTER bayes_token UPDATE : updated 47537 records
DONE

My python version runs as:
00:22:54 -- START
00:23:00 -- AFTER TEMP LOAD : loaded 120596 records
00:23:03 -- AFTER bayes_token INSERT : inserted 49359 new records into
bayes_token
00:23:06 -- AFTER bayes_vars UPDATE : updated 1 records
00:25:04 -- AFTER bayes_token UPDATE : updated 47537 records
DONE

The python is effectively just a port of the perl code (with many lines
virtually unchanged), and really the only performance difference is that
the initial data load is much faster with a COPY.

This is all run on Ubuntu, with postgres 7.4.7, and a completely
unchanged postgresql.conf. (But the machine is a dual P4 2.4GHz, with
3GB of RAM).

John
=:-

Anyway, if anyone else has any ideas I'm all ears, but at this point
it's looking like raw query speed is needed for this app and while I
don't care for mysql as a database, it does have the speed going for it.

schu


  






Re: [PERFORM] Performance problems testing with Spamassassin 3.1.0

2005-07-30 Thread John Arbash Meinel
Matthew Schumacher wrote:

Tom Lane wrote:

  

I looked into this a bit.  It seems that the problem when you wrap the
entire insertion series into one transaction is associated with the fact
that the test does so many successive updates of the single row in
bayes_vars.  (VACUUM VERBOSE at the end of the test shows it cleaning up
49383 dead versions of the one row.)  This is bad enough when it's in
separate transactions, but when it's in one transaction, none of those
dead row versions can be marked fully dead yet --- so for every update
of the row, the unique-key check has to visit every dead version to make
sure it's dead in the context of the current transaction.  This makes
the process O(N^2) in the number of updates per transaction.  Which is
bad enough if you just want to do one transaction per message, but it's
intolerable if you try to wrap the whole bulk-load scenario into one
transaction.

I'm not sure that we can do anything to make this a lot smarter, but
in any case, the real problem is to not do quite so many updates of
bayes_vars.

How constrained are you as to the format of the SQL generated by
SpamAssassin?  In particular, could you convert the commands generated
for a single message into a single statement?  I experimented with
passing all the tokens for a given message as a single bytea array,
as in the attached, and got almost a factor of 4 runtime reduction
on your test case.

BTW, it's possible that this is all just a startup-transient problem:
once the database has been reasonably well populated, one would expect
new tokens to be added infrequently, and so the number of updates to
bayes_vars ought to drop off.

  regards, tom lane




The SpamAssassin bayes code calls the _put_token method in the storage
module in a loop. This means that the storage module isn't called once per
message, but once per token.
  

Well, putting everything into a transaction per email might make your
pain go away.
If you saw the email I just sent, I modified your data.sql file to add a
COMMIT;BEGIN every 1000 selects, and I saw a performance jump from 18
minutes down to less than 2 minutes. Heck, on my machine, the advanced
perl version takes more than 2 minutes to run. It is actually slower
than the data.sql with commit statements.

I'll look into modifying it so that the bayes code passes a hash of
tokens to the storage module where they can loop, or, in the case of the
pgsql module, passes an array of tokens to a procedure where we loop and
use temp tables to make this much more efficient.
  

Well, you could do that. Or you could just have the bayes code issue
BEGIN; when it starts processing an email, and a COMMIT; when it
finishes. From my testing, you will see an enormous speed improvement.
(And you might consider including a fairly frequent VACUUM ANALYZE)
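In SQL terms, per message, something like this (using the table names that
appear earlier in the thread; the per-token statements are elided):

BEGIN;
-- all of the per-token SELECT/UPDATE/INSERT statements for one message
COMMIT;

-- and every N messages (pick N by experiment):
VACUUM ANALYZE bayes_vars;
VACUUM ANALYZE bayes_token;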

I don't have much time this weekend to toss at this, but will be looking
at it on Monday.
  

Good luck,
John
=:-

Thanks,

schu


  






Re: [PERFORM] Performance problems testing with Spamassassin 3.1.0

2005-07-29 Thread John Arbash Meinel
Josh Berkus wrote:

Dennis,

  

  EXCEPTION WHEN unique_violation THEN



I seem to remember that catching an exception in a PL/pgSQL procedure was a 
large performance cost.   It'd be better to do UPDATE ... IF NOT FOUND.

  

Actually, he was already doing an update-then-insert-if-not-found, in that
the procedure was structured as:

UPDATE ...;
IF FOUND THEN RETURN; END IF;
INSERT ...;
EXCEPTION WHEN unique_violation THEN ...

So really, the exception should never be triggered.
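A minimal sketch of that pattern, with made-up parameter names and a
simplified bayes_token schema (this is not the actual SpamAssassin
procedure):

CREATE OR REPLACE FUNCTION put_token(in_token bytea, in_count integer)
RETURNS void AS '
BEGIN
    UPDATE bayes_token
       SET spam_count = spam_count + in_count
     WHERE token = in_token;
    IF FOUND THEN
        RETURN;
    END IF;
    INSERT INTO bayes_token (token, spam_count)
        VALUES (in_token, in_count);
    RETURN;
EXCEPTION WHEN unique_violation THEN
    -- a concurrent insert beat us to it, so the row already exists
    RETURN;
END;
' LANGUAGE plpgsql;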
John
=:-





Re: [PERFORM] autovacuum suggestions for 500,000,000+ row tables?

2005-06-20 Thread John Arbash Meinel
Alex Stapleton wrote:


 On 20 Jun 2005, at 15:59, Jacques Caron wrote:

...

 ANALYZE is not a very expensive operation, however VACUUM can
 definitely be a big strain and take a long time on big tables,
 depending on your setup. I've found that partitioning tables (at the
 application level) can be quite helpful if you manage to keep each
 partition to a reasonable size (under or close to available memory),
 especially if the partitioning scheme is somehow time-related. YMMV.

 Jacques.


 That's not currently an option as it would require a pretty large
 amount of work to implement. I think we will have to keep that in
 mind though.

Remember, you can fake it with a low-level set of tables, and then wrap
them into a UNION ALL view.
So you get something like:

CREATE VIEW orig_table AS
SELECT * FROM table_2005_04
UNION ALL SELECT * FROM table_2005_05
UNION ALL SELECT * FROM table_2005_06
...
;

Then at least your individual operations are fast. As you insert, you
can create a rule that on insert into orig_table do instead ... insert
into table_2005_07 (or whatever the current table is).
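That rule could look something like this (the column list is hypothetical;
use the real one):

CREATE RULE orig_table_insert AS
    ON INSERT TO orig_table
    DO INSTEAD
    INSERT INTO table_2005_07 (ts, msg)
        VALUES (NEW.ts, NEW.msg);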
It takes a little bit of maintenance on the DB admin's part, since every
month they have to create a new table, and then update all of the views
and triggers. But it is pretty straightforward.
If you are doing append-only inserting, then you have the nice feature
that only the last table is ever modified, which means that the older
tables don't really need to be vacuumed or analyzed.
And even if you have to have each table modified as you go, you still
can break up a VACUUM into only doing one of the sub tables at a time.

I don't know your db schema, but I thought I would mention that while true
partitioning isn't implemented yet, you can still get something very
similar with views, triggers and rules.

John
=:-





Re: [PERFORM] slow queries, possibly disk io

2005-05-26 Thread John Arbash Meinel
Josh Close wrote:

I have some queries that have significantly slowed down in the last
couple days. It's gone from 10 seconds to over 2 mins.

The cpu has never gone over 35% in the servers lifetime, but the load
average is over 8.0 right now. I'm assuming this is probably due to
disk io.

I need some help setting up postgres so that it doesn't need to go to
disk. I think the shared_buffers and effective_cache_size values are
the one's I need to look at.

Would setting shmmax and shmall to 90% or so of available mem and
giving a lot of that to postgres be helpful?


Setting shared_buffers above something like 10-30% of memory is
counterproductive.

The documentation for effective_cache_size says this:
Sets the planner's assumption about the effective size of the disk
cache (that is, the portion of the kernel's disk cache that will be
used for PostgreSQL data files).

Does that mean the total available ram? Or what's left over from 
shared_buffers?

I've tried different things and not much has been working. Is there a
good way to ensure that most of the tables accessed in postgres will
be cached in mem and not have to go to disk?

If I'm joining a lot of tables, should the sort_mem be set high also?
Do shared_buffers, effective_cache_size, and sort_mem all use
different mem? Or are they seperate?



Increasing sort_mem can help with various activities, but increasing it
too much can cause you to swap, which kills performance. The caution is
that you will likely use at least 1 sort_mem per connection, and can
likely use more than one if the query is complicated.

effective_cache_size changes how Postgres plans queries, but given the
same query plan, it doesn't change performance at all.

I've looked for information and haven't found any useful pages about this.

Any help would be greatly appreciated.

Thanks.

-Josh



John
=:-





Re: [PERFORM] Optimize complex join to use where condition before

2005-05-13 Thread John Arbash Meinel
Sebastian Hennebrueder wrote:

 I found a solution to improve my query. I do not know why, but the
 statistics for all columns had been 0.
 I changed this to 10 for index columns and to 20, and later to 100, for
 foreign key columns.
 I set the random page cost to 2
 and now the query runs as expected.

 Many thanks to all of the posts in my and in other threads which
 helped a lot.

 Sebastian


I think 0 = use default. But still, changing to 20 and 100 probably
fixes your problems.

John
=:-





Re: [PERFORM] Any way to speed this up?

2005-04-07 Thread John Arbash Meinel
Joel Fradkin wrote:
Running this explain on a windows box, but production is on linux, both 8.0.1.
The MSSQL is beating me out for some reason on this query.
The linux box is much more powerful; I may have to increase the cache,
but I am pretty sure it's not an issue yet.
It has 8 gig internal memory; any recommendation on the cache size to use?

explain analyze select * from viwassoclist where clientnum = 'SAKS'

Merge Join  (cost=59871.79..60855.42 rows=7934 width=112) (actual time=46906.000..48217.000 rows=159959 loops=1)
The first thing I noticed was this. Notice that the estimated rows is
8k, the actual rows is 160k. Which means the planner is mis-estimating
the selectivity of your merge.
  ->  Sort  (cost=59478.03..59909.58 rows=172618 width=75) (actual time=46844.000..46985.000 rows=159960 loops=1)
        Sort Key: a.locationid
This sort actually isn't taking very long. It starts at 46800 and runs
until 47000, so it takes < 1 second.
        ->  Merge Right Join  (cost=0.00..39739.84 rows=172618 width=75) (actual time=250.000..43657.000 rows=176431 loops=1)
              Merge Cond: ((("outer".clientnum)::text = ("inner".clientnum)::text) AND ("outer".id = "inner".jobtitleid))
              ->  Index Scan using ix_tbljobtitle_id on tbljobtitle jt  (cost=0.00..194.63 rows=6391 width=37) (actual time=32.000..313.000 rows=5689 loops=1)
                    Filter: (1 = presentationid)
              ->  Index Scan using ix_tblassoc_jobtitleid on tblassociate a  (cost=0.00..38218.08 rows=172618 width=53) (actual time=31.000..41876.000 rows=176431 loops=1)
                    Index Cond: ((clientnum)::text = 'SAKS'::text)
This is where the actual expense is. The merge right join starts at 250,
and runs until 43000, which seems to be caused primarily by the index
scan of tblassociate. How many rows are in tblassociate? I'm assuming
quite a bit, since the planner thinks an index scan is faster than a seq
scan for 170k rows. (If you have > 2M rows, this is probably accurate.)
I don't really know how long this should take, but 38s for 172k rows
seems a little long.
John
=:-




Re: [PERFORM] Any way to speed this up?

2005-04-07 Thread John Arbash Meinel
Joel Fradkin wrote:
shared_buffers = 8000        # min 16, at least max_connections*2, 8KB each
work_mem = 8192              # 1024   # min 64, size in KB
max_fsm_pages = 3            # min max_fsm_relations*16, 6 bytes each
effective_cache_size = 4     # 1000   # typically 8KB each
random_page_cost = 1.2       # 4      # units are one sequential page fetch cost
These are the items I changed.
In the development box I turned random page cost to .2 because I figured it
would all be faster using an index as all my data is at a minimum being
selected by clientnum.

Your random page cost is *way* too low. I would probably change this
to no less than 2.0.
But the analyze I sent in is from these settings above on a windows box.
If I was running the analyze (pgadmin) on a windows box but connecting to a
linux box would the times be accurate or do I have to run the analyze on the
linux box for that to happen?

EXPLAIN ANALYZE is done on the server side, so it doesn't matter what
you use to connect to it. The \timing flag occurs on the local side, and
is thus influenced by network latency (but it only tells you the time
for the whole query anyway).
I am a little unclear why I would need an index on associate by location as
I thought it would be using indexes in location and jobtitle for their
joins.
I did not say where locationid = x in my query on the view.
I have so much to learn about SQL.
Joel
CREATE OR REPLACE VIEW viwassoclist AS
 SELECT a.clientnum, a.associateid, a.associatenum, a.lastname,
a.firstname, jt.value AS jobtitle, l.name AS location, l.locationid
AS mainlocationid, l.divisionid, l.regionid, l.districtid,
(a.lastname::text || ', '::text) || a.firstname::text AS assocname,
a.isactive, a.isdeleted
   FROM tblassociate a
   LEFT JOIN tbljobtitle jt ON a.jobtitleid = jt.id AND
jt.clientnum::text = a.clientnum::text AND 1 = jt.presentationid
   JOIN tbllocation l ON a.locationid = l.locationid AND
l.clientnum::text = a.clientnum::text;
  ^^^
The locationid is defined in your view. This is the part that postgres
uses to merge all of the different tables together; it doesn't really
matter whether you restrict it with a WHERE clause.
Try just setting your random page cost back to something more
reasonable, and try again.
John
=:-




Re: [PERFORM] Any way to speed this up?

2005-04-07 Thread John Arbash Meinel
Joel Fradkin wrote:
Here is the result after putting it back to 4, the original value (I had done
that prior to your suggestion of using 2 or 3), to see what might change.
I also vacuumed and thought I saw records deleted in associate, which I
found odd as this is a test site and no new records were added or deleted.

Well, that looks 2x as fast, right?
You might try
SET enable_mergejoin TO off;
Just to see if you can force a hash-join and see how long that takes.
You might also try increasing work_mem.
You can do that just in the current session with
SET work_mem TO <value>;
John
=:-




Re: [PERFORM] Dynamic query perormance

2005-03-30 Thread John Arbash Meinel
Keith Worthington wrote:
Hi All,
I am developing a simple set returning function as my first step towards more
complicated processes.  I would like to understand the implications of using
the dynamic query capability.
I have built two forms of an identically performing function.  The first uses
a simple IF-THEN-ELSIF-THEN-ELSE structure to choose which query to run.  The
second builds the query dynamically using the FOR-IN-EXECUTE structure and a
CASE statement.
The documentation
(http://www.postgresql.org/docs/8.0/interactive/plpgsql-control-structures.html#PLPGSQL-RECORDS-ITERATING)
indicates that a dynamic query (EXECUTE) is replanned for every LOOP iteration.
 This is like the previous form, except that the source
 SELECT statement is specified as a string expression,
 which is evaluated and replanned on each entry to the
 FOR loop. This allows the programmer to choose the speed
 of a preplanned query or the flexibility of a dynamic
 query, just as with a plain EXECUTE statement.
That seems like a potential performance problem.  I don't understand why the
query would be planned for every LOOP iteration when the LOOP is over the
record set.

Reading the documentation and looking at the example, I don't think
your query will be re-planned for each entry in the loop.
I think it will be planned each time the FOR loop is started.
If you had the EXECUTE *inside* the LOOP, then it would be re-planned
for each entry.
At least that is the case for a normal EXECUTE without any FOR loop:
each time the function is called, the statement is re-planned. Versus
without EXECUTE, where the plan is prepared once (the first time the
function runs in a session) and then cached.
I would guess that the FOR .. IN EXECUTE .. LOOP runs the EXECUTE one
time, and generates the results which it then loops over. Because that
is what FOR .. IN SELECT .. LOOP does (you don't re-evaluate the SELECT
for each item in the result set).
On the other hand, I don't know of any way to test this, unless you have
a query that you know takes a long time to plan, and can compare the
performance of FOR IN EXECUTE versus FOR IN SELECT.
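A minimal sketch of the two forms being compared (the table, columns and
function name are made up; the dollar quoting requires 8.0):

CREATE OR REPLACE FUNCTION list_items(which text) RETURNS SETOF record AS $$
DECLARE
    rec record;
BEGIN
    -- Form 1: plain FOR ... IN SELECT; the query is planned once and cached:
    FOR rec IN SELECT id, name FROM items WHERE kind = which LOOP
        RETURN NEXT rec;
    END LOOP;

    -- Form 2: FOR ... IN EXECUTE; the string is built and planned each time
    -- the FOR loop starts, not once per row of its result set:
    FOR rec IN EXECUTE 'SELECT id, name FROM items WHERE kind = '
                        || quote_literal(which) LOOP
        RETURN NEXT rec;
    END LOOP;
    RETURN;
END;
$$ LANGUAGE plpgsql;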
John
=:-
Your comments are appreciated.
Kind Regards,
Keith






Re: [PERFORM] fine tuning for logging server

2005-03-30 Thread John Arbash Meinel
Florin Andrei wrote:
Hardware: relatively modern Intel CPU, OS and database each on its own
IDE hard-drive (separate IDE cables). Enough memory, I think, but I
can't add too much (not beyond 1GB).
Software: Linux-2.6, pgsql-8.0.1
Function: Essentially a logging server. There are two applications (like
syslog) on the same box that are logging to pgsql, each one to its own
database. There are a few tables in one DB, and exactly one table in the
other.
Most of the time, the apps are just doing mindless INSERTs to the DB.
Every now and then, an admin performs some SELECTs via a PHP interface.
Objective: Make the DB as fast as possible. Of course I'd like the
SELECTs to be fast, but the INSERTs take precedence. It's gotta be able
to swallow as many messages per second as possible given the hardware.
Question: What are the pgsql parameters that need to be tweaked? What
are the guidelines for such a situation?

Put pg_xlog onto the same drive as the OS, not the drive with the database.
Do as many inserts per transaction as you can get away with.
100-1000 is pretty good.
Keep the number of indexes and foreign key references low to keep
INSERTS fast.
Keep a few indexes around to keep SELECTs reasonably speedy.
If you are doing lots and lots of logging, need only archival and slow
access for old data, but fast access on new data, consider partitioning
your table, and then using a view to join them back together.
If you only have a couple of processes accessing the db at any given
time, you can probably increase work_mem and maintenance_work_mem a bit.
If you have 1G ram, maybe around 50M for work_mem. But really this is
only if you have 1-3 selects going on at a time.
With 2 disks, and fixed hardware, it's a lot more about configuring your
schema and the application. If you want more performance, adding more
disks is probably the first thing to do.
John
=:-




Re: [PERFORM] fine tuning for logging server

2005-03-30 Thread John Arbash Meinel
Florin Andrei wrote:
On Wed, 2005-03-30 at 17:50 -0800, Florin Andrei wrote:

Function: Essentially a logging server. There are two applications (like
syslog) on the same box that are logging to pgsql, each one to its own
database. There are a few tables in one DB, and exactly one table in the
other.
Most of the time, the apps are just doing mindless INSERTs to the DB.
Every now and then, an admin performs some SELECTs via a PHP interface.

For performance reasons, i was thinking to keep the tables append-only,
and simply rotate them out every so often (daily?) and delete those
tables that are too old. Is that a good idea?

If you aren't doing updates, then I'm pretty sure the data stays packed
pretty well. I don't know that you need daily rotations, but you
certainly could consider some sort of rotation schedule.
The biggest performance improvement, though, is probably to group
inserts into transactions.
I had an application (in a different db, but it should be relevant),
where using a transaction changed the time from 6 min to 6 sec.
It was just thrashing on all the little inserts that it had to fsync to
disk.
How fast is fast? How many log messages are you expecting? 1/s 100/s 1000/s?
I think the hardware should be capable of the 10-100 range if things are
properly configured. Naturally that depends on all sorts of factors, but
it should give you an idea.
John
=:-




Re: [PERFORM] fine tuning for logging server

2005-03-30 Thread John Arbash Meinel
Florin Andrei wrote:
On Wed, 2005-03-30 at 19:59 -0600, John Arbash Meinel wrote:

Put pg_xlog onto the same drive as the OS, not the drive with the database.

I forgot to mention: the OS drive is purposefully made very slow - the
write cache is turned off and the FS is Ext3 with data=journal. Is then
still ok to put pg_xlog on it?
The reason: if the power cord is yanked, the OS _must_ boot back up in
good condition. If the DB is corrupted, whatever, nuke it then re-
initialize it. But the OS must survive act-of-god events.
No, there is no uninterruptible power supply. It sucks, but that's how
it is. I cannot change that.

You don't want a write cache for pg_xlog either. And you could always
create a second partition that uses reiserfs, or something like that.
If you have to survive acts of god, you probably should consider making
the system into a RAID1 instead of 2 separate drives (software RAID
should be fine).
'Cause a much worse act-of-god is having a drive crash. No matter what
you do in software, a failed platter will prevent you from booting. RAID
1 at least means 2 drives have to die.
If you need insert speed, and can't do custom transactions at the
application side, you could try creating a RAM disk for the insert
table, and then create a cron job that bulk pulls it out of that table
and inserts it into the rest of the system. That should let you get a
super-fast insert speed, and the bulk copies should stay reasonably fast.
Just realize that if your cron job stops running, your machine will
slowly eat up all of its ram, and really not play nice. I think adding
an extra hard-drive is probably the best way to boost performance and
reliability, but if you have a $0 budget, this is a possibility.
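The cron job itself could run something like this (table names are
hypothetical):

BEGIN;
-- on a busy system, restrict both statements to rows older than some
-- timestamp, so rows inserted between them are not lost
INSERT INTO log_main SELECT * FROM log_staging;
DELETE FROM log_staging;
COMMIT;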
John
=:-




Re: [PERFORM] fine tuning for logging server

2005-03-30 Thread John Arbash Meinel
Florin Andrei wrote:
On Wed, 2005-03-30 at 20:11 -0600, John Arbash Meinel wrote:

Florin Andrei wrote:

For performance reasons, i was thinking to keep the tables append-only,
and simply rotate them out every so often (daily?) and delete those
tables that are too old. Is that a good idea?

If you aren't doing updates, then I'm pretty sure the data stays packed
pretty well. I don't know that you need daily rotations, but you
certainly could consider some sort of rotation schedule.

(sorry for re-asking, I'm coming from a mysql mindset and I still have a
lot to learn about pgsql)
So, it is indeed a bad idea to delete rows from tables, right? Better
just rotate to preserve the performance.

The only problems are if you get a lot of old tuples in places you don't
want them. If you are always appending new values that are increasing,
and you are deleting from the other side, I think vacuum will do a fine
job at cleaning up. It's deleting/updating every 3rd entry that starts
to cause holes (though probably vacuum still does a pretty good job).
Daily rotation may simplify the application logic - then i'll know that
each table is one day's worth of data.

I don't think it is necessary, but if you like it, go for it. I would
tend to think that you would want a today table and an everything
else table, as it simplifies your queries and lets you have foreign
keys (though if you are coming from mysql, you may not be used to using them).
The biggest performance improvement, though, is probably to group
inserts into transactions.

Yes, i know that. I have little control over the apps, though. I'll see
what i can do.

You could always add a layer in between. Or look at my mention of a fast
temp table, with a periodic cron job to pull in the new data. You can
run cron as often as once a minute, which might be just right depending on
your needs.
It also means that you could ignore foreign keys and indexes on the temp
table, and only evaluate them on the main table.

How fast is fast? How many log messages are you expecting? 1/s 100/s 1000/s?

More is better. (shrug)
I guess i'll put it together and give it a spin and see just how far it
goes.
I actually have some controls over the data that's being sent (in some
places i can limit the number of events/second), so that might save me
right there.

Good luck. And remember, tuning your queries can be just as important.
(Though if you are doing append only inserts, there probably isn't much
that you can do).
If all you are doing is append only logging, the fastest thing is
probably just a flat file. You could have something that comes along
later to move it into the database. It doesn't really sound like you are
using any features a database provides. (normalization, foreign keys,
indexes, etc.)
John
=:-




Re: [PERFORM] Building a DB with performance in mind

2005-03-17 Thread John Arbash Meinel
Alexander Ranaldi wrote:
Greetings everyone,
I am about to migrate to Postgres from MySQL. My DB isn't enormous
(< 1GB), consists mostly of just text, but is accessed quite heavily.
Because size isn't a huge issue, but performance is, I am willing to
normalize as necessary.
Currently I have a table Entries containing 500k rows. The table
contains many text columns, and a few others:
EntryID (unique, indexed)
UserID (references Users table, indexed)
Private (boolean. indexed)
Most of my queries return rows based on UserID, and also only if
Private is FALSE. Would it be in the interest of best performance to
split this table into two tables: EntriesPrivate,
EntriesNotPrivate and remove the Private column?

Perhaps. You might also consider creating a multi-column index on
(UserID, Private).
However, in a more conceptual idea, separating the tables may help you
with preventing accidental queries. It's pretty easy to forget to add
... AND Private = False. It is much harder to accidentally add ...
JOIN EntriesPrivate ON ...
I appreciate any feedback. I'm certainly not a DB design expert. :)

It shouldn't be very hard to test which one works better for you:
\timing
CREATE INDEX entries_user_private_idx ON Entries(UserID, Private);
SELECT * FROM Entries WHERE ... AND Private = False;
CREATE TABLE EntriesPrivate AS SELECT * FROM Entries WHERE Private=True;
CREATE TABLE EntriesPublic AS SELECT * FROM Entries WHERE Private=False;
ALTER TABLE EntriesPrivate DROP COLUMN Private;
ALTER TABLE EntriesPrivate ADD PRIMARY KEY (EntryID);
ALTER TABLE EntriesPrivate ALTER COLUMN EntryID
    SET DEFAULT nextval('Entries_...EntryId');
-- Make sure you don't have duplicate EntryID values across the two tables.
-- A CHECK constraint cannot contain a subquery, so the statement below is
-- only conceptual; in practice, keep sharing the original sequence (as
-- above) or use a foreign key to a common entries table:
--   ALTER TABLE EntriesPrivate ADD CONSTRAINT entryid_unique
--       CHECK (EntryID NOT IN (SELECT EntryID FROM EntriesPublic));
CREATE INDEX entriesprivate_userid_idx ON EntriesPrivate(UserID);
-- Do the same thing for EntriesPublic
ALTER TABLE EntriesPublic DROP COLUMN Private;
These queries have not been tested, but they should give you a decent
starting point to creating 2 tables, and running a bunch of test queries
on them.
I think the biggest difficulty is making sure that you don't get
duplicate EntryID values, assuming that is important to you.
Also, if you have foreign key references, this won't work. You'll have
to create a new table (it can have just 1 column) containing EntryID,
and then you can reference that column from both of these tables.
John
=:-
Thanks,
Alex






Re: [PERFORM] How to read query plan

2005-03-14 Thread John Arbash Meinel
Miroslav Šulc wrote:
Tom Lane wrote:
...
I think the reason this is popping to the top of the runtime is that the
joins are so wide (an average of ~85 columns in a join tuple according
to the numbers above).  Because there are lots of variable-width columns
involved, most of the time the fast path for field access doesn't apply
and we end up going to nocachegetattr --- which itself is going to be
slow because it has to scan over so many columns.  So the cost is
roughly O(N^2) in the number of columns.

As there are a lot of varchar(1) columns in the AdDevicesSites table,
wouldn't it be helpful to change them to char(1)? Would it solve the
variable-width problem at least for some fields and speed the query up?
I'm guessing there really wouldn't be a difference. I think varchar()
and char() are stored the same way, just one always has space padding. I
believe they are both varlena types, so they are still variable length.
As a short-term hack, you might be able to improve matters if you can
reorder your LEFT JOINs to have the minimum number of columns
propagating up from the earlier join steps.  In other words make the
later joins add more columns than the earlier, as much as you can.

That will be hard as the main table which contains most of the fields
is LEFT JOINed with the others. I'll look at it if I find some way to
improve it.
One thing that you could try, is to select just the primary keys from
the main table, and then later on, join back to that table to get the
rest of the columns. It is a little bit hackish, but if it makes your
query faster, you might want to try it.
I'm not sure whether I understand the process of performing the plan
but I imagine that the data from AdDevicesSites are retrieved only
once when they are loaded and maybe stored in memory. Are the columns
stored in the order they are in the SQL command? If so, wouldn't it be
useful to move all varchar fields at the end of the SELECT query? I'm
just guessing because I don't know at all how a database server is
implemented and what it really does.
I don't think they are stored in the order of the SELECT portion. I'm
guessing they are loaded and saved as you go. But the order of the
LEFT JOINs at the end is probably important.
..
regards, tom lane

Miroslav

John
=:-




Re: [PERFORM] How to read query plan

2005-03-14 Thread John Arbash Meinel
Miroslav Šulc wrote:
PFC wrote:
Your query seems of the form :
SELECT FROM main_table LEFT JOIN a lot of tables ORDER BY
sort_key LIMIT  N OFFSET M;
I would suggest rewriting it in a simpler way: instead of
generating the whole result set, sorting it, and then grabbing a
slice, generate only the row ids, grab a slice, and then generate
the full rows from that.
- If you order by a field which is in main_table :
SELECT FROM main_table LEFT JOIN a lot of tables WHERE
main_table.id IN  (SELECT id FROM main_table ORDER BY sort_key LIMIT
N OFFSET M
) ORDER BY sort_key LIMIT N OFFSET M;
- If you order by a field in one of the child tables, I guess you
only  want to display the rows in the main table which have this
field, ie.  not-null in the LEFT JOIN. You can also use the principle
above.
- You can use a straight join instead of an IN.

Do you mean something like this?
SELECT Table.IDPK, Table2.varchar1, Table2.varchar2, ...
FROM Table
LEFT JOIN many tables
INNER JOIN Table AS Table2
Miroslav
I would also recommend using the subselect format. Where any columns
that you are going to need to sort on show up in the subselect.
So you would have:
SELECT ...
   FROM main_table
   LEFT JOIN tablea ON ...
   LEFT JOIN tableb ON ...
   ...
   JOIN other_table ON ...
   WHERE main_table.idpk IN
  (SELECT idpk
   FROM main_table JOIN other_table ON main_table.idpk =
other_table.main_idpk
   WHERE ...
   ORDER BY other_table.abcd LIMIT n OFFSET m)
;
I think the final LIMIT + OFFSET would give you the wrong results, since
you have already filtered out the important rows.
I also think you don't need the final order by, since the results should
already be in sorted order.
Now this also assumes that if someone is sorting on a row, then they
don't want null entries. If they do, then you can change the subselect
into a left join. But with appropriate selectivity and indexes, an inner
join can filter out a lot of rows, and give you better performance.
The inner subselect gives you selectivity on the main table, so that you
don't have to deal with all the columns in the search, and then you
don't have to deal with all the rows later on.
I think you can also do this:
SELECT ...
   FROM (SELECT main_table.idpk, other_table.columns FROM main_table
JOIN other_table ) as p
   LEFT JOIN ...
   JOIN main_table ON main_table.idpk = p.idpk;
In that case instead of selecting out the id and putting that into the
where, you put it in the from, and then join against it.
I don't really know which is better.
John
=:-




Re: [PERFORM] How to read query plan

2005-03-13 Thread John Arbash Meinel
Miroslav Šulc wrote:
Hi all,
I am new to PostgreSQL and query optimizations. We have recently moved
our project from MySQL to PostgreSQL and we are having performance
problem with one of our most often used queries. On MySQL the speed
was sufficient but PostgreSQL chooses time expensive query plan. I
would like to optimize it somehow but the query plan from EXPLAIN
ANALYZE is little bit cryptic to me.
So the first thing I would like is to understand the query plan. I
have read performance tips and FAQ but it didn't move me too much
further.
I would appreciate if someone could help me to understand the query
plan and what are the possible general options I can test. I think at
this moment the most expensive part is the Sort. Am I right? If so,
how could I generally avoid it (turning something on or off, using
parentheses for JOINs etc.) to force some more efficient query plan?
Thank you for any suggestions.
You really need to post the original query, so we can see *why* postgres
thinks it needs to run the plan this way.
Also, the final sort actually isn't that expensive.
When you look at the numbers (cost=xxx..yyy), xxx is the estimated cost
before the step can return its first row, and yyy is the estimated cost
for the step to finish. A lot of steps can start running while their
sub-steps are still feeding back more data; others have to wait for the
sub-steps to finish.
The first thing to look for, is to make sure the estimated number of
rows is close to the actual number of rows. If they are off, then
postgres may be mis-estimating the optimal plan. (If postgres thinks it
is going to only need 10 rows, it may use an index scan, but when 1000
rows are returned, a seq scan might have been faster.)
You seem to be doing a lot of outer joins. Is that necessary? I don't
really know what you are looking for, but you are joining against enough
tables, that I think this query is always going to be slow.
From what I can tell, you have 1 table which has 6364 rows, and you are
grabbing all of those rows, and then outer joining it with about 11
other tables.
I would actually guess that the most expensive parts of the plan are the
NESTED LOOPS which when they go to materialize have to do a sequential
scan, and they get executed 6364 times. It looks like the other tables
are small (only 3-5 rows), so it takes about 0.05 ms for each seqscan,
the problem is that because you are doing it 6k times, it ends up taking
about 300ms of your time.
You could try setting enable_nestloop to off.
I don't know that it will be faster, but it could be.
In general, though, it seems like you should be asking a different
question, rather than trying to optimize the query that you have.
Can you post the original SQL statement, and maybe describe what you are
trying to do?
John
=:-




Re: [PERFORM] How to read query plan

2005-03-13 Thread John Arbash Meinel
Miroslav Šulc wrote:
Hi John,
thank you for your response.
How about a quick side track.
Have you played around with your shared_buffers, maintenance_work_mem,
and work_mem settings?
What version of postgres are you using? The above names changed in 8.0,
and 8.0 also has some performance improvements over the 7.4 series.
What is your hardware? Are you testing this while there is load on the
system, or under no load.
Are you re-running the query multiple times, and reporting the later
speeds, or just the first time? (If nothing is loaded into memory, the
first run is easily 10x slower than later ones.)
Just some background info. If you have set these to reasonable values,
we probably don't need to spend much time here, but it's just one of
those things to check.
John
=:-




Re: [PERFORM] How to read query plan

2005-03-13 Thread John Arbash Meinel
Miroslav Šulc wrote:
Hi John,
thank you for your response.
I will comment on things separately.
John Arbash Meinel wrote:
...
These external tables contain information that is a unique parameter
of the AdDevice (like Position, Region, County, City etc.), some of them
containing a localized description of the property attribute. Some of
them could be moved into the main table but that would create a
redundancy, some of them cannot be moved into the main table (like
information about Partners which is definitely another object with
respect to AdDevices). I think the names of the tables are
self-explanatory so it should be clear what each table stores. Is this
design incorrect?
It's actually more of a question as to why you are doing left outer
joins, rather than simple joins.
Are the tables not fully populated? If so, why not?
How are you using this information? Why is it useful to get back rows
that don't have all of their information filled out?
Why is it useful to have so many columns returned? It seems like in most
cases you are only going to be able to use *some* of the information, so
why not create more queries that are specialized, rather than one "get
everything" query.
In fact, we only need about 30 records at a time but LIMIT can
speed-up the query only when looking for the first 30 records. Setting
OFFSET slows the query down.
Have you thought about using a cursor instead of using limit + offset?
This may not help the overall time, but it might let you split up when
the time is spent.
BEGIN;
DECLARE cursor_name CURSOR FOR SELECT ... FROM ...;
FETCH FORWARD 30 FROM cursor_name;
FETCH FORWARD 30 FROM cursor_name;
...
END;
I don't
really know what you are looking for, but you are joining against enough
tables, that I think this query is always going to be slow.

In MySQL the query was not so slow and I don't see any reason why
there should be large differences in SELECT speed. But if the design
of the tables is incorrect, we will correct it.
In the other post I asked about your postgres settings. The defaults are
pretty stingy, so that *might* be an issue.
From what I can tell, you have 1 table which has 6364 rows, and you are
grabbing all of those rows, and then outer joining it with about 11
other tables.

Here are the exact numbers:
AdDevicesSites - 6364
AdDevicesSiteRegions - 15
AdDevicesSiteCounties - 110
AdDevicesSiteCities - 124
AdDevicesSiteStreets - 2858
AdDevicesSiteStreetDescriptions - 4585
AdDevicesSiteDistricts - 344
AdDevicesSiteSizes - 110
AdDevicesSiteVisibilities - 4
AdDevicesSitePositions - 3
AdDevicesSiteStatusTypes - 5
PartnerIdentifications - 61
Partners - 61
CadastralUnits - 13027
And if I understand correctly, you consider all of these to be outer
joins. Meaning you want *all* of AdDevicesSites, and whatever info goes
along with it, but there are no restrictions as to what rows you want.
You want everything you can get.
Do you actually need *everything*? You mention only needing 30, what for?
I would actually guess that the most expensive parts of the plan are the
NESTED LOOPS which when they go to materialize have to do a sequential
scan, and they get executed 6364 times. It looks like the other tables
are small (only 3-5 rows), so it takes about 0.05 ms for each seqscan,
the problem is that because you are doing it 6k times, it ends up taking
about 300ms of your time.
You could try setting set enable_nestloop to off.
I don't know that it will be faster, but it could be.

I have tried that and it resulted in about 2 sec slowdown :-(
Generally, the optimizer *does* select the best query plan, as long as
it has accurate statistics, which it seems to have in this case.

In general, though, it seems like you should be asking a different
question, rather than trying to optimize the query that you have.

You mean how should I improve the design to make the query faster?
There is one possibility if we don't find anything nicer, which is to
create a lazy materialized view. Basically, you run this query and
store the result in a table. Then when you want to do the SELECT, you
just do that against the unrolled table.
You can then create triggers, etc to keep the data up to date.
Here is a good documentation of it:
http://jonathangardner.net/PostgreSQL/materialized_views/matviews.html
It is basically a way that you can un-normalize data, in a safe way.
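To make that concrete, a minimal sketch might look like the following.
All table and column names here are hypothetical stand-ins for your real
schema, and the trigger side is only hinted at (the page above covers it
properly):

CREATE TABLE addevices_flat AS
SELECT s.id, s.name, r.name AS region_name   -- plus the other joined columns
FROM addevicessites s
LEFT JOIN addevicessiteregions r ON s.region_id = r.id;

CREATE INDEX addevices_flat_id_idx ON addevices_flat (id);
ANALYZE addevices_flat;

-- Keep it current either by re-running the statements above from time to
-- time, or with triggers on the base tables that update the matching row.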
Also, another thing that you can do, is instead of using a cursor, you
can create a temporary table with the results of the query, and create a
primary key which is just a simple counter. Then instead of doing limit
+ offset, you can select * where id > 0 and id <= 30; ... select * where
id > 30 and id <= 60; etc.
It still requires the original query to be run, though, so it is not
necessarily optimal for you.
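A rough sketch of that idea, again with hypothetical names (the serial
column numbers the rows in the sort order, so a page is just a range):

CREATE TEMP TABLE page_cache (
    pos     serial,
    site_id integer
);

INSERT INTO page_cache (site_id)
SELECT s.id
FROM addevicessites s
ORDER BY s.name;          -- whatever sort key the user picked

SELECT * FROM page_cache WHERE pos > 0  AND pos <= 30 ORDER BY pos;  -- page 1
SELECT * FROM page_cache WHERE pos > 30 AND pos <= 60 ORDER BY pos;  -- page 2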
Can you post the original SQL statement, and maybe describe what you are
trying to do?

I hope the explanation above is clear and sufficient :-)
John
=:-

Unfortunately, I don't really see any obvious problems with your query
in the way that you are using

Re: [PERFORM] How to read query plan

2005-03-13 Thread John Arbash Meinel
Miroslav Šulc wrote:
John Arbash Meinel wrote:
...
Many of the columns are just varchar(1) (because of the migration from
the MySQL enum field type), so the record is not as long as it might seem.
These fields are just switches (Y(es) or N(o)). The problem is users
can define their own templates, and in different scenarios different
information might be displayed, so reducing the number of fields
would mean in some cases it wouldn't work as expected. But if we
can't speed the query up, we will try to improve it another way.
Is there any serious reason not to use so many fields except memory
usage? It seems to me that it shouldn't have a great impact on the
speed in this case.
Is there a reason to use varchar(1) instead of char(1)? There probably
is 0 performance difference, I'm just curious.

Have you thought about using a cursor instead of using limit + offset?
This may not help the overall time, but it might let you split up when
the time is spent.
..

No. I come from the MySQL world where these things are not common (at
least when using MyISAM databases). The other reason (if I understand
it well) is that the retrieval of the packages of 30 records is not
sequential. Our app is web based and we use paging. A user can select
page 1 and then page 10, then go backward to page 9, etc.
Well, with cursors you can also do FETCH ABSOLUTE 1 FROM
cursor_name, which sets the cursor position, and then you can FETCH
FORWARD 30.
I honestly don't know how the performance will be, but it is something
that you could try.
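A sketch of what that could look like inside one transaction (names are
hypothetical; I use MOVE rather than FETCH ABSOLUTE so the positioning row
itself isn't returned):

BEGIN;

DECLARE site_cur SCROLL CURSOR FOR
    SELECT s.id, s.name
    FROM addevicessites s
    ORDER BY s.name;

-- jump to page 10 (rows 271..300): position on row 270, then read 30 rows
MOVE ABSOLUTE 270 IN site_cur;
FETCH FORWARD 30 FROM site_cur;

CLOSE site_cur;
COMMIT;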
And if I understand correctly, you consider all of these to be outer
joins. Meaning you want *all* of AdDevicesSites, and whatever info goes
along with it, but there are no restrictions as to what rows you want.
You want everything you can get.
Do you actually need *everything*? You mention only needing 30, what
for?

For display of a single page consisting of 30 rows. The reason I query
all rows is that this is one of the filters users can use. A user can
display just bigboards or billboards (or specify more advanced
filters) but he/she can also display AdDevices without any filter
(page by page). Before I select the 30 rows, I need to order them by a
key and only then select the records, so this is also why I ask
for all rows. The key for sorting might be different for each run.
How are you caching the information in the background in order to
support paging? Since you aren't using limit/offset, and you don't seem
to be creating a temporary table, I assume you have a layer in between
the web server and the database (or possibly inside the webserver) which
keeps track of current session information. Is that true?
These might be the other steps in case we cannot speed-up the query. I
would prefer to speed the query up :-)
Naturally a fast query comes first. I just have the feeling it is either a
postgres configuration problem, or a problem intrinsic to postgres.
Given your constraints, there's not much that we can change about the
query itself.
In fact, on MySQL I didn't see any slow reactions so I didn't measure
and inspect it. But I can try it if I figure out how to copy the
database from PostgreSQL to MySQL.
I figured you still had a copy of the MySQL around to compare to. You
probably don't need to spend too much time on it yet.
John
=:-




Re: [PERFORM] What's better: Raid 0 or disk for separate pg_xlog

2005-03-10 Thread John Arbash Meinel
Karim Nassar wrote:
Thanks to all for the tips.
 

...
In general I would recommend RAID1, because that is the safe bet. If 
your db is the bottleneck, and your data isn't all that critical, and 
you are read heavy, I would probably go with RAID1, if you are write
                                             ^ - RAID0
heavy I would say 2 independent disks.
I feel that we have enough data safety such that I want to go for speed.
Some of the queries are very large joins, and I am going for pure
throughput at this point - unless someone can find a hole in my backup
tactic.
Of course, later we will have money to throw at more spindles. But for
now, I am trying to gaze into the future and maximize my current
capabilities.
Seems to me that the best solution would be:
* disk 0 partition 1..n - os mounts
partition n+1  - /var/lib/postgres/data/pg_xlog
* disk 1 partition 1- /var/lib/postgres/data
* Further (safe) performance gains can be had by adding more spindles as
such: 
- first disk: RAID1 to disk 1
- next 2 disks: RAID 0 across the above
 

Sounds decent to me.
I did make the mistake of suggesting that you might want to consider RAID0.
But the performance gains might be small, and you could potentially lose
everything.
But your update strategy seems dead on.

Do I grok it?
Thanks again,
 

John
=:-




Re: [PERFORM] Help with tuning this query (with explain analyze finally)

2005-03-04 Thread John Arbash Meinel
Ken wrote:
Richard,
What do you mean by summary table?  Basically a cache of the query
into a table with replicated column names of all the joins?  I'd
probably have to wipe out the table every minute and re-insert the
data for each carrier in the system.  I'm not sure how expensive this
operation would be, but I'm guessing it would be fairly heavy-weight.
And maintenance would be a lot harder because of the duplicated
columns, making refactorings on the database more error-prone.  Am I
understanding your suggestion correctly? Please correct me if I am.
Can you turn the problem around? Calculate what you want for all
users (once every 60 seconds) and stuff those results into a summary
table. Then let the users query the summary table as often as they
like (with the understanding that the figures aren't going to update
any faster than once a minute)

It's the same idea as a materialized view, or possibly just a lazy cache.
Just try this query:
CREATE TABLE cachedview AS
select p.id as person_id, s.*, ss.*
from shipment s
inner join shipment_status ss on s.current_status_id=ss.id
inner join release_code rc on ss.release_code_id=rc.id
left outer join driver d on s.driver_id=d.id
left outer join carrier_code cc on s.carrier_code_id=cc.id
where s.carrier_code_id in (
select cc.id
from person p
 inner join carrier_to_person ctp on p.id=ctp.person_id
 inner join carrier c on ctp.carrier_id=c.id
 inner join carrier_code cc on cc.carrier_id = c.id
)
and s.current_status_id is not null
and s.is_purged=false
and (rc.number = '9')
and (ss.date >= current_date - 31)
order by ss.date desc ;
Notice that I took out the internal p.id = blah.
Then you can do:
CREATE INDEX cachedview_person_id_idx ON cachedview(person_id);
Then from the client side, you can just run:
SELECT * from cachedview WHERE person_id = <id>;
Now, this assumes that rc.number='9' is what you always want. If that
isn't the case, you could refactor a little bit.
This unrolls all of the work into a table which should be really fast to
query. If this query takes less than 10s to generate, then just have a
service run it every 60s. I think for refreshing, it is actually faster
to drop the table and recreate it, rather than deleting the entries.
Dropping also has the advantage that if you ever add more rows to s or
ss, then the table automatically gets the new entries.
Another possibility is to have the cachedview not use "s.*, ss.*",
but instead just include whatever the primary keys are for those tables.
Then your final query becomes:
SELECT s.*, ss.* FROM cachedview cv, shipment s, shipment_status ss
WHERE cv.person_id = <id> AND cv.s_id = s.pkey AND cv.ss_id = ss.pkey;
Again, this should be really fast, because you should have an index on
cv.person_id and only have say 300 rows there, and then you are just
fetching a few rows from s and ss. You can also use this time to do some
of your left joins against other tables.
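For the refresh itself, a minimal sketch of the drop-and-recreate approach
using the keys-only variant might look like this. I'm assuming the primary
keys are simply shipment.id and shipment_status.id, and the join/filter
logic should of course mirror the full query above:

BEGIN;
DROP TABLE cachedview;
CREATE TABLE cachedview AS
SELECT p.id AS person_id, s.id AS shipment_id, ss.id AS shipment_status_id
FROM shipment s
JOIN shipment_status ss    ON s.current_status_id = ss.id
JOIN release_code rc       ON ss.release_code_id = rc.id
JOIN carrier_code cc       ON s.carrier_code_id = cc.id
JOIN carrier c             ON cc.carrier_id = c.id
JOIN carrier_to_person ctp ON ctp.carrier_id = c.id
JOIN person p              ON p.id = ctp.person_id
WHERE s.is_purged = false
  AND rc.number = '9'
  AND ss.date >= current_date - 31;
CREATE INDEX cachedview_person_id_idx ON cachedview (person_id);
ANALYZE cachedview;
COMMIT;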
Does this make sense? The biggest advantage you have is your 60s
statement. With that in hand, I think you can do a lot of caching
optimizations.
John
=:-




Re: [PERFORM] Help with tuning this query (with explain analyze finally)

2005-03-04 Thread John Arbash Meinel
Ken Egervari wrote:
Josh,
...
I thought about this, but it's very important since shipment and
shipment_status are both updated in real time 24/7/365.  I think I
might be able to cache it within the application for 60 seconds at
most, but it would make little difference since people tend to refresh
within that time anyway. It's very important that real-time
information exists though.
Is 60s real-time enough for you? That's what it sounds like. It would be
nice if you could have 1hr, but there's still a lot of extra work you
can do in 60s.
You could also always throw more hardware at it. :) If the
shipment_status is one of the bottlenecks, create a 4-disk raid10 and
move the table over.
I don't remember what your hardware is, but I don't remember it being a
quad opteron with 16GB ram, and 20 15k SCSI disks, with the transaction
log on a solid state disk. :)

That sounds like an awesome system.  I would love to have something like
that. Unfortunately, the production server is just a single processor
machine with 1 GB ram.  I think throwing more disks at it is probably
the best bet, moving the shipment and shipment_status tables over as
you suggested. That's great advice.
Well, disk I/O is one side, but probably sticking another 1GB (2GB
total) also would be a fairly economical upgrade for performance.
You are looking for query performance, not really update performance,
right? So buy a 4-port SATA controller, and some WD Raptor 10k SATA
disks. With this you can create a RAID10 for < $2k (probably like $1k).
30ms is a good target, although I guess I was naive for setting that
goal perhaps.  I've just taken queries that ran at 600ms and with 1 or
2 indexes, they went down to 15ms.
It all depends on your query. If you have a giant table (1M rows), and
you are doing a seqscan for only 5 rows, then adding an index will give
you enormous productivity gains. But you are getting 30k rows, and
combining them with 6k rows, plus a bunch of other stuff. I think we've
tuned the query about as far as we can.
Let's say we have 200 users signed into the application at the same
time. The application refreshes their shipment information
automatically to make sure it's up to date on the user's screen.  The
application will execute the query we are trying to tune every 60
seconds for most of these users.  Users can set the refresh time to be
higher, but 60 is the lowest amount so I'm just assuming everyone has
it at 60.
Anyway, if you have 200 users logged in, that's 200 queries in the 60
second period, which is about 3-4 queries every second.  As you can
see, it's getting maxed out, and because of bad luck, the queries are
bunched together and are being called at the same time, making 8-9
queries in the same second and that's where the performance is
starting to degrade.  I just know that if I could get this down to 30
ms, or even 100, we'd be okay for a few months without throwing
hardware at the problem.   Also keep in mind that other application
logic and Hibernate mapping is occurring too, so 3-4 queries a second is
already no good when everything is running on a single machine.
The other query I just sent, where you do the query for all users at
once, and then cache the result, *might* be cheaper than doing a bunch
of different queries.
However, you may find that doing the query for *all* users takes too
long. So you could keep another table indicating who the most recent
people logged in are, and then only cache the info for those people.
This does start getting a little more involved, so see if you can do all
users before heading down this road.
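If you do head down that road, a hypothetical sketch (none of these names
exist in your schema, and the update-then-insert is just the simple
old-style upsert):

CREATE TABLE recent_logins (
    person_id integer PRIMARY KEY,
    last_seen timestamp NOT NULL
);

-- the app touches this whenever a user requests a refresh:
UPDATE recent_logins SET last_seen = now() WHERE person_id = 42;
INSERT INTO recent_logins (person_id, last_seen)
SELECT 42, now()
WHERE NOT EXISTS (SELECT 1 FROM recent_logins WHERE person_id = 42);

-- the 60s job then only rebuilds cachedview rows for recently seen people:
DELETE FROM cachedview
WHERE person_id IN (SELECT person_id FROM recent_logins
                    WHERE last_seen > now() - interval '10 minutes');
-- ...followed by the INSERT ... SELECT restricted to the same person_ids.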
This isn't the best setup, but it's the best we can afford.  We are
just a new startup company.  Cheaper servers and open source keep our
costs low. But money is starting to come in after 10 months of hard
work, so we'll be able to replace our server within the next 2
months.  It'll be a necessity because we are signing on some big
clients now and they'll have 40 or 50 users for a single company.  If
they are all logged in at the same time, that's a lot of queries.
Sure. Just realize you can't really support 200 concurrent connections
with a single P4 and 1GB of ram.
John
=:-




Re: [PERFORM] name search query speed

2005-03-03 Thread John Arbash Meinel
Jeremiah Jahn wrote:
On Thu, 2005-03-03 at 11:46 -0600, John A Meinel wrote:

...
Not really, about 2% of the returned rows are thrown away for security
reasons based on the current user, security groups they belong to and
different flags in the data itself. So the count for this is generated
on the fly needed for pagination in the app which expresses the total
number of finds, but only displays 40 of them. If any one knows a way to
determine the total number of matches without needing to iterate through
them using jdbc, I'm all ears as this would save me huge amounts of time
and limit/offset would become an option.

Well, what is wrong with select count(*) from the query I would have
done?
Are you saying 2% are thrown away, or only 2% are kept?
Is this being done at the client side? Is there a way to incorporate the
security info into the database, so that the query actually only returns
the rows you care about? That seems like it would be a decent way to
speed it up, if you can restrict the number of rows that it needs to
look at.
There are other alternatives, such as materialized views, or temp
tables, where you select into the temp table the rows that the user
would request, and then you generate limit/offset from that. The first
query would be a little slow, since it would get all the rows, but all
subsequent accesses for that user could be really fast.
The other possibility is to do limit 200, and then in your list of
pages, you could have:
1, 2, 3, 4, 5, ...
This means that you don't have to worry about getting 10,000 entries,
which probably isn't really useful for the user anyway, and you can
still break things into 40 entry pages, just 200 entries at a time.
John
=:-






Re: [PERFORM] Help with tuning this query (with explain analyze finally)

2005-03-03 Thread John Arbash Meinel
Ken Egervari wrote:
Josh,
I did everything you said and my query does perform a bit better.
I've been getting speeds from 203 to 219 to 234 milliseconds now.   I
tried increasing the work mem and the effective cache size from the
values you provided, but I didn't see any more improvement.  I've
tried to looking into setting the shared buffers for Windows XP, but
I'm not sure how to do it.  I'm looking in the manual at:
http://www.postgresql.org/docs/8.0/interactive/kernel-resources.html#SYSVIPC-PARAMETERS

You probably don't need to change anything for Windows. If you set
shared_buffers too high, then postgres won't start. If it is starting,
then you don't need to modify the OS to get more shared buffers. (For
instance, on my Mac, I can't get shared_buffers > 500 without changing
things, but on windows I run with 3000 and no modification).
It doesn't mention windows at all.  Does anyone have any ideas on how
to fix this?
Do you need the interior sort? It's taking ~93ms to get 7k rows from
shipment_status, and then another 30ms to sort them. This isn't a lot,
so it might be fine.
Also, did you ever try CLUSTER current_status_date_idx ON shipment_status?
This groups the rows in shipment_status by their status date, which
helps put items with the same date next to each other. This may affect
other portions of the query, or other queries. Also, if you are
inserting sequentially, it would seem that the items would already be
naturally near each other based on date.
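Spelled out, that is just (note that CLUSTER takes an exclusive lock and
rewrites the table, and the ordering is not maintained for rows inserted
afterwards, so it would have to be repeated periodically):

CLUSTER current_status_date_idx ON shipment_status;
ANALYZE shipment_status;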
The next big cost is having to merge the 28k rows with the fast hash
plan, which takes about 80ms.
I guess the biggest issue is that you are doing a lot of work, and it
takes time to do it. Also, I've noticed that this query is being run
with exactly the same data. Which is good to compare two methods. But
remember to test on multiple potential values. You might be better off
one way with this query, but much worse for a different dataset. I
noticed that this seems to have fewer rows than what postgres thinks the
*average* number would be. (It predicts 60k and you only get 28k rows).
If this query is performed a lot, and you can be okay with a slight
delay in updating, you could always switch to some sort of lazy
materialized view.
You could also always throw more hardware at it. :) If the
shipment_status is one of the bottlenecks, create a 4-disk raid10 and
move the table over.
I don't remember what your hardware is, but I don't remember it being a
quad opteron with 16GB ram, and 20 15k SCSI disks, with the transaction
log on a solid state disk. :)
Why do you need the query to be 30ms? ~250ms is still pretty fast. If
you are needing updates faster than that, you might look more into *why*
and then handle it from a higher level.
And naturally, the most important this is to test it under load. 250ms
is pretty good, but if under load it goes back to 6s, then we probably
should look for different alternatives. Also, what is the load that is
causing the problem? Is it that you have some other big seqscans which
are causing all of your tables to go out of cache?
Also, I believe I remember you saying that your production server is a
P4, is that a single P4? Because I know postgres prefers Opterons to
Pentium Xeons when in a multiprocessor machine. Look through the
archives about spinlocks and the context switch bug. (context storm,
etc). Plus, since opterons are 64-bit, you can throw a lot more RAM at
them. I believe opterons outperform xeons for the same cost, *and* you
can scale them up with extra ram.
But remember, the biggest bottleneck is almost *always* the I/O. So put
more and faster disks into the system first.
John
=:-
Here is the new explain analyze.
Sort  (cost=1996.21..1996.26 rows=17 width=165) (actual time=297.000..297.000 rows=39 loops=1)
  Sort Key: ss.date
  ->  Merge Right Join  (cost=1951.26..1995.87 rows=17 width=165) (actual time=297.000..297.000 rows=39 loops=1)
        Merge Cond: ("outer".id = "inner".driver_id)
        ->  Index Scan using driver_pkey on driver d  (cost=0.00..42.16 rows=922 width=43) (actual time=0.000..0.000 rows=922 loops=1)
        ->  Sort  (cost=1951.26..1951.30 rows=17 width=122) (actual time=297.000..297.000 rows=39 loops=1)
              Sort Key: s.driver_id
              ->  Hash Join  (cost=586.48..1950.91 rows=17 width=122) (actual time=219.000..297.000 rows=39 loops=1)
                    Hash Cond: ("outer".carrier_code_id = "inner".id)
                    ->  Merge Join  (cost=571.97..1931.95 rows=830 width=87) (actual time=219.000..297.000 rows=310 loops=1)
                          Merge Cond: ("outer".current_status_id = "inner".id)
                          ->  Index Scan using shipment_current_status_id_idx on shipment s  (cost=0.00..2701.26 rows=60307 width=66) (actual time=0.000..62.000 rows=27711 loops=1)
                                Filter: (is_purged = false)
                          ->  Sort  (cost=571.97..576.38 rows=1766 width=21) (actual time=125.000..156.000 rows=6902 loops=1)
                                Sort Key: 

Re: [PERFORM] Help with tuning this query

2005-03-02 Thread John Arbash Meinel
Ken Egervari wrote:
First, what version of postgres, and have you run VACUUM ANALYZE
recently?
Also, please attach the result of running EXPLAIN ANALYZE.
(eg, explain analyze select s.* from shipment ...)

I'm using postgres 8.0.  I wish I could paste explain analyze, but I
won't be at work for a few days.  I was hoping some Postgres/SQL
experts here would be able to simply look at the query and make
recommendations because it's not a very difficult or unique query.
That's the problem. Without explain analyze, it's hard to say why it is
performing weird, because it *does* look like a straightforward query.
It's very possible that you don't have up-to-date statistics, which
causes postgres to make a bad estimate of what the fastest plan is.

I run VACUUM ANALYZE religiously.  I even dumped the production
database and used it as my test database after a full vacuum analyze.
It's really as fresh as it can be.
Good. Again, this is just the first precaution, as not everyone is as
careful as you. And without the explain analyze, you can't tell what the
planner estimates are.
I don't know how to make Hibernate do what you want, but if you change
the query to using subselects (not all databases support this, so
hibernate might not let you), you can see a performance improvement.

Yes, Hibernate supports sub-selects.  In fact, I can even drop down to
JDBC explicitly, so whatever SQL tricks out there I can use will work
on Hibernate.  In what way will sub-selects improve this query?
When doing massive joins across multiple tables (as you are doing) it is
frequently faster to do a couple of small joins where you only need a
couple of rows as input to the rest. Something like:
select * from shipment s
where s.carrier_code_id in
   (select cc.id from carrier_code cc join carrier c on
cc.carrier_id = c.id)
and s.current_status_id in (select cs.id from shipment_status cs where ...)
Again it's something that you can try. I have found quite a few of my
queries performed much better with subselects.
I'm guessing it's because with big queries it has a harder time figuring
out how to refactor (the decision tree becomes big). But I'm not really
sure. I just know it can work.
Also sometimes using explicit joins can be worse than just letting the
query manager figure it out. So something like
select s.* from shipment s, carrier_code cc, carrier c, ...
   where s.carrier_code_id = cc.id and c.id = cc.carrier_id and 

I think I can avoid using joins in Hibernate, but it makes the query
harder to maintain.  How much of a performance benefit are we talking
with this change?  Since hibernate is an object language, you don't
actually have to specify many joins.  You can use the dot notation.
I'm not saying this *will* improve performance. It is just something to
try. It very easily could not be worth the overhead.
Query query = session.createQuery(
    "select shipment " +
    "from Shipment shipment " +
    "   inner join shipment.cargoControlNumber.carrierCode.carrier.persons person " +
    "   inner join shipment.currentStatus currentStatus " +
    "   inner join currentStatus.releaseCode releaseCode " +
    "   left join fetch shipment.currentStatus " +
    "where " +
    "   person.id = :personId and " +
    "   shipment.isPurged = false and " +
    "   releaseCode.number = '9' and " +
    "   currentStatus is not null and " +
    "   currentStatus.date >= current_date - 31 " +
    "order by currentStatus.date desc"
);
query.setParameter( "personId", personId );
query.setFirstResult( firstResult );
query.setMaxResults( maxResults );
return query.list();
As you can see, it's fairly elegant language and maps to SQL quite well.
But again, since this is generated from another program (Hibernate), I
really don't know how you tell it how to tune the SQL. Probably the
biggest non-bug performance improvements are from tuning the SQL.

I agree, but the ones I've tried aren't good enough.  I have made
these indexes that apply to this query as well as others, from
looking at my SQL scripts.  Many of my queries have really sped up to
14 milliseconds from these indexes.  But I can't make this query run
any faster.
CREATE INDEX carrier_to_person_person_id_idx ON carrier_to_person
USING btree (person_id);
CREATE INDEX carrier_to_person_carrier_id_idx ON carrier_to_person
USING btree (carrier_id);
CREATE INDEX carrier_code_carrier_id_idx ON carrier_code USING btree
(carrier_id);
CREATE INDEX shipment_carrier_code_id_idx ON shipment USING btree
(carrier_code_id);
CREATE INDEX current_status_date_idx ON shipment_status USING btree
(date);
CREATE INDEX shipment_current_status_id_idx ON shipment USING btree
(current_status_id);
CREATE INDEX shipment_status_shipment_id_idx ON shipment_status USING
btree (shipment_id);
Thanks for your responses everyone.  I'll try and get you that explain
analyze.  I'm just not at work at the moment but this is a problem
that I'm simply puzzled and worried 

Re: [PERFORM] multi billion row tables: possible or insane?

2005-03-01 Thread John Arbash Meinel
Markus Schaber wrote:
Hi, John,
John Arbash Meinel wrote:

I am doing research for a project of mine where I need to store
several billion values for a monitoring and historical tracking system
for a big computer system. My current estimate is that I have to store
(somehow) around 1 billion values each month (possibly more).

If you have that 1 billion perfectly distributed over all hours of the
day, then you need 1e9/30/24/3600 = 385 transactions per second.

I hope that he does not use one transaction per inserted row.
In our in-house tests, we got a speedup factor of up to some hundred
when bundling rows on insertions. The fastest speed was when using
bunches of some thousand rows per transaction, running about 5
processes in parallel.

You're right. I guess it just depends on how the data comes in, and what
you can do at the client ends. That is kind of where I was saying put a
machine in front which gathers up the information, and then does a batch
update. If your client can do this directly, then you have the same
advantage.
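As a sketch of the difference (the table and columns are made up for the
example): instead of one transaction per row, the gathering machine would
do something like

BEGIN;
INSERT INTO samples (ts, host, value) VALUES ('2005-03-01 00:00:00', 'web1', 0.42);
INSERT INTO samples (ts, host, value) VALUES ('2005-03-01 00:00:01', 'web1', 0.40);
-- ... a few thousand rows per batch ...
COMMIT;

or, usually faster still, load each batch with a single COPY (this form
reads a server-side file; client libraries can also stream COPY ... FROM
STDIN):

COPY samples (ts, host, value) FROM '/tmp/batch-20050301.tsv';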

John
=:-




Re: [PERFORM] Inheritence versus delete from

2005-02-28 Thread John Arbash Meinel
Sven Willenberger wrote:
Trying to determine the best overall approach for the following
scenario:
Each month our primary table accumulates some 30 million rows (which
could very well hit 60+ million rows per month by year's end). Basically
there will end up being a lot of historical data with little value
beyond archival.

If this statement is true, then 2 seems the best plan.
2) Each month:
SELECT * INTO 3monthsago_dynamically_named_table FROM bigtable WHERE
targetdate < $3monthsago;
DELETE FROM bigtable WHERE targetdate < $3monthsago;
VACUUM ANALYZE bigtable;
pg_dump 3monthsago_dynamically_named_table for archiving;

It seems like this method would force the table to stay small, and would
keep your queries fast.  But if you ever actually *need* the old data,
then you start having problems.
...
I think (3) would tend to force a whole bunch of joins (one for each
child table), rather than just one join against 3 months of data.
Any takes on which approach makes most sense from a performance and/or
maintenance point of view and are there other options I may have missed?
Sven Willenberger

If you can get away with it, 2 is the best.
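A concrete sketch of option 2 (the archive table name is just an example
of what the monthly job would generate, and the $3monthsago cutoff becomes
an interval expression; current_date is fixed for the whole transaction,
so both statements use the same boundary):

BEGIN;
CREATE TABLE bigtable_archive_2004_12 AS
    SELECT * FROM bigtable
    WHERE targetdate < current_date - interval '3 months';
DELETE FROM bigtable
    WHERE targetdate < current_date - interval '3 months';
COMMIT;
VACUUM ANALYZE bigtable;
-- then archive it outside the database, e.g.:
--   pg_dump -t bigtable_archive_2004_12 yourdb > bigtable_archive_2004_12.sql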
John
=:-




Re: [PERFORM] Performance Tuning Opterons / Hard Disk Layout

2005-02-23 Thread John Arbash Meinel
John Allgood wrote:
This is some good info. The type of attached storage is a Kingston 14-bay
Fibre Channel Infostation. I have 14 36GB 15,000 RPM drives. The way it is
being explained, I think I should build a mirror with two disks for the
pg_xlog, then stripe and mirror the rest, and put all my databases into one
cluster. Also I might mention that I am running clustering using the Redhat
Clustering Suite.

So are these 14 disks supposed to be shared across all of your 9 databases?
It seems to me that you have a few architectural issues here.
First, you can't really have 2 masters writing to the same disk array.
I'm not sure if Redhat Clustering gets around this. And second, you
can't run 2 postgres engines on the same database. Postgres doesn't
support a clustered setup. There are too many issues with concurrency and
keeping everyone in sync.
Since you seem to be okay with having a bunch of smaller localized
databases, which update a master database 1/day, I would think you would
want hardware to go something like this.
1 master server, at least dual opteron with access to lots of disks
(likely the whole 14 if you can get away with it). Put 2 as a RAID1 for
the OS, 4 as a RAID10 for pg_xlog, and then the other 8 as RAID10 for
the rest of the database.
8-9 other servers, these don't need to be as powerful, since they are
local domains. Probably a 4-disk RAID10 for the OS and pg_xlog is plenty
good, and whatever extra disks you can get for the local database.
The master database holds all information for all domains, but the other
databases only hold whatever is the local information. Every night your
script sequences through the domain databases one-by-one, updating the
master database, and synchronizing whatever data is necesary back to the
local domain. I would guess that this script could actually just
continually run, going to each local db in turn, but you may want
nighttime only updating depending on what kind of load they have.
John
=:-




Re: [PERFORM] VACUUM ANALYZE slows down query

2005-02-17 Thread John Arbash Meinel
werner fraga wrote:
Certain queries on my database get slower after
running a VACUUM ANALYZE. Why would this happen, and
how can I fix it?
I am running PostgreSQL 7.4.2 (I have also seen this
problem on v. 7.3 and 8.0)
Here is a sample query that exhibits this behaviour
(here the query goes from 1 second before VACUUM
ANALYZE to 2 seconds after; there are other queries
that go from 20 seconds before to 800 seconds after):

First, try to attach your explain analyze as a textfile attachment,
rather than inline to prevent wrapping and make it easier to read.
Second, the problem is that it *is* getting a more accurate estimate of
the number of rows that are going to be returned, compare:
Plan 1:
-
Nested Loop Left Join  (cost=3974.74..48055.42
rows=79 width=8) (actual time=359.751..1136.165
rows=1518 loops=1)

The planner was expecting 79 rows, but was actually getting 1518.
Plan 2:
---
Merge Left Join  (cost=18310.59..29162.44 rows=1533
width=8) (actual time=1886.942..2183.774 rows=1518
loops=1)

It is predicting 1533 rows, and found 1518, a pretty good guess.
So the big issue is why does the planner think that a nested loop is
going to be more expensive than a merge join. That I don't really know.
I'm guessing some parameters like random_page_cost could be tweaked, but
I don't really know the criteria postgres uses for merge joins vs nested
loop joins.
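One low-risk way to experiment is per-session (nothing here is permanent,
and 2 is only a value to try, not a recommendation):

SHOW random_page_cost;        -- 4 is the default
SET random_page_cost = 2;
-- re-run the EXPLAIN ANALYZE from above and see whether the planner goes
-- back to the nested loop, and whether that is actually faster
RESET random_page_cost;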
Thanks for any assistance.
Walt

Hopefully someone can help a little better. In the mean time, you might
want to resend with an attachment. I know I had trouble reading your
explain analyze.
John
=:-




Re: [PERFORM] Performance Tuning

2005-02-09 Thread John Arbash Meinel
Chris Kratz wrote:
Hello All,
In contrast to what we hear from most others on this list, we find our
database servers are mostly CPU bound.  We are wondering if this is because
we have postgres configured incorrectly in some way, or if we really need
more powerful processor(s) to gain more performance from postgres.

If everything is cached in ram, it's pretty easy to be CPU bound. You
very easily could be at this point if your database is only 2.6G and you
don't touch all the tables often.
I do believe that when CPU bound, the best thing to do is get faster CPUs.
...
Our question is simply this, is it better to invest in a faster processor at
this point, or are there configuration changes to make it faster?  I've done
some testing with with 4x SCSI 10k and the performance didn't improve, in
fact it actually was slower the the sata drives marginally.  One of our
developers is suggesting we should compile postgres from scratch for this
particular processor, and we may try that.  Any other ideas?
-Chris
On this particular development server, we have:
Athlon XP,3000
1.5G Mem
4x Sata drives in Raid 0

I'm very surprised you are doing RAID 0. You realize that if 1 drive
goes out, your entire array is toast, right? I would recommend doing
either RAID 10 (0+1), or even Raid 5 if you don't do a lot of writes.
Probably most important, though is to look at the individual queries and
see what they are doing.
Postgresql 7.4.5 installed via RPM running on Linux kernel 2.6.8.1
Items changed in the postgresql.conf:
tcpip_socket = true
max_connections = 32
port = 5432
shared_buffers = 12288  # min 16, at least max_connections*2, 8KB each
sort_mem=16384
vacuum_mem = 32768  # min 1024, size in KB
max_fsm_pages = 6   # min max_fsm_relations*16, 6 bytes each
max_fsm_relations = 1000# min 100, ~50 bytes each
effective_cache_size = 115200   # typically 8KB each
random_page_cost = 1# units are one sequential page fetch cost

Most of these seem okay to me, but random page cost is *way* too low.
This should never be tuned below 2.  I think this says an index scan of
*all* rows is as cheap as a sequential scan of all rows, and that
should never be true.
What could actually be happening is that you are getting index scans
when a sequential scan would be faster.
I don't know what you would see, but what does explain analyze select
count(*) from blah; say. If it is an index scan, you have your machine
mistuned. select count(*) always grabs every row, and this is always
cheaper with a sequential scan.
John
=:-




Re: [PERFORM] Are JOINs allowed with DELETE FROM

2005-02-06 Thread John Arbash Meinel
Gaetano Mendola wrote:
Steven Rosenstein wrote:
DELETE FROM detail JOIN summary ON (summary.id=detail.sum_id) WHERE
collect_date='2005-02-05';

You have to tell it what table you are deleting from. Select * from A
join B is both tables. What you want to do is fix the where clause.
DELETE FROM detail WHERE detail.sum_id in ( select id from summary  )
AND collect_date='2005-02-05';
I'm guessing this should actually be
DELETE FROM detail WHERE detail.sum_id in ( SELECT id FROM summary WHERE
collect_date='2005-02-05' );
Otherwise you wouldn't really need the join.
You have to come up with a plan that yields rows that are in the table
you want to delete. The rows that result from
"select * from detail join summary" contain values from both tables.
If you want to delete from both tables, I think this has to be 2
deletes. Probably best to be in a transaction.
BEGIN;
DELETE FROM detail WHERE ...
DELETE FROM summary WHERE collect_date = '2005-02-05';
COMMIT;
Regards
Gaetano Mendola
John
=:-




Re: [PERFORM] Why the difference in query plan and performance pg

2005-02-01 Thread John Arbash Meinel
Joost Kraaijeveld wrote:
Hi all,
I have a freshly vacuumed table with 1104379 records and an index on zipcode.
Can anyone explain why the queries go as they go, and why the performance 
differs so much (1 second versus 64 seconds, or stated differently,  1 
records per second versus 1562 records per second) and why the query plan of 
query 2 ignores the index?


Indexes are generally only faster if you are grabbing less than ~10% of the table.
Otherwise you have the overhead of loading the index into memory, and
then paging through it looking for the entries.
With 100,000 entries a sequential scan is actually likely to be faster
than an indexed one.
If you try:
select a.ordernumer from orders a order by a.zipcode
how long does it take?
You can also try disabling sequential scan to see how long Query 2 would
be if you used indexing. Remember, though, that because of caching, a
repeated index scan may seem faster, but in actual production, that
index may not be cached, depending on what other queries are done.
John
=:-
For completeness' sake I also did a select ordernumber without any ordering.
That only took 98 seconds for 1104379 records (11222 records per second,
comparable with the first query, as I would have expected).
Query 1:
select a.ordernumer from orders a order by a.zipcode limit 1
Explain:
QUERY PLAN
Limit  (cost=0.00..39019.79 rows=1 width=14)
  ->  Index Scan using orders_postcode on orders a  (cost=0.00..4309264.07 rows=1104379 width=14)
Running time: 1 second
Query 2:
select a.ordernumer from orders a order by a.zipcode limit 10
Explain:
QUERY PLAN
Limit  (cost=207589.75..207839.75 rows=10 width=14)
  ->  Sort  (cost=207589.75..210350.70 rows=1104379 width=14)
        Sort Key: postcode
        ->  Seq Scan on orders a  (cost=0.00..46808.79 rows=1104379 width=14)
Running time: 64 seconds
Query 3:
select a.ordernumer from orders a
QUERY PLAN
Seq Scan on orders a  (cost=0.00..46808.79 rows=1104379 width=4)
Running time: 98 seconds
Regards,
Joost Kraaijeveld
Askesis B.V.
Molukkenstraat 14
6524NB Nijmegen
tel: 024-3888063 / 06-51855277
fax: 024-3608416
e-mail: [EMAIL PROTECTED]
web: www.askesis.nl






Re: [PERFORM] poor performance of db?

2005-01-24 Thread John Arbash Meinel
SpaceBallOne wrote:
Thanks for the reply John,
There are approximately 800 rows total in our job table (which stays
approximately the same because 'completed' jobs get moved to a
'job_archive' table).The other jobs not shown by the specific query
could be on backorder status, temporary deleted status, etc etc.
You are correct in assuming the _id and _no (stands for 'number')
fields are unique - this was one of the first pages I built when I
started learning postgres, so not knowing how to set up primary and
foreign keys at the time, I did it that way ... it is normalised to a
point (probably rather sloppy, but it's a juggling act between learning
on the fly, what I'd like to have, and time constraints of being the
only I.T. guy in the company!)...
I think I will definitely focus on converting my database and php
pages to using proper primary keys in postgres - especially if they
automatically index themselves. I didn't do a vacuum analyse on them
so that may explain why they didn't seem to do much.

You probably can add them now if you don't want to do a lot of redesign.
ALTER TABLE job ADD PRIMARY KEY (id);
If they are not unique this will cause problems, but as they should be
unique, I think it will work.
I'm not sure how much help indexes will be if you only have 800 rows,
and your queries use 300+ of them.
You might need re-think the query/table design.
You might try doing nested queries, or explicit joins, rather than one
big query with a WHERE clause.
Meaning do stuff like:
SELECT ...
FROM (job JOIN customer ON job.customer_no = customer.customer_no) AS jc
     JOIN location ON jc.location_no = location.location_no
...
I also see that the planner seems to mis-estimate the number of rows in
some cases. Like here:
  ->  Hash  (cost=14.53..14.53 rows=853 width=19) (actual time=9.79..9.79 rows=0 loops=1)
        ->  Seq Scan on street  (cost=0.00..14.53 rows=853 width=19) (actual time=0.01..5.12 rows=853 loops=1)
  ->  Hash  (cost=9.91..9.91 rows=491 width=20) (actual time=5.73..5.73 rows=0 loops=1)
        ->  Seq Scan on ubd  (cost=0.00..9.91 rows=491 width=20) (actual time=0.02..2.98 rows=491
Where it thinks the hash will return all of the rows from the sequential
scan, when in reality it returns none.
I think problems with the planner fall into 3 categories.
  1. You didn't VACUUM ANALYZE.
  2. You did, but the planner doesn't keep sufficient statistics (ALTER
     TABLE job ALTER COLUMN no SET STATISTICS <a number>)
  3. Your join needs cross-column statistics, which postgres doesn't
 support (yet).
If you only have 800 rows, I don't think you have to worry about
statistics, so that leaves things at 1 or 3. If you did do 1, then I
don't know what to tell you.
John
=:-
PS I'm not a guru at this stuff, so some of what I say may be wrong.
But hopefully I point you in the right direction.

Thanks,
Dave
[EMAIL PROTECTED]


