Re: [despammed] [SQL] private table

2005-01-26 Thread Andreas Kretschmer
On 26.01.2005 at 10:13:52 +0200, Din Adrian wrote:
> Hello,
> I want to use a private table in postgresql (every client sees only his own
> data).
> Is this possible? How can I do it?

Why?

You can create different users and/or different databases.
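
For example, a minimal sketch of that approach (names are invented):

  CREATE USER client_a WITH PASSWORD 'secret';
  CREATE DATABASE client_a_db OWNER client_a;

Each client then connects only to his own database (or as his own user) and
never sees anyone else's data.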


Regards, Andreas
-- 
Andreas Kretschmer
Contact:  Heynitz: 035242/47212,   D1: 0160/7141639 (more: -> header)
GnuPG-ID:   0x3FFF606C, private 0x7F4584DA   http://wwwkeys.de.pgp.net
Anyone can write email -- learning how: http://webserv/email/email.html

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


[SQL] private table

2005-01-26 Thread Din Adrian
Hello,
I want to use a private table in postgresql (every client sees only his own
data).
Is this possible? How can I do it?

Thank you,
Adrian Din
--
Using Opera's revolutionary e-mail client: http://www.opera.com/m2/
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [SQL] Sorry I see my first question did not get posted (maybe

2005-01-26 Thread Richard Huxton
Joel Fradkin wrote:
Basically the question was why a view would use an indexed search on one
result set but a seq search on a larger result set. Same view; the only
difference is how many rows are returned. The large result set was doing a
seq search and did not return after several minutes. The same sql ran in 135
seconds on my MSSQL system.
Accessing an index 1000 times then reading 1000 rows may be slower than 
just reading a whole table of 2000 rows.

You can examine what PostgreSQL thinks the query will cost by running an 
explain: EXPLAIN ANALYSE SELECT ...
This will display two sets of figures for each stage, the expected costs 
and the actual.
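
For example (table, index and figures here are invented, purely to show the
shape of the output):

EXPLAIN ANALYSE SELECT * FROM orders WHERE customer_id = 42;
 Index Scan using orders_customer_idx on orders  (cost=0.00..8.27 rows=1 width=64) (actual time=0.031..0.034 rows=3 loops=1)
 Total runtime: 0.071 ms

The (cost=.. rows=..) figures are the planner's estimates; the (actual
time=.. rows=..) figures are what really happened. A large gap between
estimated and actual row counts is a sign the planner is working from bad
statistics.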

Finally, make sure your configuration settings are reasonable. Read 
through the guide at:
  http://www.varlena.com/varlena/GeneralBits/Tidbits/index.php

There's a -performance list that specialises in dealing with these 
issues. If you post there, mention you've tuned as per GeneralBits and 
provide an example of the query, view definition and the output from 
explain.

--
  Richard Huxton
  Archonet Ltd
---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [SQL] How to find out programmatically whether a query on a view will use an index?

2005-01-26 Thread Martin Schäfer
> > I'm using the PostGIS spatial extension. Some of my spatial queries (like
> > live zooming and panning) should only be performed when the column
> > containing the spatial data is spatially indexed, otherwise the first query
> > takes almost forever and users will just kill the application out of
> > frustration.
> 
> If the real problem is long running queries, maybe using a 
> statement timeout
> will solve your problem?

Using a timeout to test for the presence of an index is not exact enough: I 
can't guard myself against false positives or false negatives. If the server is 
very busy at the moment all views might seem to be 'unindexed', i.e. unusable 
for live zooming and panning. The next day it might look different.

I need to know in advance whether the queries would use a spatial index on the 
views. If no spatial index would be used, I have to make a local copy of (a 
subset of) the view (or table), create a local index and display the local copy 
instead. This is better than waiting for the timeout to expire and display 
nothing.

With Oracle I can fire a spatial query on a view or table, and if the spatial 
column is not indexed, the entire query will fail. Unfortunately, with 
PostgreSQL, the spatial queries always succeed.

Martin

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [SQL] private table

2005-01-26 Thread Richard Huxton
Din Adrian wrote:
Hello,
I want to use a private table in postgresql (every client sees only his
own data).
Is this possible? How can I do it?
If I understand what you want, then this is traditionally done using views.
CREATE TABLE base_table (
  a integer NOT NULL,
  b text,
  u name,
  PRIMARY KEY (a)
);
CREATE VIEW my_rows AS
  SELECT a,b FROM base_table WHERE u = CURRENT_USER;
Add rules so that updating my_rows updates base_table instead. Then, 
revoke access on base_table (apart from your super-user) but grant it on 
my_rows.
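
A minimal sketch of those rules and permissions, continuing the example above
(the exact rule bodies are illustrative, not tested):

CREATE RULE my_rows_ins AS ON INSERT TO my_rows DO INSTEAD
  INSERT INTO base_table (a, b, u) VALUES (NEW.a, NEW.b, CURRENT_USER);
CREATE RULE my_rows_upd AS ON UPDATE TO my_rows DO INSTEAD
  UPDATE base_table SET b = NEW.b WHERE a = OLD.a AND u = CURRENT_USER;
CREATE RULE my_rows_del AS ON DELETE TO my_rows DO INSTEAD
  DELETE FROM base_table WHERE a = OLD.a AND u = CURRENT_USER;
REVOKE ALL ON base_table FROM PUBLIC;
GRANT SELECT, INSERT, UPDATE, DELETE ON my_rows TO PUBLIC;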

As far as a user is concerned "my_rows" just contains their rows and if 
they log in as a different user they will see different data.
--
  Richard Huxton
  Archonet Ltd

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [SQL] working with multidimensional arrays in plpgsql

2005-01-26 Thread Richard Huxton
Sibtay Abbas wrote:
Hello everyone,
I am having a problem with multidimensional arrays in plpgsql. Following
is the source code of the function which I am trying to run:

DECLARE
  x INTEGER[10][10];

x[3][1] := '20';   --i have even tried x[3][1] = 20

As you might have observed here, the actual problem is
how to do assignment to multidimensional array locations using the
subscript operator.
Actually, the problem is that the array "x" has been initialised to 
NULL. Therefore, you can't set individual values. Try a line like:
  x := '{}';
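
A fuller sketch of the idea (the function name is invented, and array_fill()
only exists in PostgreSQL releases later than the one discussed here, so
treat it as illustrative):

CREATE OR REPLACE FUNCTION demo_matrix() RETURNS integer AS $$
DECLARE
  x integer[];
BEGIN
  x := array_fill(0, ARRAY[10, 10]);  -- pre-size the array so the slots exist
  x[3][1] := 20;                      -- now subscript assignment works
  RETURN x[3][1];
END;
$$ LANGUAGE plpgsql;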

--
  Richard Huxton
  Archonet Ltd
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [PERFORM] [SQL] OFFSET impact on Performance???

2005-01-26 Thread Andrei Bintintan
The problem still stays open.
The thing is that I have about 20 - 30 clients that are using that SQL query 
where the offset and limit are involved. So, I cannot create a temp table, 
because that means that I'll have to make a temp table for each session... 
which is a very bad idea. Cursors are much the same. In my application the 
WHERE conditions can be very different for each user (session).

The only solution that I see at the moment is to work on the query, or to 
write a more complex WHERE clause to limit the result output. So there is no 
replacement for OFFSET/LIMIT.
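
One common shape for such a "more complex WHERE" is to page on the last key
seen rather than on OFFSET; a sketch with invented table and column names:

SELECT id, payload FROM items
 WHERE owner = 'some_user'
 ORDER BY id LIMIT 50;

-- next page: remember the largest id returned (say 12345) and continue from it
SELECT id, payload FROM items
 WHERE owner = 'some_user' AND id > 12345
 ORDER BY id LIMIT 50;

With an index on (owner, id) the second query starts where the first one left
off instead of counting past all the skipped rows.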

Best regards,
Andy.
- Original Message - 
From: "Greg Stark" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Cc: "Richard Huxton" ; "Andrei Bintintan" 
<[EMAIL PROTECTED]>; ; 

Sent: Tuesday, January 25, 2005 8:28 PM
Subject: Re: [PERFORM] [SQL] OFFSET impact on Performance???


Alex Turner <[EMAIL PROTECTED]> writes:
I am also very interested in this very question. Is there any way to
declare a persistent cursor that remains open between pg sessions?
This would be better than a temp table because you would not have to
do the initial select and insert into a fresh table and incur those IO
costs, which are often very heavy, and the reason why one would want
to use a cursor.
TANSTAAFL. How would such a persistent cursor be implemented if not by
building a temporary table somewhere behind the scenes?
There could be some advantage if the data were stored in a temporary table
marked as not having to be WAL logged. Instead it could be automatically
cleared on every database start.
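
(A side note: later PostgreSQL releases, 9.1 onward, expose exactly this
trade-off as UNLOGGED tables, which skip WAL and are emptied after a crash;
a sketch with an invented name:

CREATE UNLOGGED TABLE cached_result AS
  SELECT * FROM some_expensive_view;
)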
--
greg


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [SQL] working with multidimensional arrays in plpgsql

2005-01-26 Thread Sibtay Abbas
thank you very much for your reply
I did as you specified and it worked fine
thankz :)


On Wed, 26 Jan 2005 09:29:53 +, Richard Huxton  wrote:
> Sibtay Abbas wrote:
> > hello everyone
> >
> > i am having problem with multidimensional arrays in plpgsql following
> > is the source code of the function which i am trying to run
> 
> 
> > DECLARE
> >   x INTEGER[10][10];
> 
> > x[3][1] := '20';   --i have even tried x[3][1] = 20
> 
> > As you might have observed here, the actual problem is
> > how to do assignment to multidimensional array locations using the
> > subscript operater.
> 
> Actually, the problem is that the array "x" has been initialised to
> NULL. Therefore, you can't set individual values. Try a line like:
>x := '{}';
> 
> --
>Richard Huxton
>Archonet Ltd
>

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [SQL] same question little different test MSSQL vrs Postgres

2005-01-26 Thread Joel Fradkin
QUERY PLAN
"Seq Scan on tblcase  (cost=0.00..30066.21 rows=37401 width=996) (actual
time=0.344..962.260 rows=22636 loops=1)"
"  Filter: ((clientnum)::text = 'SAKS'::text)"
"Total runtime: 1034.434 ms"

Joel Fradkin
 
Wazagua, Inc.
2520 Trailmate Dr
Sarasota, Florida 34243
Tel.  941-753-7111 ext 305
 
[EMAIL PROTECTED]
www.wazagua.com
Powered by Wazagua
Providing you with the latest Web-based technology & advanced tools.
C 2004. WAZAGUA, Inc. All rights reserved. WAZAGUA, Inc
 This email message is for the use of the intended recipient(s) and may
contain confidential and privileged information.  Any unauthorized review,
use, disclosure or distribution is prohibited.  If you are not the intended
recipient, please contact the sender by reply email and delete and destroy
all copies of the original message, including attachments.
 

 

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, January 26, 2005 1:27 AM
To: Joel Fradkin
Cc: [email protected]
Subject: Re: [SQL] same question little different test MSSQL vrs Postgres


"Joel Fradkin" <[EMAIL PROTECTED]> writes:

> I also tried a simple select * from tblcase where clientnum = 'SAKS'

Try:

explain analyze select * from tblcase where clientnum = 'SAKS'

Send the output.

-- 
greg


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [PERFORM] [SQL] OFFSET impact on Performance???

2005-01-26 Thread Richard Huxton
Alex Turner wrote:
As I read the docs, a temp table doesn't solve our problem, as it does
not persist between sessions.  With a web page there is no guarantee
that you will receive the same connection between requests, so a temp
table doesn't solve the problem.  It looks like you either have to
create a real table (which is undesirable because it has to be
physically synced, and TTFB will be very poor) or create an application
tier in between the web tier and the database tier to allow data to
persist between requests tied to a unique session id.
Looks like the solution to this problem is not the RDBMS, IMHO.
It's less the RDBMS than the web application. You're trying to mix a 
stateful setup (the application) with a stateless presentation layer 
(the web). If you're using PHP (which doesn't offer a "real" middle 
layer) you might want to look at memcached.

--
  Richard Huxton
  Archonet Ltd
---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [SQL] same question little different test MSSQL vrs Postgres

2005-01-26 Thread Richard Huxton
Joel Fradkin wrote:
QUERY PLAN
"Seq Scan on tblcase  (cost=0.00..30066.21 rows=37401 width=996) (actual
time=0.344..962.260 rows=22636 loops=1)"
"  Filter: ((clientnum)::text = 'SAKS'::text)"
"Total runtime: 1034.434 ms"
That's only 1 second - to return 22,636 rows. Not 27 seconds, as in the 
original post. You'll never persuade PG to use the index when some 75% 
of your rows match the filter - it just doesn't make sense.

--
  Richard Huxton
  Archonet Ltd
---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [SQL] Moving from Transact SQL to PL/pgsql

2005-01-26 Thread George Essig
> ...
>  Where can I find
>  primer on PL/pgsql, with lots of examples? 
> ...

Download openacs at http://openacs.org/projects/openacs/download/.  Look at 
directories matching
the pattern openacs-*/packages/*/sql/postgresql/.  The older openacs version 
4.6.3 has more
examples than openacs 5.

George Essig

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [SQL] same question little different test MSSQL vrs Postgres

2005-01-26 Thread Richard Huxton
Joel Fradkin wrote:
Well last evening (did not try it this morning) it was taking the extra
time.
I have made some adjustments to the config file per a few web sites that you
all recommended my looking at.
The crucial one I'd say is the performance guide at:
  http://www.varlena.com/varlena/GeneralBits/Tidbits/index.php
The first half-dozen settings are the crucial ones.
It is now using 137 of 756 meg avail.
it is now taking 8 secs to return 22,000 rows (using pgadminIII in a sql
edit window).
That might be too much RAM. Don't forget PG likes to work with your 
operating-system (unlike many other DBs). Make sure Windows is using 
enough RAM to cache diskspace.
I'm curious as to how this takes 8secs whereas you had 1 second earlier. 
Are you sure some of this isn't pgadmin's overhead to display the rows?

The EXPLAIN ANALYSE still shows the same as below, but the table has 344,000
recs of which only 22636 are clientnum = 'SAKS'
That sounds like it's about the borderline between using an index and 
not (depending on cache-size, disk speeds etc).

I am still doing a seq search (this applies to the view question where if it
is a small result set it used an index search but on a larger return set it
did a seq search) in my view, but with the adjustments to the kernel I get a
result in 140 secs (MSSQL was 135 secs).
If you want to check whether the index would help, try issuing the 
following before running your query:
  SET ENABLE_SEQSCAN=FALSE;
This will force PG to use any index it can regardless of whether it 
thinks it will help.

This is not production, I am still very worried that I have to do all this
tweaking to use this, MSSQL worked out of the box as it does (not saying it's
great, but I never had to adjust a kernel setting etc). Since we cannot
afford the 70,000 dollars they want to license it I am not implying I can
use MSSQL, but I could look at other DB's like MYSQL, or Firebird, etc.
I'm a little curious what kernel settings you are changing on Windows. I 
wasn't aware there was much to be done there.

I'm afraid you do have to change half a dozen settings in 
postgresql.conf to match your workload, but PG runs on a much wider 
range of machines than MSSQL so it's difficult to come up with a 
"reasonable" default. Takes me about 5 minutes when I setup an 
installation to make sure the figures are reasonable (rather than the 
best they can be).

I have a lot of time now (two weeks) in this conversion and do not wish to
give up, I will see if I can learn what is needed to get the maximum
performance. I have seen much information available and this list has been a
huge resource. I really appreciate all the help.
--
  Richard Huxton
  Archonet Ltd
---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [SQL] same question little different test MSSQL vrs Postgres

2005-01-26 Thread Joel Fradkin
Well last evening (did not try it this morning) it was taking the extra
time.

I have made some adjustments to the config file per a few web sites that you
all recommended my looking at.

It is now using 137 of 756 meg avail.
it is now taking 8 secs to return 22,000 rows (using pgadminIII in a sql
edit window).

The EXPLAIN ANALYSE still shows the same as below, but the table has 344,000
recs of which only 22636 are clientnum = 'SAKS'

I am still doing a seq search (this applies to the view question where if it
is a small result set it used an index search but on a larger return set it
did a seq search) in my view, but with the adjustments to the kernel I get a
result in 140 secs (MSSQL was 135 secs).

This is not production, I am still very worried that I have to do all this
tweaking to use this, MSSQL worked out of the box as it does (not saying it's
great, but I never had to adjust a kernel setting etc). Since we cannot
afford the 70,000 dollars they want to license it I am not implying I can
use MSSQL, but I could look at other DB's like MYSQL, or Firebird, etc.

I have a lot of time now (two weeks) in this conversion and do not wish to
give up, I will see if I can learn what is needed to get the maximum
performance. I have seen much information available and this list has been a
huge resource. I really appreciate all the help.


Joel Fradkin
 
Wazagua, Inc.
2520 Trailmate Dr
Sarasota, Florida 34243
Tel.  941-753-7111 ext 305
 
> QUERY PLAN
> "Seq Scan on tblcase  (cost=0.00..30066.21 rows=37401 width=996) (actual
> time=0.344..962.260 rows=22636 loops=1)"
> "  Filter: ((clientnum)::text = 'SAKS'::text)"
> "Total runtime: 1034.434 ms"

That's only 1 second - to return 22,636 rows. Not 27 seconds, as in the 
original post. You'll never persuade PG to use the index when some 75% 
of your rows match the filter - it just doesn't make sense.

--
   Richard Huxton
   Archonet Ltd


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: ***SPAM*** Re: [SQL] same question little different test MSSQL

2005-01-26 Thread Franco Bruno Borghesi
I've tested this on a relation of mine, with about 20 attributes, and here
are the results:

test=# select count(*) from gestionestareas;
 count
--------
 447681
(1 row)

test=# explain analyze select * from gestionestareas where agrupable;
 QUERY PLAN
----------------------------------------------------------------------------
 Seq Scan on gestionestareas  (cost=0.00..12334.81 rows=155495
width=372) (actual time=0.603..1176.177 rows=153530 loops=1)
   Filter: agrupable
 Total runtime: 1380.113 ms
(3 rows)

So, doing a seq scan on a 450,000-row table and fetching 150,000 rows
takes only 1.3 secs. This is a 900 MHz PIII, 1GB mem (133 MHz), 7200 RPM IDE
disk, running FreeBSD. This machine is also a webserver (apache &
tomcat), mail server, file server (smb & nfs), ldap server, etc.

I don't use pgadmin, I use psql (the postgresql console client). It took 2
minutes to display the results of the above query (without the EXPLAIN
ANALYZE). This makes me think: couldn't the problem be that pgadmin,
psql, etc. take too much time to display all the rows? It seems like a
client software problem, not a server problem.

My advice is: use EXPLAIN ANALYZE to test both servers' performance. If
you want to know what the final results will be, test both databases from
a self-programmed application (Java, PHP, C++, etc.).

Hope this helped.

Joel Fradkin wrote:

  Well last evening (did not try it this morning) it was taking the extra
time.

I have made some adjustments to the config file per a few web sites that you
all recommended my looking at.

It is now using 137 of 756 meg avail.
it is now taking 8 secs to return 22,000 rows (using pgadminIII in a sql
edit window).

The EXPLAIN ANALYSE still shows the same as below, but the table has 344,000
recs of which only 22636 are clientnum = 'SAKS'

I am still doing a seq search (this applies to the view question where if it
is a small result set it used an index search but on a larger return set it
did a seq search) in my view, but with the adjustments to the kernel I get a
result in 140 secs (MSSQL was 135 secs).

This is not production, I am still very worried that I have to do all this
tweaking to use this, MSSQL worked out of the box as it does (not saying it's
great, but I never had to adjust a kernel setting etc). Since we cannot
afford the 70,000 dollars they want to license it I am not implying I can
use MSSQL, but I could look at other DB's like MYSQL, or Firebird, etc.

I have a lot of time now (two weeks) in this conversion and do not wish to
give up, I will see if I can learn what is needed to get the maximum
performance. I have seen much information available and this list has been a
huge resource. I really appreciate all the help.


Joel Fradkin
 
Wazagua, Inc.
2520 Trailmate Dr
Sarasota, Florida 34243
Tel.  941-753-7111 ext 305
 
QUERY PLAN
"Seq Scan on tblcase  (cost=0.00..30066.21 rows=37401 width=996) (actual
time=0.344..962.260 rows=22636 loops=1)"
"  Filter: ((clientnum)::text = 'SAKS'::text)"
"Total runtime: 1034.434 ms"

That's only 1 second - to return 22,636 rows. Not 27 seconds, as in the 
original post. You'll never persuade PG to use the index when some 75% 
of your rows match the filter - it just doesn't make sense.

--
   Richard Huxton
   Archonet Ltd


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [SQL] same question little different test MSSQL vrs Postgres

2005-01-26 Thread Joel Fradkin
The postgres is running on Linux Fedora core 3 (production will be redhat on
Dell 4 proc 8 gig box).

My client pgadminIII is running on XP.

Sorry I was not clearer on this.

I am playing with the settings now; I got it to return in 100 secs (the view,
that is, that took 135 on MSSQL). My testing is using identical Dell desktops
for the MSSQL and the Linux, with a third machine for the clients.

I do not mind getting up to speed on the proper setting to optimize the
hardware, I am worried that as production environment can be somewhat
dynamic that I will have issues getting a optimized environment and that it
will work for our needs. My whole reason for being here is that our duel
proc production MSSQL server is just no longer keeping up with the demand,
so it is important that whatever I implement is going to up to the
challenge. I am still convinced Postgres was the correct choice, especially
with all the guidance I have been able to get here. 100 seconds will be fine
compared to the 135 of MSSQL, I just was getting worse responses before
adjusting. At the moment I think I went too far as I see it using swap and
going slower, but it never used much of the 756 meg (137 max was all I ever
saw it use).

I guess the swap buffers and cache are the important settings (at least that
seems to be what is affecting the memory). Not sure exactly what would cause
it to use seq vs. index, but I will try the force and see if it helps the
speed.

Joel Fradkin
 
Wazagua, Inc.
2520 Trailmate Dr
Sarasota, Florida 34243
Tel.  941-753-7111 ext 305
 
[EMAIL PROTECTED]
www.wazagua.com
Powered by Wazagua
Providing you with the latest Web-based technology & advanced tools.
C 2004. WAZAGUA, Inc. All rights reserved. WAZAGUA, Inc
 This email message is for the use of the intended recipient(s) and may
contain confidential and privileged information.  Any unauthorized review,
use, disclosure or distribution is prohibited.  If you are not the intended
recipient, please contact the sender by reply email and delete and destroy
all copies of the original message, including attachments.
 

 

-Original Message-
From: Richard Huxton [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, January 26, 2005 10:21 AM
To: Joel Fradkin
Cc: [EMAIL PROTECTED]; [email protected]
Subject: Re: [SQL] same question little different test MSSQL vrs Postgres

Joel Fradkin wrote:
> Well last evening (did not try it this morning) it was taking the extra
> time.
> 
> I have made some adjustments to the config file per a few web sites that you
> all recommended my looking at.

The crucial one I'd say is the performance guide at:
   http://www.varlena.com/varlena/GeneralBits/Tidbits/index.php
The first half-dozen settings are the crucial ones.

> It is now using 137 of 756 meg avail.
> it is now taking 8 secs to return 22,000 rows (using pgadminIII in a sql
> edit window).

That might be too much RAM. Don't forget PG likes to work with your 
operating-system (unlike many other DBs). Make sure Windows is using 
enough RAM to cache diskspace.
I'm curious as to how this takes 8secs whereas you had 1 second earlier. 
Are you sure some of this isn't pgadmin's overhead to display the rows?

> The EXPLAIN ANALYSE still shows the same as below, but the table has 344,000
> recs of which only 22636 are clientnum = 'SAKS'

That sounds like it's about the borderline between using an index and 
not (depending on cache-size, disk speeds etc).

> I am still doing a seq search (this applies to the view question where if it
> is a small result set it used an index search but on a larger return set it
> did a seq search) in my view, but with the adjustments to the kernel I get a
> result in 140 secs (MSSQL was 135 secs).

If you want to check whether the index would help, try issuing the 
following before running your query:
   SET ENABLE_SEQSCAN=FALSE;
This will force PG to use any index it can regardless of whether it 
thinks it will help.

> This is not production, I am still very worried that I have to do all this
> tweaking to use this, MSSQL worked out of the box as it does (not saying it's
> great, but I never had to adjust a kernel setting etc). Since we cannot
> afford the 70,000 dollars they want to license it I am not implying I can
> use MSSQL, but I could look at other DB's like MYSQL, or Firebird, etc.

I'm a little curious what kernel settings you are changing on Windows. I 
wasn't aware there was much to be done there.

I'm afraid you do have to change half a dozen settings in 
postgresql.conf to match your workload, but PG runs on a much wider 
range of machines than MSSQL so it's difficult to come up with a 
"reasonable" default. Takes me about 5 minutes when I setup an 
installation to make sure the figures are reasonable (rather than the 
best they can be).

> I have a lot of time now (two weeks) in this conversion and do not wish to
> give up, I will see if I can learn what is needed to get the maximum
> performance. I have seen much information available an

Re: [SQL] same question little different test MSSQL vrs Postgres

2005-01-26 Thread Greg Stark
"Joel Fradkin" <[EMAIL PROTECTED]> writes:

> QUERY PLAN
> "Seq Scan on tblcase  (cost=0.00..30066.21 rows=37401 width=996) (actual
> time=0.344..962.260 rows=22636 loops=1)"
> "  Filter: ((clientnum)::text = 'SAKS'::text)"
> "Total runtime: 1034.434 ms"

Well that says it only took 1s. So it seems this is highly dependent on
whether the data is in cache. Perhaps it was in cache on MSSQL when you
profiled it there and not on postgres?

You could put an index on clientnum, but if the data is usually in cache like
this it might not even be necessary.

-- 
greg


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [SQL] same question little different test MSSQL vrs Postgres

2005-01-26 Thread Joel Fradkin
I tried the SET ENABLE_SEQSCAN=FALSE;
And the result took 29 secs instead of 117.

After playing around with the cache and buffers etc I see I am no longer
doing any swapping (not sure how I got the 100 sec response might have been
shared buffers set higher, been goofing around with it all morning).

My worry here is it should obviously use an index scan so something is not
setup correctly yet. I don't want to second guess the analyzer (or is this a
normal thing?)

At least it is blowing the doors off MSSQL (which is what I touted to my boss
and was pretty upset when I got no result last night).

The 117 was before I forced the seq off, so even doing a seq scan I am getting
results now that are better than MSSQL.

Joel Fradkin
 
Wazagua, Inc.
2520 Trailmate Dr
Sarasota, Florida 34243
Tel.  941-753-7111 ext 305
 
[EMAIL PROTECTED]
www.wazagua.com
Powered by Wazagua
Providing you with the latest Web-based technology & advanced tools.
C 2004. WAZAGUA, Inc. All rights reserved. WAZAGUA, Inc
 This email message is for the use of the intended recipient(s) and may
contain confidential and privileged information.  Any unauthorized review,
use, disclosure or distribution is prohibited.  If you are not the intended
recipient, please contact the sender by reply email and delete and destroy
all copies of the original message, including attachments.
 

 

-Original Message-
From: Richard Huxton [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, January 26, 2005 10:21 AM
To: Joel Fradkin
Cc: [EMAIL PROTECTED]; [email protected]
Subject: Re: [SQL] same question little different test MSSQL vrs Postgres

Joel Fradkin wrote:
> Well last evening (did not try it this morning) it was taking the extra
> time.
> 
> I have made some adjustments to the config file per a few web sites that you
> all recommended my looking at.

The crucial one I'd say is the performance guide at:
   http://www.varlena.com/varlena/GeneralBits/Tidbits/index.php
The first half-dozen settings are the crucial ones.

> It is now using 137 of 756 meg avail.
> it is now taking 8 secs to return 22,000 rows (using pgadminIII in a sql
> edit window).

That might be too much RAM. Don't forget PG likes to work with your 
operating-system (unlike many other DBs). Make sure Windows is using 
enough RAM to cache diskspace.
I'm curious as to how this takes 8secs whereas you had 1 second earlier. 
Are you sure some of this isn't pgadmin's overhead to display the rows?

> The EXPLAIN ANALYSE still shows the same as below, but the table has 344,000
> recs of which only 22636 are clientnum = 'SAKS'

That sounds like it's about the borderline between using an index and 
not (depending on cache-size, disk speeds etc).

> I am still doing a seq search (this applies to the view question where if it
> is a small result set it used an index search but on a larger return set it
> did a seq search) in my view, but with the adjustments to the kernel I get a
> result in 140 secs (MSSQL was 135 secs).

If you want to check whether the index would help, try issuing the 
following before running your query:
   SET ENABLE_SEQSCAN=FALSE;
This will force PG to use any index it can regardless of whether it 
thinks it will help.

> This is not production, I am still very worried that I have to do all this
> tweaking to use this, MSSQL worked out of the box as it does (not saying it's
> great, but I never had to adjust a kernel setting etc). Since we cannot
> afford the 70,000 dollars they want to license it I am not implying I can
> use MSSQL, but I could look at other DB's like MYSQL, or Firebird, etc.

I'm a little curious what kernel settings you are changing on Windows. I 
wasn't aware there was much to be done there.

I'm afraid you do have to change half a dozen settings in 
postgresql.conf to match your workload, but PG runs on a much wider 
range of machines than MSSQL so it's difficult to come up with a 
"reasonable" default. Takes me about 5 minutes when I setup an 
installation to make sure the figures are reasonable (rather than the 
best they can be).

> I have a lot of time now (two weeks) in this conversion and do not wish to
> give up, I will see if I can learn what is needed to get the maximum
> performance. I have seen much information available and this list has been a
> huge resource. I really appreciate all the help.

--
   Richard Huxton
   Archonet Ltd


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [SQL] same question little different test MSSQL vrs Postgres

2005-01-26 Thread Greg Stark
"Joel Fradkin" <[EMAIL PROTECTED]> writes:

> I tried the SET ENABLE_SEQSCAN=FALSE;
> And the result took 29 secs instead of 117.
> 
> After playing around with the cache and buffers etc I see I am no longer
> doing any swapping (not sure how I got the 100 sec response might have been
> shared buffers set higher, been goofing around with it all morning).

If it's swapping you're definitely going to get bad results. You really want
the *majority* of RAM left free for the OS to cache disk data.

> My worry here is it should obviously use an index scan so something is not
> setup correctly yet. I don't want to second guess the analyzer (or is this a
> normal thing?)

No that's not obvious. 22k out of 344k is a selectivity of 6.6% which is
probably about borderline. The optimizer is estimating even worse at 10.9%
which isn't far off but puts it well out of the range for an index scan.

If you really want to get postgres using an index scan you'll have to a)
improve the estimate using "alter table tblcase alter column clientnum set
statistics" to raise the statistics target for that column and reanalyze. 

And b) lower random_page_cost. random_page_cost tells postgres how much slower
indexes are than table scans and at the default setting it accurately
represents most disk hardware. If your database fits within RAM and is often
cached then you might have to lower it to model that fact. But you shouldn't
do it based on a single query like this.
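
Concretely, the two experiments might look like this (the statistics target
200 and the cost value 2 are only illustrative figures):

ALTER TABLE tblcase ALTER COLUMN clientnum SET STATISTICS 200;
ANALYZE tblcase;

SET random_page_cost = 2;   -- session-level experiment; the default is 4
EXPLAIN ANALYZE SELECT * FROM tblcase WHERE clientnum = 'SAKS';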


-- 
greg


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [SQL] same question little different test MSSQL vrs Postgres

2005-01-26 Thread Richard Huxton
Joel Fradkin wrote:
The postgres is running on Linux Fedora core 3 (production will be redhat on
Dell 4 proc 8 gig box).
My client pgadminIII is running on XP.
Sorry I was not clearer on this.
Ah! you're the gent who had the problems with SE-Linux on Fedora 3. 
Sorry - should have made the connection, but there's so much traffic on 
the lists it's easy to miss.

I am playing with the settings now; I got it to return in 100 secs (the view,
that is, that took 135 on MSSQL). My testing is using identical Dell desktops
for the MSSQL and the Linux, with a third machine for the clients.
I do not mind getting up to speed on the proper setting to optimize the
hardware, but I am worried that, as the production environment can be somewhat
dynamic, I will have issues getting an optimized environment that will
work for our needs. My whole reason for being here is that our dual
proc production MSSQL server is just no longer keeping up with the demand,
so it is important that whatever I implement is going to be up to the
challenge.
You might want to look at the overall design of the database at some 
point too. Also, don't forget the compromises you made when designing 
for MSSQL might not be useful (or might even be harmful) with PG.

> I am still convinced Postgres was the correct choice, especially
with all the guidance I have been able to get here. 100 seconds will be fine
compared to the 135 of MSSQL, I just was getting worse responses before
adjusting. At the moment I think I went too far as I see it using swap and
going slower, but it never used much of the 756 meg (137 max was all I ever
saw it use).
If you're on Linux then 135MB sounds like too much (for one client, far 
too much).

I guess the swap buffers and cache are the important settings (at least that
seems to be what is affecting the memory). Not sure exactly what would cause
it to use seq vs. index, but I will try the force and see if it helps the
speed.
Try starting with your shared-buffers at say 4000-8000 (32MB to 64MB), 
sort-mem/work-mem at 8000-32000 (8MB-32MB), random-page-cost somewhere 
between 2 and 4. Then, judge how much RAM your box is using to cache 
disk-space (free -m) and set effective-cache-size accordingly. That's it 
- you may want to play around with the figures slightly, but pick the 
lowest numbers above and restart PG and it'll run OK.
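
Written out as postgresql.conf lines, that advice is roughly (the figures are
the low end of the ranges above, not tuned values; the sort setting is
sort_mem in older releases and work_mem in newer ones):

shared_buffers = 4000          # 8 kB pages, so about 32 MB
sort_mem = 8000                # in kB
random_page_cost = 2
effective_cache_size = 32768   # 8 kB pages; base this on what "free -m" reports as cached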

--
  Richard Huxton
  Archonet Ltd
---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [SQL] same question little different test MSSQL vrs Postgres

2005-01-26 Thread Joel Fradkin
Thank you I will look at that info.
I did an EXPLAIN ANALYSE on the view and could see it was doing the seq
scan on 3 fields, so I created an index on the three fields and it then chose
an index scan and ran in 27 seconds.
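
That is, something along the lines of (the table and the second and third
column names are placeholders; the real view filters on fields not named in
this thread):

CREATE INDEX idx_tblcase_filter ON tblcase (clientnum, field2, field3);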

I also adjusted my defaults to much smaller numbers on shared buffers (per
the tidbits page recommendation, like 8 meg for my memory size). I looked at
http://www.desknow.com/kb/idx/0/061/article/ which recommended doing a
vacuum verbose to determine the exact max_fsm_pages, and I set the cache to
use 25% of my available memory per the recommendation on tidbits.
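
The max_fsm_pages procedure mentioned is roughly this (the exact wording of
the INFO output varies by release):

VACUUM VERBOSE;
-- the closing INFO lines report how many pages the free space map needs;
-- set max_fsm_pages in postgresql.conf at or above that figure and restart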

Joel Fradkin
 
Wazagua, Inc.
2520 Trailmate Dr
Sarasota, Florida 34243
Tel.  941-753-7111 ext 305
 
[EMAIL PROTECTED]
www.wazagua.com
Powered by Wazagua
Providing you with the latest Web-based technology & advanced tools.
C 2004. WAZAGUA, Inc. All rights reserved. WAZAGUA, Inc
 This email message is for the use of the intended recipient(s) and may
contain confidential and privileged information.  Any unauthorized review,
use, disclosure or distribution is prohibited.  If you are not the intended
recipient, please contact the sender by reply email and delete and destroy
all copies of the original message, including attachments.
 

 

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, January 26, 2005 11:50 AM
To: Joel Fradkin
Cc: 'Richard Huxton'; [EMAIL PROTECTED]; [email protected];
[EMAIL PROTECTED]; Steve Goldsmith
Subject: Re: [SQL] same question little different test MSSQL vrs Postgres

"Joel Fradkin" <[EMAIL PROTECTED]> writes:

> I tried the SET ENABLE_SEQSCAN=FALSE;
> And the result took 29 secs instead of 117.
> 
> After playing around with the cache and buffers etc I see I am no longer
> doing any swapping (not sure how I got the 100 sec response might have been
> shared buffers set higher, been goofing around with it all morning).

If it's swapping you're definitely going to get bad results. You really want
the *majority* of RAM left free for the OS to cache disk data.

> My worry here is it should obviously use an index scan so something is not
> setup correctly yet. I don't want to second guess the analyzer (or is this a
> normal thing?)

No that's not obvious. 22k out of 344k is a selectivity of 6.6% which is
probably about borderline. The optimizer is estimating even worse at 10.9%
which isn't far off but puts it well out of the range for an index scan.

If you really want to get postgres using an index scan you'll have to a)
improve the estimate using "alter table tblcase alter column clientnum set
statistics" to raise the statistics target for that column and reanalyze. 

And b) lower random_page_cost. random_page_cost tells postgres how much slower
indexes are than table scans and at the default setting it accurately
represents most disk hardware. If your database fits within RAM and is often
cached then you might have to lower it to model that fact. But you shouldn't
do it based on a single query like this.


-- 
greg


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: ***SPAM*** Re: [SQL] same question little different test MSSQL

2005-01-26 Thread Franco Bruno Borghesi
Maybe you should tweak the cpu_index_tuple_cost parameter instead of
disabling sequential scans. The default value is 0.001; you could
change it to a lower value (0.0005 or something).

Joel Fradkin wrote:

  I tried the SET ENABLE_SEQSCAN=FALSE;
And the result took 29 secs instead of 117.

After playing around with the cache and buffers etc I see I am no longer
doing any swapping (not sure how I got the 100 sec response might have been
shared buffers set higher, been goofing around with it all morning).

My worry here is it should obviously use an index scan so something is not
setup correctly yet. I don't want to second guess the analyzer (or is this a
normal thing?)

At least it is blowing the doors off MSSQL (which is what I touted to my boss
and was pretty upset when I got no result last night).

The 117 was before I forced the seq off, so even doing a seq scan I am getting
results now that are better than MSSQL.

Joel Fradkin
 
Wazagua, Inc.
2520 Trailmate Dr
Sarasota, Florida 34243
Tel.  941-753-7111 ext 305
 
[EMAIL PROTECTED]
www.wazagua.com
Powered by Wazagua
Providing you with the latest Web-based technology & advanced tools.
C 2004. WAZAGUA, Inc. All rights reserved. WAZAGUA, Inc
 This email message is for the use of the intended recipient(s) and may
contain confidential and privileged information.  Any unauthorized review,
use, disclosure or distribution is prohibited.  If you are not the intended
recipient, please contact the sender by reply email and delete and destroy
all copies of the original message, including attachments.
 

 

-Original Message-
From: Richard Huxton [mailto:[email protected]] 
Sent: Wednesday, January 26, 2005 10:21 AM
To: Joel Fradkin
Cc: [EMAIL PROTECTED]; [email protected]
Subject: Re: [SQL] same question little different test MSSQL vrs Postgres

Joel Fradkin wrote:
Well last evening (did not try it this morning) it was taking the extra
time.

I have made some adjustments to the config file per a few web sites that you
all recommended my looking at.

The crucial one I'd say is the performance guide at:
   http://www.varlena.com/varlena/GeneralBits/Tidbits/index.php
The first half-dozen settings are the crucial ones.

It is now using 137 of 756 meg avail.
it is now taking 8 secs to return 22,000 rows (using pgadminIII in a sql
edit window).

That might be too much RAM. Don't forget PG likes to work with your 
operating-system (unlike many other DBs). Make sure Windows is using 
enough RAM to cache diskspace.
I'm curious as to how this takes 8secs whereas you had 1 second earlier. 
Are you sure some of this isn't pgadmin's overhead to display the rows?

The EXPLAIN ANALYSE still shows the same as below, but the table has 344,000
recs of which only 22636 are clientnum = 'SAKS'

That sounds like it's about the borderline between using an index and 
not (depending on cache-size, disk speeds etc).


I am still doing a seq search (this applies to the view question where if it
is a small result set it used an index search but on a larger return set it
did a seq search) in my view, but with the adjustments to the kernel I get a
result in 140 secs (MSSQL was 135 secs).

If you want to check whether the index would help, try issuing the 
following before running your query:
   SET ENABLE_SEQSCAN=FALSE;
This will force PG to use any index it can regardless of whether it 
thinks it will help.

This is not production, I am still very worried that I have to do all this
tweaking to use this, MSSQL worked out of the box as it does (not saying it's
great, but I never had to adjust a kernel setting etc). Since we cannot
afford the 70,000 dollars they want to license it I am not implying I can
use MSSQL, but I could look at other DB's like MYSQL, or Firebird, etc.

I'm a little curious what kernel settings you are changing on Windows. I 
wasn't aware there was much to be done there.

I'm afraid you do have to change half a dozen settings in 
postgresql.conf to match your workload, but PG runs on a much wider 
range of machines than MSSQL so it's difficult to come up with a 
"reasonable" default. Takes me about 5 minutes when I setup an 
installation to make sure the figures are reasonable (rather than the 
best they can be).

I have a lot of time now (two weeks) in this conversion and do not wish to
give up, I will see if I can learn what is needed to get the maximum
performance. I have seen much information available and this list has been a
huge resource. I really appreciate all the help.

--
   Richard Huxton
   Archonet Ltd


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq



Re: [SQL] working with multidimensional arrays in plpgsql

2005-01-26 Thread Josh Berkus
Sibtay,

> As you might have observed here, the actual problem is
> how to do assignment to multidimensional array locations using the
> subscript operater.

Maybe post your results, too?


-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [SQL] same question little different test MSSQL vrs Postgres

2005-01-26 Thread Richard Huxton
Joel Fradkin wrote:
Thank you I will look at that info.
I did do an EXPLAIN ANALYSE on the view and could see it was doing the seq
scan on 3 fields, so I did an index for the three fields and it then chose
an index scan and ran in 27 seconds.
I also did adjust my defaults to much smaller numbers on shared buffers (per
the tidbits page recommendation like 8 meg for my memory size). I looked at
http://www.desknow.com/kb/idx/0/061/article/ which recommended doing a
vacuum verbose to determine the exact max_fsm_pages and I set the cache to
use 25% of my available memory per the recommendation on tid bits.
Note that the effective_cache_size (if I've spelt it right) just tells 
PG what your cache size is. You should set it based on what "free" tells 
you about your system's use of memory.
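
For instance, if "free -m" reported roughly 520 MB in its "cached" column (an
invented figure), the matching setting would be on the order of:

effective_cache_size = 66560   # 520 MB cached / 8 kB per page = 66560 pages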

--
  Richard Huxton
  Archonet Ltd
---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [SQL] datediff is there something like it?

2005-01-26 Thread Bruno Wolff III
On Tue, Jan 25, 2005 at 10:11:40 -0500,
  Joel Fradkin <[EMAIL PROTECTED]> wrote:
> Hi all working my way through our views and all is going very well.
> 
> We use datediff in MSSQL a bit and I read about the field1::date -
> field2::date to return the days numerically.
> 
> Is there any way to get months and years besides guessing days / 30 for
> months etc?

The age function will give a difference in years-months in addition to
days-hours-minutes-seconds.
Depending on what you want, it may not do exactly what you want.
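
For example (the dates are arbitrary):

SELECT age(timestamp '2005-01-26', timestamp '2003-11-05');
-- 1 year 2 mons 21 days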

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [SQL] same question little different test MSSQL vrs Postgres

2005-01-26 Thread Bruno Wolff III
On Tue, Jan 25, 2005 at 21:21:08 -0700,
  Dennis Sacks <[EMAIL PROTECTED]> wrote:
> 
> One of the things you'll want to do regularly is run a "vacuum analyze". 
> You can read up on this in the postgresql docs. This is essential to the 
> indexes being used properly. At a bare minimum, after you import a large 
> amount of data, you'll want to run vacuum analyze.

Note that there is no need to vacuum after inserts (only updates and deletes),
so you can just do an analyze in that case.
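
In other words, after a bulk load something like this is enough (table and
file names are made up):

COPY big_table FROM '/tmp/big_table.copy';
ANALYZE big_table;   -- refresh planner statistics; no VACUUM needed after pure inserts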

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [SQL] same question little different test MSSQL vrs Postgres

2005-01-26 Thread Dennis Sacks
Bruno Wolff III wrote:
On Tue, Jan 25, 2005 at 21:21:08 -0700,
 Dennis Sacks <[EMAIL PROTECTED]> wrote:
 

One of the things you'll want to do regularly is run a "vacuum analyze". 
You can read up on this in the postgresql docs. This is essential to the 
indexes being used properly. At a bare minimum, after you import a large 
amount of data, you'll want to run vacuum analyze.
   

Note that there is no need to vacuum after inserts (only updates and deletes),
so you can just do an analyze in that case.
 

Good point! Analyze after bulk inserts, vacuum analyze after 
updates/deletes and inserts. :)

Dennis Sacks
---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [SQL] testing/predicting optimization using indexes

2005-01-26 Thread PFC

I'm quite happy with the speedup in 3, but puzzled over the slowdown in  
2.
Could you provide :
- SELECT count(*) FROM structure;
=> NRows
- SELECT avg(length(smiles)) FROM structure;
Then VACUUM FULL ANALYZE structure
Redo your timings and this time post EXPLAIN ANALYZE
Also your query returns 1313 rows, so can you post:
EXPLAIN ANALYZE SELECT oe_matches(smiles,'c1c1CC(=O)NC') FROM  
structure;
=> time T1
EXPLAIN ANALYZE SELECT smiles FROM structure;
=> time T2

(T1-T2)/(NRows) will give you an estimate of the time spent in each  
oe_matches call.

	Also note that for postgres (a,b) > (c,d) means ((a>c) and (b>d)), which  
can be misleading, but I think that's what you wanted.

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [SQL] testing/predicting optimization using indexes

2005-01-26 Thread TJ O'Donnell
I was puzzled as to why my search slowed down when I added columns.
The VACUUM did not restore the former speed,
which I had obtained before adding the columns.
So, I rebuilt the table with only the smiles column and my original
speed was again obtained (not surprising).
After I added the extra columns, it slowed down again.
Finally, I built the table with all the additional columns created
during the initial creation of the table.  The original speed was obtained!
I conclude that the addition of columns after building all the rows of
a table somehow makes the table access less efficient.  Is this generally
true?  Is there a more efficient way to add columns to a table after its
initial construction?
The secondary issue was one of using an index on the additional columns.
This greatly speeds up the overall search, by limiting the number of
rows needing to use oe_matches.  I am currently working on optimizing the
number and nature of these extra columns.  However, my initial question
still remains.  Once I find a good set of columns to use as an index,
will I then get even greater speed by defining a new data type and an
index method equivalent to my multi-column index?
Here are the data you requested.  I think this is less important now that
I know I should create all my columns from the beginning.
Thanks for the tip on how to compute average time spent in my
oe_matches functions.  This will be very useful for future optimization.
SELECT count(*) FROM structure
237597
SELECT avg(length(smiles)) FROM structure
37.6528912402092619
VACUUM FULL ANALYZE structure
(no output)
EXPLAIN ANALYZE SELECT oe_matches(smiles,'c1c1CC(=O)NC') FROM  structure
Seq Scan on structure  (cost=0.00..7573.96 rows=237597 width=41) (actual 
time=17.443..15025.974 rows=237597 loops=1)
Total runtime: 16786.542 ms
EXPLAIN ANALYZE SELECT smiles FROM structure
Seq Scan on structure  (cost=0.00..6979.97 rows=237597 width=41) (actual 
time=0.067..735.884 rows=237597 loops=1)
Total runtime: 1200.661 ms
TJ
PFC wrote:

I'm quite happy with the speedup in 3, but puzzled over the slowdown 
in  2.
Could you provide :
- SELECT count(*) FROM structure;
=> NRows
- SELECT avg(length(smiles)) FROM structure;
Then VACUUM FULL ANALYZE structure
Redo your timings and this time post EXPLAIN ANALYZE
Also your query returns 1313 rows, so wan you post :
EXPLAIN ANALYZE SELECT oe_matches(smiles,'c1c1CC(=O)NC') FROM  
structure;
=> time T1
EXPLAIN ANALYZE SELECT smiles FROM structure;
=> time T2

(T1-T2)/(NRows) will give you an estimate of the time spent in each  
oe_matches call.

Also note that for postgres (a,b) > (c,d) means ((a>c) and (b>d)), 
which  can be misleading, but I think that's what you wanted.
---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [SQL] same question little different test MSSQL vrs Postgres

2005-01-26 Thread Andrew Hammond
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1
We've done some pretty extensive benchmarking and load testing on a
couple of platforms including the Xeon and Opteron. You may have already
bought that Dell box, but I'll say it anyway. Xeon quad processors are a
terrible platform for postgres. Trying to use more than 4GB of memory on
a 32 bit machine is a waste of money.
If you want performance, get a quad Opteron with the same amount of
memory. I guarantee you'll see at least an order of magnitude
performance improvement and substantially more under highly concurrent
loads. If you decide to go this way, HP sells a very nice box. I also
strongly recommend you investigate SuSE instead of RedHat. Fedora core
is good technology, but SuSE offers equally good technology with better
support.
Also make sure that your SCSI HBA is actually using the 64 bit PCI bus.
There are cards out there which plug into 64 bit PCI but only actually
address 32 bits (Qlogic's QLA2340 / 2342 for example).
You make no mention of the disk subsystem you plan to use. This is most
critical part of your system. Database performance is almost always
bound by IO. Usually disk IO. Briefly, put PGDATA on the widest RAID 10
array of disks you can manage. It's not worth spending the extra money
to get 15kRPM disks for this. The size of the disks involved is pretty
much irrelevant, only the number of them matters. Put the WAL files on a
dedicated RAID 1 pair of 15kRPM disks. Put the postgres log files (or
syslog) on a seperate filesystem.
- --
Andrew Hammond416-673-4138[EMAIL PROTECTED]
Database Administrator, Afilias Canada Corp.
CB83 2838 4B67 D40F D086 3568 81FC E7E5 27AF 4A9A
Joel Fradkin wrote:
| The postgres is running on Linux Fedora core 3 (production will be
redhat on
| Dell 4 proc 8 gig box).
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.5 (GNU/Linux)
iD8DBQFB+BaPgfzn5SevSpoRAgirAKDBbedScL3leQVidZjmsGmxoph8wQCgvhoW
2ZznEkxOMA3btZEBdzHd8TU=
=eg7h
-END PGP SIGNATURE-
---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [SQL] testing/predicting optimization using indexes

2005-01-26 Thread PFC

Finally, I built the table with all the additional columns created
during the initial creation of the table.  The original speed was  
obtained!
	Quite strange !
	Did you vacuum full ? analyze ? Did you set a default value for the  
columns ? mmm maybe it's not the fact of adding the columns, but the  
fact of filling them with values, which screws up the vacuum if your fsm  
setting is too small ?
	Try vacuum verbose, good luck parsing the results ;)

The secondary issue was one of using an index on the additional columns.
This greatly speeds up the overall search, by limiting the number of
rows needing to use oe_matches.  I am currently working on optimizing the
number and nature of these extra columns.  However, my initial question
still remains.  Once I find a good set of columns to use as an index,
will I then get even greater speed by defining a new data type and an
index method equivalent to my multi-column index?
	You'll know that by counting the rows matched by the pre-filter (your  
columns), counting the rows actually matched, which will give you the  
number of calls to oe_match you saved, then look at the mean time for  
oe_match...

SELECT count(*) FROM structure
237597
SELECT avg(length(smiles)) FROM structure
37.6528912402092619
	Well, your rows have 26 bytes header + then about 45 bytes of TEXT, and 4  
bytes per integer column... I don't think the bytes spent in your columns  
are significant... They could have been if your smiles string had been  
shorter.

EXPLAIN ANALYZE SELECT oe_matches(smiles,'c1c1CC(=O)NC') FROM   
structure
Seq Scan on structure  (cost=0.00..7573.96 rows=237597 width=41) (actual  
time=17.443..15025.974 rows=237597 loops=1)
Total runtime: 16786.542 ms

EXPLAIN ANALYZE SELECT smiles FROM structure
Seq Scan on structure  (cost=0.00..6979.97 rows=237597 width=41) (actual  
time=0.067..735.884 rows=237597 loops=1)
Total runtime: 1200.661 ms

	OK so it takes 1.2 secs to actually read the data, and 16.8 secs to run  
oe_match... so a call is about 65 microseconds...  Note that this time  
could depend a lot on the smiles column and also on the query string !

	What you need now is to estimate the selectivity of your pre filtering  
columns, to be able to select the best possible columns : for various  
smiles queries, compute the row count which gets past the filter, and the  
row count that actually matches the oe_match. Ideally you want the first  
to be as close as possible to the second, but for your test query, as you  
return 0.5% of the table, even an inefficient pre-filter which would let  
10% of the rows through would yield a 10x speed improvement. You'd want to  
get below the 2-3% bar so that postgres will use an index scan, which will  
be even faster. Don't forget to do a sanity-check that all the rows that  
match your smiles query also match your columns filter !

	Also, using several columns (say a,b,c,d) is not optimal. Say a,b,c,d  
each contain integers between 0 and 10 with linear distribution ; then a  
query starting with 'a>=0' will automatically match more than 90% of the  
data and not use the index. You'll get a seq scan. So, either you can  
always get your first column very selective, or you'll have to use a gist  
index and integer arrays.
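
If you go the integer-array route, the contrib intarray module is the usual
tool; a sketch, assuming contrib/intarray is installed and the filter columns
are packed into an int[] column called features (an invented name):

CREATE INDEX structure_features_gist ON structure USING gist (features gist__int_ops);
SELECT smiles FROM structure WHERE features @@ '1 & 3 & 7'::query_int;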

	If you get times that you like, then you're done ; else there may be  
another path for optimization, getting your hands dirty in the code, but  
not to the point of creating index types :

	You'll have noted that the 'c1c1CC(=O)NC' string gets reparsed for  
every processed row. You should benchmark how much time is lost in this  
parsing. You probably won't be able to do this with postgres (maybe  
matching 'c1c1CC(=O)NC' with an empty smiles string ?), so you may  
have to call the C++ functions directly.
	If this time is significant, you might want to create a datatype which  
will contain a compiled query string. You'll have to write a few C  
functions for that (dont ask me) but it should be a lot simpler than  
coding a new index type. Then you'd create a special version of oe_match  
which would take a precompiled query string. Depending on the time  
necessary to parse it, it may work.






---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


[SQL] Rule problem with OLD / NEW record set

2005-01-26 Thread Ralph Graulich
Hello everyone,
given is a table with a version history kind of thing I am currently 
working on. Upon this table there is a view and the application interacts 
with the view only, updating/inserting/deleting is controlled by rules. It 
seems like the record set "OLD" gets changed when it is used in a SQL 
expression:

CREATE TABLE table1
  (
  id INTEGER NOT NULL,
  version INTEGER NOT NULL DEFAULT 0,
  vnoflag CHAR(1),
  content VARCHAR(20)
  );
INSERT INTO table1 (id, version, vnoflag, content) VALUES (1, 1, 'Y', 
'Test');

CREATE VIEW view_table1 AS SELECT * FROM table1;
-- create a rule for update
CREATE OR REPLACE RULE ru_view_table1_update
AS
ON UPDATE TO view_table1 DO INSTEAD
  (
  -- insert a new record with the old id, old version number incremented
  -- by one, versionflag set to 'Y' and the new content
  INSERT INTO table1 (id, version, vnoflag, content) VALUES (OLD.id, 
OLD.version+1, 'Y', NEW.content);
  -- update the old version and set its versionflag to 'N' as it is no
  -- longer the current record
  UPDATE table1 SET vnoflag = 'N' WHERE id = OLD.id AND version = 
OLD.version;
  );

SELECT * FROM view_table1;
 id | version | vnoflag | content
----+---------+---------+---------
  1 |   1 | Y   | Test
(1 row)
UPDATE view_table1 SET content = 'New Test' WHERE id = 1 AND vnoflag = 
'Y';
SELECT * FROM view_table1;
 id | version | vnoflag | content
----+---------+---------+----------
  1 |   1 | N   | Test
  1 |   2 | N   | New Test

It seems like the UPDATE statement updates both the old and the new 
version. If I correctly go through the statements by hand, they should 
read:

INSERT INTO table1 (id, version, vnoflag, content) VALUES (1, 1+1, 'Y', 
'New Test');
UPDATE table1 SET vnoflag = 'N' WHERE id = 1 AND version = 1;

If I change the UPDATE statement to read:
  UPDATE table1 SET vnoflag = 'N' WHERE id = OLD.id AND version = NEW.version-1;
it works as expected:
 id | version | vnoflag | content
----+---------+---------+----------
  1 |   2 | Y   | New Test
  1 |   1 | N   | Test
Where is my logical error? Shouldn't the first UPDATE statement suffice?
Best regards
... Ralph ...
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faq


Re: [SQL] Moving from Transact SQL to PL/pgSQL

2005-01-26 Thread Clint Stotesbery
For Oracle info, check out one of my recent posts:
http://archives.postgresql.org/pgsql-sql/2005-01/msg00231.php
For TSQL, well, I was involved in project where we converted an Oracle db 
(with procs, functions, triggers, etc) to PostgreSQL and MS Sql Server. 
plpgsql and plsql are close enough that it isn't too hard to convert 
between the two. TSQL and plpgsql are quite different.
-Clint

Original Message Follows
From: "Kevin Duffy" <[EMAIL PROTECTED]>
To: 
Subject: [SQL] Moving from Transact SQL to PL/pgSQL
Date: Mon, 24 Jan 2005 12:14:22 -0500
Hello:
I am starting a project using Postgres.  The requirements are very similar 
to work I have done in the past using M$-SQL.  Therefore, there are many 
Transact SQL stored procedures I need to port over to PL/pgSQL.

Where would I find documentation on PL/pgSQL, with examples?
How close to Oracle PL-SQL is Postgres?  Would a Oracle PL-SQL book cover 
the basics?

Remember, be kind to the newbie.
Kevin Duffy

---(end of broadcast)---
TIP 8: explain analyze is your friend