FW: [PERFORM] FW: Index usage

2004-11-24 Thread BBI Edwin Punzalan

Thanks, Tim.

I tried adding an upper limit and it's still the same, as follows:

==
db=# explain analyze select date from chatlogs where date>='11/24/04';
NOTICE:  QUERY PLAN:

Index Scan using chatlogs_date_idx on chatlogs  (cost=0.00..145.72 rows=37
width=4) (actual time=0.18..239.69 rows=10737 loops=1)
Total runtime: 246.22 msec

EXPLAIN
db=# explain analyze select date from chatlogs where date>='11/23/04' and
date<'11/24/04';
NOTICE:  QUERY PLAN:

Seq Scan on chatlogs  (cost=0.00..24763.19 rows=9200 width=4) (actual
time=0.44..4447.01 rows=13029 loops=1)
Total runtime: 4455.56 msec

EXPLAIN
db=# explain analyze select date from chatlogs where date>='11/23/04' and
date<'11/25/04';
NOTICE:  QUERY PLAN:

Seq Scan on chatlogs  (cost=0.00..24763.19 rows=9200 width=4) (actual
time=0.45..4268.00 rows=23787 loops=1)
Total runtime: 4282.81 msec
==

How come the query filtering on the current date uses an index while the others
do not?  This makes using indexes to speed up queries quite difficult.

-Original Message-
From: Leeuw van der, Tim [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, November 24, 2004 3:35 PM
To: BBI Edwin Punzalan; [EMAIL PROTECTED]
Subject: RE: [PERFORM] FW: Index usage


Well you just selected a whole lot more rows... What's the total number of
rows in the table?

In general, what I remember from reading on the list, is that when there's
no upper bound on a query like this, the planner is more likely to choose a
seq. scan than an index scan. Try to give your query an upper bound like:

select date from chatlogs where date>='11/23/04' and date < '12/31/99';

select date from chatlogs where date>='10/23/04' and date < '12/31/99';

This should make it easier for the planner to give a proper estimate of the
number of rows returned. If it doesn't help yet, please post 'explain
analyze' output rather than 'explain' output, for it allows much better
investigation into why the planner chooses what it chooses.
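
For example, something like this (the dates here are just placeholders to
adjust to your data):

    explain analyze select date from chatlogs
      where date >= '11/23/04' and date < '11/24/04';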

cheers,

--Tim


-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] Behalf Of BBI Edwin
Punzalan
Sent: Wednesday, November 24, 2004 7:52 AM
To: [EMAIL PROTECTED]
Subject: [PERFORM] FW: Index usage



Hi everyone,

Can anyone please explain postgres' behavior on our index.

I did the following query tests on our database:


db=# create index chatlogs_date_idx on chatlogs (date);
CREATE
db=# explain select date from chatlogs where date>='11/23/04';
NOTICE:  QUERY PLAN:

Index Scan using chatlogs_date_idx on chatlogs  (cost=0.00..144.11 rows=36
width=4)

EXPLAIN
db=# explain select date from chatlogs where date>='10/23/04';
NOTICE:  QUERY PLAN:

Seq Scan on chatlogs  (cost=0.00..23938.06 rows=253442 width=4)

EXPLAIN

The column's datatype is date.  It's just odd that when I merely change the actual
date in the search, the index is no longer used.


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Data type to use for primary key

2004-11-24 Thread Greg Stark

Tom Lane [EMAIL PROTECTED] writes:

 Greg Stark [EMAIL PROTECTED] writes:
  This could hurt if they ever reuse an old previously retired postal code,
  which isn't an entirely impossible case. As far as I know it hasn't happened
  yet though.
 
 One would suppose that the guys who are in charge of this point at
 Canada Post consider the postal code to be their primary key, and
 are no more eager to reuse one than you are to see it reused.

Well, they may eventually be forced to, for the same sort of hierarchical issue
that causes the shortage of IPv4 address space even though there are far fewer
than 4 billion hosts online.

But as far as I can see today, the only postal codes being retired are for
rural areas that are being developed and are having blocks of codes assigned
instead of a single broad code.

 Basically this comes down to I'm going to use some externally supplied
 primary key as my primary key.  Do I trust the upstream DBA to know what
 a primary key is?

Well there's another issue here I think. Often people see something that looks
unique and is clearly intended to be a primary key and think "aha, nice
primary key", but they miss a subtle distinction between what the external
primary key represents and what their data model is tracking.

The typical example is social security numbers. SSNs are a perfectly
reasonable primary key -- as long as you're tracking Social Security accounts,
not people. Most people in the US have exactly one SS account, so people often
think it looks like a primary key for people. In fact not everyone has a
Social Security account (aliens who have never worked in the US, or for that
matter people who have never been in the US) and others have had multiple
Social Security accounts (victims of identity theft).

Another example that comes to mind is the local telephone company. When I
changed my phone number they created a new account without telling me, because
their billing system's primary key for accounts is... the phone number. So all
my automated bill payments started disappearing into the black hole of the old
account and my new account went negative. I wonder what they do for customers
who buy services from them but don't have a telephone line.

-- 
greg


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] FW: Index usage

2004-11-24 Thread gnari
From: BBI Edwin Punzalan [EMAIL PROTECTED]

 db=# explain analyze select date from chatlogs where date>='11/23/04' and
 date<'11/25/04';
 NOTICE:  QUERY PLAN:
 
 Seq Scan on chatlogs  (cost=0.00..24763.19 rows=9200 width=4) (actual
 time=0.45..4268.00 rows=23787 loops=1)
 Total runtime: 4282.81 msec
 ==
 
 How come a query on the current date filter uses an index and the others
 does not?  This makes indexing to speed up queries quite difficult.

have you ANALYZED the table lately?
what version of postgres are you using?

gnari





---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] FW: Index usage

2004-11-24 Thread BBI Edwin Punzalan

Yes, the database is being vacuumed and analyzed on a daily basis.

Our version is 7.2.1

-Original Message-
From: gnari [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, November 24, 2004 4:35 PM
To: BBI Edwin Punzalan; [EMAIL PROTECTED]
Subject: Re: [PERFORM] FW: Index usage


From: BBI Edwin Punzalan [EMAIL PROTECTED]

 db=# explain analyze select date from chatlogs where date>='11/23/04' 
 and date<'11/25/04';
 NOTICE:  QUERY PLAN:
 
 Seq Scan on chatlogs  (cost=0.00..24763.19 rows=9200 width=4) (actual 
 time=0.45..4268.00 rows=23787 loops=1) Total runtime: 4282.81 msec
 ==
 
 How come a query on the current date filter uses an index and the 
 others does not?  This makes indexing to speed up queries quite 
 difficult.

have you ANALYZED the table lately?
what version of postgres are you using?

gnari




---(end of broadcast)---
TIP 8: explain analyze is your friend


[PERFORM] Unsubscribe

2004-11-24 Thread songtebo
   Unsubscribe




---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: FW: [PERFORM] FW: Index usage

2004-11-24 Thread Richard Huxton
BBI Edwin Punzalan wrote:
Thanks, Tim.
I tried adding an upper limit and it's still the same, as follows:
==
db=# explain analyze select date from chatlogs where date>='11/24/04';
NOTICE:  QUERY PLAN:
Index Scan using chatlogs_date_idx on chatlogs  (cost=0.00..145.72 rows=37
width=4) (actual time=0.18..239.69 rows=10737 loops=1)
Total runtime: 246.22 msec
EXPLAIN
db=# explain analyze select date from chatlogs where date>='11/23/04' and
date<'11/24/04';
NOTICE:  QUERY PLAN:
Seq Scan on chatlogs  (cost=0.00..24763.19 rows=9200 width=4) (actual
time=0.44..4447.01 rows=13029 loops=1)
Total runtime: 4455.56 msec
We have two issues here
1. In the first example it only picks an index because it thinks it is 
going to get 37 rows; it actually gets 10737.
2. It's taking 4455ms to run a seq-scan but only 246ms to run an 
index-scan over 10737 rows (and then fetch the rows too).

Questions:
1. How many rows do you have in chatlogs?
2. Is this the only problem you are experiencing, or just one from many?
3. Have you tuned any configuration settings? e.g. as suggested in:
   http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html
--
  Richard Huxton
  Archonet Ltd
---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: FW: [PERFORM] FW: Index usage

2004-11-24 Thread BBI Edwin Punzalan

Hi.

1) The chatlogs row count increases every now and then (it's in a live
environment) and it currently has 538,696 rows.
2) This is the only problem we have experienced.  So far, all our other indexes
are being used correctly.
3) I don't remember tuning any post-installation configuration of our
PostgreSQL except setting fsync to false.

Thanks for taking a look at our problem. :D

-Original Message-
From: Richard Huxton [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, November 24, 2004 6:17 PM
To: BBI Edwin Punzalan
Cc: [EMAIL PROTECTED]
Subject: Re: FW: [PERFORM] FW: Index usage


BBI Edwin Punzalan wrote:
 Thanks, Tim.
 
 I tried adding an upper limit and it's still the same, as follows:
 
 ==
 db=# explain analyze select date from chatlogs where date>='11/24/04';
 NOTICE:  QUERY PLAN:
 
 Index Scan using chatlogs_date_idx on chatlogs  (cost=0.00..145.72 
 rows=37
 width=4) (actual time=0.18..239.69 rows=10737 loops=1)
 Total runtime: 246.22 msec
 
 EXPLAIN
 db=# explain analyze select date from chatlogs where date>='11/23/04' 
 and date<'11/24/04';
 NOTICE:  QUERY PLAN:
 
 Seq Scan on chatlogs  (cost=0.00..24763.19 rows=9200 width=4) (actual 
 time=0.44..4447.01 rows=13029 loops=1) Total runtime: 4455.56 msec

We have two issues here
1. In the first example it only picks an index because it thinks it is 
going to get 37 rows; it actually gets 10737.
2. It's taking 4455ms to run a seq-scan but only 246ms to run an 
index-scan over 10737 rows (and then fetch the rows too).

Questions:
1. How many rows do you have in chatlogs?
2. Is this the only problem you are experiencing, or just one from many?
3. Have you tuned any configuration settings? e.g. as suggested in:
http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html

-- 
   Richard Huxton
   Archonet Ltd


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


[PERFORM] Postgres vs. MySQL

2004-11-24 Thread Evilio del Rio

Hi,

I have installed the dspam filter
(http://www.nuclearelephant.com/projects/dspam) on our mail server
(RedHat 7.3 Linux with sendmail 8.13 and procmail). I have ~300 users
with a quite low traffic of 4000 messages/day. So it's a quite common
platform/environment, nothing spectacular.

At first I tried the Postgres interface that was already installed
for other applications.  Whenever I begin to train and/or filter
messages through dspam, the performance is incredibly bad. The first messages
are OK, but soon the filter time begins to increase to about 30 seconds
or more!

...so I looked for some optimization both for the linux kernel and the
postgres server.  Nothing has worked for me. I always have the same
behavior. For isolation purposes I started using another server just to
hold the dspam database and nothing else. No matter what I do, postgres
gets slower and slower with each new message fed or filtered.

Several strategies have failed: newest RPMs from postgresql.org,
pg_autovacuum, etc.

I finally tried the MySQL driver.

I have only just started using this tool for dspam, so I am a newcomer
to MySQL.

The result: after some preparation in configuring some parameters for
mysqld (with the QuickStart Guide from mysql.com) all works fine!

It's incredible! The same servers, the same messages, the same dspam
compilation (well, each one with the corresponding
--with-storage-driver=*sql_drv). Postgres is taking worse than
30s/message and MySQL processes the same in less than a second.

I can bring the Postgres server to its knees just by corpus-feeding one single
short message to each user (it takes hours to process 300 users!).

On the other hand, MySQL only takes a few minutes to process the same
batch.

I do not want to start a flame war over Postgres (I have always preferred it for
its capabilities) but I am absolutely impressed by MySQL (I have seen
the light!)

Please, could anyone explain this difference to me?
Is Postgres that bad?
Is MySQL that good?
Am I the only one to have observed this behavior?

TIA.

Cheers,


Evilio Jose del Rio Silvan   Centre Mediterrani d'Investigacions
[EMAIL PROTECTED]Marines i Ambientals
Microsoft sells you Windows, Linux gives you the whole house - Anonymous


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Postgres vs. MySQL

2004-11-24 Thread Steinar H. Gunderson
On Wed, Nov 24, 2004 at 02:14:18PM +0100, Evilio del Rio wrote:
 It's incredible! the same servers, the same messages, the same dspam
 compilation (well each one with the corresponding
 --with-storage-driver=*sql_drv). Postgres is getting worst than
 30s/message and MySQL process the same in less than a second.

AFAIK dspam is heavily optimized for MySQL and not optimized for PostgreSQL
at all; I believe there would be significant performance boosts available 
by fixing dspam.

Example queries that are slow, as well as table schemas, would probably help
a lot in tracking down the problems.

/* Steinar */
-- 
Homepage: http://www.sesse.net/

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Postgres vs. MySQL

2004-11-24 Thread Richard Huxton
Evilio del Rio wrote:
Please, could anyone explain me this difference?
Is Postgres that bad?
Is MySQL that good?
Am I the only one to have observed this behavior?
Do you have any record of configuration, system hardware, usage 
patterns, queries executed?

If you can tell us what you tried (and why) then we might be able to 
help, otherwise there's not much information here.

--
  Richard Huxton
  Archonet Ltd
---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [PERFORM] Postgres vs. MySQL

2004-11-24 Thread Christian Fowler
As for performance, lots of others will probably volunteer tips and 
techniques. In my experience, properly written and tuned applications will 
show only minor speed differences. I have seen several open-source apps 
that support postgres but are not well tested on it. Query optimization 
can cause orders of magnitude performance differences. It sounds like maybe 
dspam is in this bucket?


Please, could anyone explain me this difference?
Is Postgres that bad?
Is MySQL that good?
Am I the only one to have observed this behavior?
I made a little chart about these about a year ago:
http://www.tikipro.org/wiki/index.php?page=DatabaseComparison
If speed is what you need, and data integrity / safety is not, then MySQL 
may be a good choice. (Aggregate statistics tables and other such 
calculated denormalizations).

IMHO, if all you need is dspam running *now*, then I'd say MySQL might be a 
good choice. If you ever need to run a DB application where data integrity 
is mission critical, then postgres is at the top of my list.

[ \ /
[ X   Christian Fowler  | spider AT viovio.com
[ / \   http://www.viovio.com | http://www.tikipro.org
---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [PERFORM] Data type to use for primary key

2004-11-24 Thread Alexandre Leclerc
On 24 Nov 2004 01:52:52 -0500, Greg Stark [EMAIL PROTECTED] wrote:
 Alexandre Leclerc [EMAIL PROTECTED] writes:
 
  Thanks for those tips. I'll print and keep them. So in my case, the
  product_code being varchar(24) is:
  4 bytes + string size (so possibly up to 24) = possible 28 bytes. I
  did the good thing using a serial. For my shorter keys (4 bytes + up
  to 6 char) I will use the natural key.
 
 Realize that space usage is really only part of the issue.

Thank you for this additional information. This will help out in the
future. In my situation it is a good thing to have integer keys where
I decided to have them, even if I was obliged to add UNIQUE
constraints to some other columns. I think they call this a candidate
key and it's still 3NF (whatever; but only if my db is correctly
organised)... I try to be logical and efficient for good performance.
But in the end, time (the db will get bigger) and good EXPLAIN
ANALYZE commands will help with fine tuning later! This will give me more
experience at that point.

 Actually I see one interesting exception to my policy in my current database
 schema. And I don't think I would do this one differently given the choice
 either. The primary key of the postal code table is the postal code. (postal
 codes are up here in the great white north like zip codes down there.)

(I do understand this one, living in the province of Quebec. ;) And
the great white north has still not arrived at the end of November! Still, not
very exceptional.)

Regards.

-- 
Alexandre Leclerc

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] Postgres vs. MySQL

2004-11-24 Thread Joshua D. Drake
Evilio del Rio wrote:
Hi,
I have installed the dspam filter
(http://www.nuclearelephant.com/projects/dspam) on our mail server
(RedHat 7.3 Linux with sendmail 8.13 and procmail). I have ~300 users
with a quite low traffic of 4000 messages/day. So it's a quite common
platform/environment, nothing spectacular.
 

The problem is definitely dspam. We have been through their code.
The new version is much, much better than the older one but I am sure
there is more work to be done.
The first version we installed suffered from a well known problem:
It would use smallint/bigint but would not cast or quote the
where clauses and thus PostgreSQL would never use the indexes.
It was also missing several indexes on appropriate columns.
We offered some advice and we know that some of it was taken but
we don't know which.
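
To illustrate the sort of thing we saw (a sketch only; the table and
column names are invented, not dspam's actual schema): on these
PostgreSQL versions an unquoted integer literal is resolved as int4 and
will not match an index on an int8 column, while a quoted or cast
literal will:

  -- 'token' here is a hypothetical bigint (int8) column with an index on it
  SELECT uid FROM tokens WHERE token = 12345;          -- int4 vs int8: index not used
  SELECT uid FROM tokens WHERE token = '12345';        -- literal takes the column's type: index used
  SELECT uid FROM tokens WHERE token = 12345::int8;    -- explicit cast: index used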
Sincerely,
Joshua D. Drake

--
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - [EMAIL PROTECTED] - http://www.commandprompt.com
PostgreSQL Replicator -- production quality replication for PostgreSQL


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] memcached and PostgreSQL

2004-11-24 Thread Jim C. Nasby
If instead of a select you do a SELECT FOR UPDATE, I think this would be
transaction safe. Nothing would be able to modify the data in the
database between when you do the SELECT and when you commit. If the
transaction fails the value in memcached will be correct.

Also, it's not clear if you're doing an update here or not... If you're
doing an update then this wouldn't work. You'd want to do your update,
then re-insert the value into memcached outside of the update
transaction.
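
A rough sketch of that pattern (the table, column and key names are made
up for illustration):

  BEGIN;
  -- lock the row so nothing can modify it until we commit
  SELECT val FROM app_settings WHERE key = 'k1' FOR UPDATE;
  -- the client adds ('k1', val) to memcached here, while the lock is held
  COMMIT;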

On Tue, Nov 23, 2004 at 02:20:34PM -0800, Sean Chittenden wrote:
 My point was that there are two failure cases --- one where the cache 
 is
 slightly out of date compared to the db server --- these are cases 
 where
 the cache update is slightly before/after the commit.
 
 I was thinking about this and ways to minimize this even further.  Have 
 memcache clients add data and have a policy to have the database only 
 delete data.  This sets the database up as the bottleneck again, but 
 then you have a degree of transactionality that couldn't be previously 
 achieved with the database issuing replace commands.  For example:
 
 1) client checks the cache for data and gets a cache lookup failure
 2) client begins transaction
 3) client SELECTs data from the database
 4) client adds the key to the cache
 5) client commits transaction
 
 This assumes that the client won't rollback or have a transaction 
 failure.  Again, in 50M transactions, I doubt one of them would fail 
 (sure, it's possible, but that's a symptom of bigger problems: 
 memcached isn't an RDBMS).
 
 The update case being:
 
 1) client begins transaction
 2) client updates data
 3) database deletes record from memcache
 4) client commits transaction
 5) client adds data to memcache
 
 The second is
 where the cache update happens and the commit later fails, or the 
 commit
 happens and the cache update never happens.
 
 Having pgmemcache delete, not replace data addresses this second issue. 
  -sc
 
 -- 
 Sean Chittenden
 
 
 ---(end of broadcast)---
 TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly
 

-- 
Jim C. Nasby, Database Consultant   [EMAIL PROTECTED] 
Give your computer some brain candy! www.distributed.net Team #1828

Windows: Where do you want to go today?
Linux: Where do you want to go tomorrow?
FreeBSD: Are you guys coming, or what?

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Postgres vs. MySQL

2004-11-24 Thread Steinar H. Gunderson
On Wed, Nov 24, 2004 at 09:57:52AM -0500, Christian Fowler wrote:
 As for performance, lots of others will probably volunteer tips and 
 techniques. In my experience, properly written and tuned applications will 
 show only minor speed differences. I have seen several open-source apps 
 that support postgres but are not well tested on it. Query optimization 
 can cause orders of magnitude performance differences.

Definitely. My favourite is Request Tracker (we use 2.x, although 3.x is the
latest version), which used something like 500-600 queries (all seqscans, since
the database schema only had an ordinary index on the varchar fields in
question, and the queries were automatically searching on LOWER(field) to
emulate MySQL's case-insensitivity on varchar fields) for _every_ page shown.
Needless to say, the web interface was dog slow -- some index manipulation
and a few bugfixes (they had some kind of cache layer which would eliminate
98% of the queries, but for some reason was broken for non-MySQL databases)
later, and we were down to 3-4 index scans, a few orders of magnitude faster.
:-)
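
The index manipulation was essentially functional indexes on the
lowercased columns; a rough sketch (table and column names invented, not
RT's real schema):

  CREATE INDEX tickets_subject_lower_idx ON tickets (lower(subject));
  -- now a case-insensitive search can use the index instead of a seqscan
  SELECT id FROM tickets WHERE lower(subject) = lower('Printer broken');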

/* Steinar */
-- 
Homepage: http://www.sesse.net/

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] [PERFORMANCE] Big number of schemas (3500) into a single database

2004-11-24 Thread Jaime Casanova
 --- Tom Lane [EMAIL PROTECTED] wrote: 
 Constantin Teodorescu [EMAIL PROTECTED] writes:
  If I will choose to keep a mirror of every
  workstation database in a
  separate schema in the central database that mean
  that I will have 3500 different schemas.
 
  Is there any limit or any barrier that could stop
  this kind of approach or make things go slower?
 
 Would you need to put them all into search_path at
 once?
 
 I'm not sure what the scaling issues might be for
 long search_paths, but I wouldn't be surprised if 
 it's bad.  But as long as you don't do that,
 I don't believe there will be any problems.
 

If I do a select with fully qualified table names, will
it search the search_path or will it go directly
to the schema?

Just so I know.

regards,
Jaime Casanova

_
Do You Yahoo!?
Información de Estados Unidos y América Latina, en Yahoo! Noticias.
Visítanos en http://noticias.espanol.yahoo.com

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


[PERFORM] Group By index usage

2004-11-24 Thread sdfasdfas sdfasdfs
I have a table with this index:

 create index ARTISTS_NAME on ARTISTS (
 lower(AR_NAME)
 );

The index is over a column with this definition:

 AR_NAME  VARCHAR(256) null,

I want to optimize this query:

 select * from artists where lower(ar_name) like
lower('a%') order by lower(ar_name) limit 20;

I think the planner should use the index I have. But
the result of the explain command is:

 explain analyze select * from artists where
lower(ar_name) like lower('a%') order by
lower(ar_name) limit 20;

  
                                  QUERY PLAN
 Limit  (cost=20420.09..20420.14 rows=20 width=360) (actual time=2094.13..2094.19 rows=20 loops=1)
   ->  Sort  (cost=20420.09..20433.52 rows=5374 width=360) (actual time=2094.13..2094.16 rows=21 loops=1)
         Sort Key: lower((ar_name)::text)
         ->  Index Scan using artists_name on artists  (cost=0.00..19567.09 rows=5374 width=360) (actual time=0.11..1391.97 rows=59047 loops=1)
               Index Cond: ((lower((ar_name)::text) >= 'a'::text) AND (lower((ar_name)::text) < 'b'::text))
               Filter: (lower((ar_name)::text) ~~ 'a%'::text)
 Total runtime: 2098.62 msec
(7 rows)

The ORDER BY clause is not using the index!  I don't
know why.

I have the locale configured to C, and the index works
well with the like operator. 

Could you help me? I am really lost.



__
Renovamos el Correo Yahoo!: ¡100 MB GRATIS!
Nuevos servicios, más seguridad
http://correo.yahoo.es

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] Group By index usage

2004-11-24 Thread Tom Lane
sdfasdfas sdfasdfs [EMAIL PROTECTED] writes:
 I have a table with this index:
  create index ARTISTS_NAME on ARTISTS (
  lower(AR_NAME)
  );

 Te index is over a colum with this definition:

  AR_NAME  VARCHAR(256) null,

 I want to optimize this query:

  select * from artists where lower(ar_name) like
 lower('a%') order by lower(ar_name) limit 20;

 I think the planner should use the index i have.

Update to 7.4, or declare the column as TEXT instead of VARCHAR.
Older versions aren't very bright about situations involving
implicit coercions.
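
On releases that predate ALTER COLUMN ... TYPE (added in 8.0), switching
the column to text means recreating it; a rough sketch only (needs 7.3+
for DROP COLUMN, and take a backup first):

  ALTER TABLE artists RENAME COLUMN ar_name TO ar_name_old;
  ALTER TABLE artists ADD COLUMN ar_name text;
  UPDATE artists SET ar_name = ar_name_old;
  ALTER TABLE artists DROP COLUMN ar_name_old;
  CREATE INDEX artists_name ON artists (lower(ar_name));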

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] FW: Index usage

2004-11-24 Thread gnari
From: BBI Edwin Punzalan [EMAIL PROTECTED]
 
 Yes, the database is being vacuum-ed and analyzed on a daily basis.
 

then you should consider increasing the statistics target on the date column,
as the estimates were a bit off in the plan
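
Something along these lines (the statistics target of 100 is just a
starting guess):

  ALTER TABLE chatlogs ALTER COLUMN date SET STATISTICS 100;
  ANALYZE chatlogs;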

 Our version is 7.2.1

upgrade time ?

gnari



---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Slow execution time when querying view with WHERE clause

2004-11-24 Thread Mike Mascari
Tom Lane wrote:
Um ... doh ... analyze.c about line 1550:
/* We can only compute valid stats if we found some non-null values. */
if (nonnull_cnt > 0)
   ...
There's a bit of an epistemological issue here: if we didn't actually
find any nonnull values in our sample, is it legitimate to assume that
the column is entirely null?  On the other hand, if we find only 3 in
our sample we will happily assume the column contains only 3, so I
dunno why we are discriminating against null.  This seems like a case
that just hasn't come up before.
Will this discriminatory policy toward null end for 8.0?
Mike Mascari

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] Postgres vs. MySQL

2004-11-24 Thread Steinar H. Gunderson
On Wed, Nov 24, 2004 at 02:14:18PM +0100, Evilio del Rio wrote:
 I have installed the dspam filter
 (http://www.nuclearelephant.com/projects/dspam) on our mail server
 (RedHat 7.3 Linux with sendmail 8.13 and procmail). I have ~300 users
 with a quite low traffic of 4000 messages/day. So it's a quite common
 platform/environment, nothing spectacular.

We just had a case like this on #postgresql. The (somewhat surprising)
solution was increasing the statistics target on the token column to
something like 200, which makes the planner choose an index scan instead of a
sequential scan.

For the people who did not follow the case: The culprit is a query like

  SELECT * FROM table WHERE token IN ('346369873476346', '4376376034', ...)

(token is a numeric(20,0).) With one entry in the IN (), the cost of an index
scan was estimated at 4.77; with ten entries, it was about 48, but with 574
entries the estimated cost was 513565 (!!), making the planner prefer a single
sequential scan to 574 consecutive index scans. Upping the statistics target made
the planner estimate the cost at about ~4000, and thus select the index scan,
which was two orders of magnitude faster.
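
In concrete terms the fix was along these lines (the table name is a
guess at dspam's schema; substitute whichever table the token column
actually lives in):

  ALTER TABLE dspam_token_data ALTER COLUMN token SET STATISTICS 200;
  ANALYZE dspam_token_data;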

BTW, this case was with PostgreSQL 7.4.6, not 7.3 as the poster here is
reporting.

/* Steinar */
-- 
Homepage: http://www.sesse.net/

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


[PERFORM] Hardware purchase question

2004-11-24 Thread Bo Stewart
We currently are utilizing postgresql on 2 servers with the following 
configuration:

2 - 2.4 Ghz Xeon processors
4GB ram
4 36gb 1rpm scsi drives configured for raid 10
We started out with one server and as we became IO bound we added the 
second.  We are currently considering purchasing another identical server 
to go along with these.  In addition to this we are also considering a scsi 
attached storage device in the 10 - 14 drive range configured for raid 10 
in place of the onboard 4 drives we currently have.   Daily about 30% of 
our data gets updated with about 2% new data.  Our query load is about 60% 
reads and 40% writes currently.  My question is what type of performance 
gains can I expect on average from swapping from 4 disk raid 10 to 14 disk 
raid 10?  Could I expect to see 40-50% better throughput?

The servers listed above are Dell 2650s, which have PERC 3 
controllers.  I have seen on this list that they are known for not 
performing well.  So any suggestions for an attached SCSI device would be 
greatly appreciated.  Also, any thoughts on Fibre Channel storage devices?

Thank You,
Bo Stewart

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [PERFORM] [pgsql-hackers-win32] scalability issues on win32

2004-11-24 Thread Reini Urban
Merlin Moncure schrieb:
Following is the promised writeup in performance related issues
comparing win32 with linux x86 and linux x86-64.  Unfortunately, the 64
bit portion of the test is not yet completed and won't be for a bit.
However there are some telling things about the win32/linux comparison.
If you are considering deploying postgres in a win32 environment read
on...
 
First a description of the client app:
Our company develops an ERP/CRM written in cobol which we are porting to
run on PostgreSQL.  Part of this porting effort was development of an
ISAM 'driver' for our app to allow it to store/fetch data from the
database in place of a traditional file system, which is complete.

For those of you unfamiliar with COBOL/ISAM, applications written with
it have a 'one record at a time' mentality, such that the application tends
to spam the server with queries of the select * from t where k = k1
variety.  Our driver creates stored procedures as necessary and uses
ExecParams wherever possible to cut down on server CPU time, which is a
precious resource in our case.  Over time we plan to gradually redeploy
our application logic to make better use of the sql server's server side
power.  Our application is very rarely i/o bound because we will make
sure our server has enough memory so that the data will be rarely, if
ever, *not* run from the cache.
A good benchmark of our application performance is the time it takes to
read the entire bill of materials for a product.  This is a recursive
read of about 2500 records in the typical case (2408 in the test case).
I always knew that COBOL ultimately loses, but it's always refreshing 
to get confirmation from time to time :)

Test platform:
Pentium 4 3.06 GHz/HT
10k SATA Raptor
1Gb memory
Windows XP Pro SP2/Redhat Fedora 3 (64 bit results coming soon)
BOM traversal for product * (1 user): 
win32: runtime: 3.34 sec  avg cpu load: 60%
redhat: runtime: 3.46 sec  avg cpu load: 20%
Where did you get the win32 avg cpu load number from? AFAIK there's no 
getloadavg() for windows. At least I tried hard to find one, because I 
want to add a comparable figure to the cygwin core. emacs, coreutils, make 
and others would desperately need it, not to speak of servers and 
real-time apps.
Did you read it from taskman, or did you come up with your own 
solution? In taskman there is AFAIK no comparable figure. But there 
should be some perfmon API which would do the trick.

Overview:
  http://www.wilsonmar.com/1perfmon.htm#TaskManager
The load average (LA) is the average number of processes (the sum of 
the run queue length and the number of jobs currently running) that are 
ready to run, but are waiting for access to a busy CPU.

And thanks for the overview!
--
Reini Urban
http://xarch.tu-graz.ac.at/home/rurban/
---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


[PERFORM] HELP speed up my Postgres

2004-11-24 Thread JM
Hi ALL,

I've been using Postgres for 3 years and now we are having problems with its 
performance.

Here are some givens:

We have 260 subscription tables per Database. 
We have 2 databases.

Our main client has given us 250,000 mobile numbers to 
deactivate.

--
What we are experiencing:
	Deactivating 91,000 mobile numbers took a week to finish for 1 DB only; 
the second DB is still in the process of deactivating.

Algorithm to deactivate:
	we loaded all subscription table names into a table
we loaded all mobile numbers to deactivate into a table

SQL:
update SUBSCRIPTIONTABLE set ACTIVEFLAG='Y' where mobile_num in 
(select 
mobile_num from LOADED_MOBILE_NUMBERS)

the script was made in C

CONFIG FILE:
# This is ARA nmimain

tcpip_socket = true
max_connections = 150
superuser_reserved_connections = 2

port = 5433
shared_buffers = 45600
sort_mem = 4
max_locks_per_transaction=128

#fsync = true
#wal_sync_method = fsync

#
#   Locale settings
#
# (initialized by initdb -- may be changed)
LC_MESSAGES = 'en_US.UTF-8'
LC_MONETARY = 'en_US.UTF-8'
LC_NUMERIC = 'en_US.UTF-8'
LC_TIME = 'en_US.UTF-8'


The DB is being vacuumed every week.
My box is running on a dual Xeon with 15K RPM disks and 2 GB memory.

that box is running 2 instances of PG DB.



TIA,






---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] HELP speed up my Postgres

2004-11-24 Thread Vishal Kashyap @ [SaiHertz]
Dear JM ,



 Ive been using postgres for 3 years and now we are having problems 
 with its

PostgreSQL version, please
-- 
With Best Regards,
Vishal Kashyap.
Lead Software Developer,
http://saihertz.com,
http://vishalkashyap.tk

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] HELP speed up my Postgres

2004-11-24 Thread JM
PG Version 7.3.4

On Thursday 25 November 2004 14:12, Vishal Kashyap @ [SaiHertz] wrote:
 Dear JM ,

  Ive been using postgres for 3 years and now we are having
  problems with its

 PostgrSQL version please


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] HELP speed up my Postgres

2004-11-24 Thread Tom Lane
JM [EMAIL PROTECTED] writes:
 PG Version 7.3.4

Avoid the IN (subselect) construct then.  7.4 is the first release
that can optimize that in any real sense.

regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] HELP speed up my Postgres

2004-11-24 Thread Klint Gore
On Thu, 25 Nov 2004 14:00:32 +0800, JM [EMAIL PROTECTED] wrote:
   update SUBSCRIPTIONTABLE set ACTIVEFLAG='Y' where mobile_num in 
 (select 
 mobile_num from LOADED_MOBILE_NUMBERS)

does loaded_mobile_numbers have a primary key or index on mobile_num?
same for subscriptiontable?
have you analyzed both tables?
is mobile_num the same type in both tables?

how does this query compare?
   update SUBSCRIPTIONTABLE set ACTIVEFLAG='Y' 
   from loaded_mobile_numbers
   where subscriptiontable.mobile_num = LOADED_MOBILE_NUMBERS.mobile_num

klint.

+---+-+
: Klint Gore: Non rhyming:
: EMail   : [EMAIL PROTECTED]   :  slang - the:
: Snail   : A.B.R.I.:  possibilities  :
: Mail  University of New England   :  are useless   :
:   Armidale NSW 2351 Australia : L.J.J.  :
: Fax : +61 2 6772 5376 : :
+---+-+

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] HELP speed up my Postgres

2004-11-24 Thread Christopher Kings-Lynne
		update SUBSCRIPTIONTABLE set ACTIVEFLAG='Y' where mobile_num in (select 
mobile_num from LOADED_MOBILE_NUMBERS)
Change to:
update SUBSCRIPTIONTABLE set ACTIVEFLAG='Y' where exists (select 1 from 
LOADED_MOBILE_NUMBERS lmn where 
lmn.mobile_num=SUBSCRIPTIONTABLE.mobile_num);

That should run a lot faster.
Make sure you have indexes on both mobile_num columns.
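
If either is missing, something like this would do (the index names are
arbitrary, and SUBSCRIPTIONTABLE stands in for each real subscription table):

  CREATE INDEX loaded_mobile_numbers_num_idx ON LOADED_MOBILE_NUMBERS (mobile_num);
  CREATE INDEX subscriptiontable_num_idx ON SUBSCRIPTIONTABLE (mobile_num);
  ANALYZE LOADED_MOBILE_NUMBERS;
  ANALYZE SUBSCRIPTIONTABLE;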
Chris
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html