Re: [PERFORM] SELECT's take a long time compared to other DBMS

2003-09-05 Thread Relaxin
It is forward-only in the ODBC driver.

"Neil Conway" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
> On Fri, 2003-09-05 at 14:18, Relaxin wrote:
> > Except that the Declare/Fetch only creates a forward-only cursor; you
> > can't go backwards through the result set.
>
> No, DECLARE can create scrollable cursors; read the reference page
> again. This functionality is much improved in PostgreSQL 7.4, though.
>
> -Neil





Re: [PERFORM] Serious issues with CPU usage

2003-09-05 Thread Tom Lane
<[EMAIL PROTECTED]> writes:
> I'm having _serious_ issues with postgres hogging the CPU over time. A
> graph showing this can be seen at http://andri.estpak.ee/cpu0.png .

You really haven't shown us anything that would explain that graph ...
repeated UPDATEs will slow down a little until you vacuum, but not
by the ratio you seem to be indicating.  At least not if they're
indexscans.  If you've also got sequential-scan queries, and you're
doing many zillion updates between vacuums, the answer is to vacuum
more often.  A decent rule of thumb is to vacuum whenever you've updated
more than about 10% of the rows in a table since your last vacuum.
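
As a concrete sketch of that routine (the table name here is
hypothetical):

-- Reclaim dead row versions left behind by UPDATEs and refresh the
-- planner's statistics; unlike VACUUM FULL this does not take an
-- exclusive lock on the table.  "orders" is a hypothetical table name.
VACUUM ANALYZE orders;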

> A VACUUM FULL is a remedy to this problem, but a simple VACUUM isn't. 

I find that odd; maybe there's something else going on here.  But you've
not given enough details to speculate.

regards, tom lane



[PERFORM] Serious issues with CPU usage

2003-09-05 Thread andris
Hi,

I'm having _serious_ issues with postgres hogging the CPU over time. A
graph showing this can be seen at http://andri.estpak.ee/cpu0.png .

The database is running on Red Hat 9 (stock 2.4.20-8 kernel), on a reiserfs
partition (~8% usage - no problem there), and the problem shows up with both
PostgreSQL 7.3.2 (the package provided by Red Hat as well as a self-rebuilt
package) and 7.3.4 (I used the 7.3.4 SRPM available at the PostgreSQL FTP
site).

A VACUUM FULL is a remedy to this problem, but a simple VACUUM isn't. 

This can be reproduced, I think, by a simple UPDATE command:

database=# EXPLAIN ANALYZE UPDATE table SET random_int_field = 1, last_updated
= NOW() WHERE primary_key = 3772;
                                  QUERY PLAN
--------------------------------------------------------------------------------
 Index Scan using table_pkey on table  (cost=0.00..6.81 rows=1 width=83)
(actual time=0.09..0.10 rows=1 loops=1)
   Index Cond: (primary_key = 3772)
 Total runtime: 0.37 msec

When I repeat this command, I can see the "Total runtime" grow ever so
slightly - creeping from 0.37 to 0.38, then 0.39 etc.  It would probably
get higher if I had the patience. :)

The table "table" used in this example has 2721 rows, so size isn't an issue here.

Any comments or suggestions are welcome. If more information is needed, let me
know and I'll post the needed details.



Re: [PERFORM] [GENERAL] Seq scan of table?

2003-09-05 Thread Tom Lane
Neil Conway <[EMAIL PROTECTED]> writes:
> On Fri, 2003-09-05 at 06:07, Richard Huxton wrote:
>> You should find plenty of discussion of why in the archives, but the short 
>> reason is that PG's type structure is quite flexible which means it can't 
>> afford to make too many assumptions.

> Well, it's definitely a bug in PG, its "quite flexible" type structure
> notwithstanding.

Let's say it's something we'd really like to fix ;-) ... and will, as
soon as we can figure out a cure that's not worse than the disease.
Dorking around with the semantics of numeric expressions has proven
to be a risky business.  See, eg, the thread starting here:
http://archives.postgresql.org/pgsql-hackers/2002-11/msg00468.php

regards, tom lane



Re: [PERFORM] SELECT's take a long time compared to other DBMS

2003-09-05 Thread Neil Conway
On Fri, 2003-09-05 at 14:18, Relaxin wrote:
> Except that the Declare/Fetch only creates a forward-only cursor; you
> can't go backwards through the result set.

No, DECLARE can create scrollable cursors; read the reference page again.
This functionality is much improved in PostgreSQL 7.4, though.
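
For the archives, a minimal sketch of a scrollable cursor (the table
name is hypothetical; note that 7.4 requires you to say SCROLL if you
want backward fetches guaranteed):

BEGIN;
-- "orders" is a hypothetical table; SCROLL requests backward fetches.
DECLARE c SCROLL CURSOR FOR SELECT * FROM orders;
FETCH FORWARD 10 FROM c;   -- read the first ten rows
FETCH BACKWARD 5 FROM c;   -- step back through the result set
CLOSE c;
COMMIT;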

-Neil





Re: [PERFORM] [GENERAL] how to get accurate values in pg_statistic

2003-09-05 Thread Tom Lane
Bruce Momjian <[EMAIL PROTECTED]> writes:
> We do have:
>   #geqo_random_seed = -1  # -1 = use variable seed

> that lets you force a specific random seed for testing purposes.  I
> wonder if that could be extended to control VACUUM randomization too.
> Right now, it just controls GEQO and in fact gets reset on every
> optimizer run.

Actually, just the other day I was thinking we should take that out.
Since there is only one random number generator in the C library,
GEQO is messing with everyone else's state every time it decides to do
an srandom().  And there is certainly no need to do an explicit srandom
with a "random" seed every time through the optimizer, which is the
code's default behavior at the moment.  That just decreases the
randomness AFAICS, compared to letting the established sequence run.

regards, tom lane



Re: [PERFORM] [GENERAL] how to get accurate values in pg_statistic

2003-09-05 Thread Bruce Momjian
Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > We do have:
> > #geqo_random_seed = -1  # -1 = use variable seed
> 
> > that lets you force a specific random seed for testing purposes.  I
> > wonder if that could be extended to control VACUUM randomization too.
> > Right now, it just controls GEQO and in fact gets reset on every
> > optimizer run.
> 
> Actually, just the other day I was thinking we should take that out.
> Since there is only one random number generator in the C library,
> GEQO is messing with everyone else's state every time it decides to do
> an srandom().  And there is certainly no need to do an explicit srandom
> with a "random" seed every time through the optimizer, which is the
> code's default behavior at the moment.  That just decreases the
> randomness AFAICS, compared to letting the established sequence run.

Agreed.

-- 
  Bruce Momjian                     |  http://candle.pha.pa.us
  [EMAIL PROTECTED]                 |  (610) 359-1001
  +  If your life is a hard drive,  |  13 Roberts Road
  +  Christ can be your backup.     |  Newtown Square, Pennsylvania 19073



Re: [PERFORM] [GENERAL] how to get accurate values in pg_statistic

2003-09-05 Thread Bruce Momjian
Mary Edie Meredith wrote:
> I certainly don't claim that it is appropriate to force customers into a
> full analysis, particularly if random sampling versus a full scan of the
> data reveals little to no performance differences in the plans.  Being
> able to sample accurately is _very_ nice for large tables.
> 
> For our testing purposes, however, consistent results are extremely
> important. We have observed that a small difference in one plan for one
> of 22 queries can cause a difference in the DBT-3 results.  If this
> happens, a small change in performance runs between two Linux kernels
> may appear to be due to the kernels, when in fact it is due to the plan
> change.
> 
> We know that the plans are _exactly_ the same if the data in the
> pg_statistic table is the same from run to run (all other things being
> equal).  So what we need is identical optimizer costs (pg_statistic)
> for the same table data on each run.
> 
> I feel certain that the pg_statistic table will be identical from run
> to run if ANALYZE looks at every row.  Thus our hope to find a way to
> get that.


Actually, if you are using GEQO (many tables in a join), the optimizer
itself will randomly try plans --- even worse than random statistics.

We do have:

#geqo_random_seed = -1  # -1 = use variable seed

that lets you force a specific random seed for testing purposes.  I
wonder if that could be extended to control VACUUM randomization too.
Right now, it just controls GEQO and in fact gets reset on every
optimizer run.

I wonder if you could just poke a srandom(10) into
src/backend/commands/analyze.c.

-- 
  Bruce Momjian                     |  http://candle.pha.pa.us
  [EMAIL PROTECTED]                 |  (610) 359-1001
  +  If your life is a hard drive,  |  13 Roberts Road
  +  Christ can be your backup.     |  Newtown Square, Pennsylvania 19073



Re: [PERFORM] SELECT's take a long time compared to other DBMS

2003-09-05 Thread Relaxin
Except that the Declare/Fetch only creates a forward-only cursor; you
can't go backwards through the result set.

""Patrick Hatcher"" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
>
> Relaxin,
> I can't remember during this thread if you said you were using ODBC or
> not.  If you are, then your problem is with the ODBC driver.  You will
> need to check the Declare/Fetch box or you will definitely bring back
> the entire recordset.  For a small recordset this is not a problem, but
> the larger the recordset, the slower the data is returned to the client.
> I played around with the cache size on the driver and found a value
> between 100 and 200 provided good results.
>
> HTH
> Patrick Hatcher
>
>
>
>
>
> "Relaxin" <[EMAIL PROTECTED]>
> Sent by:   To:
[EMAIL PROTECTED]
> [EMAIL PROTECTED]   cc:
> gresql.org Subject: Re:
[PERFORM] SELECT's take a long time compared to other DBMS
>
>
> 09/04/2003 07:13 PM
>
>
>
>
>
> Thank you Christopher.
>
> > Change fsync to true (you want your data to survive, right?) and
> > increase shared buffers to something that represents ~10% of your
> > system memory, in blocks of 8K.
>
> I turned it off just in the hope that things would run faster.
>
> > None of this is likely to substantially change the result of that one
> > query, however, and it seems quite likely that it is because
> > PostgreSQL is honestly returning the whole result set of ~100K rows at
> > once, whereas the other DBMSes are probably using cursors to return
> > only the few rows of the result that you actually looked at.
>
> Finally, someone who will actually admit that it is returning the
> entire result set to the client, whereas other DBMSes manage the
> records at the server.
>
> I hope PG can fix/enhance this.
>
> There are several issues stopping our company from going with PG (with
> paid support, if available), but this seems to be the one at the top of
> the list.
>
> The next one is the handling of BLOBs.  PG handles them like no other
> system I have ever come across.
>
> After that is a native Windows port, but we would deal with cygwin (for
> a little while) if these other issues were handled.
>
> Thanks
>
>
>
>
>
> "Christopher Browne" <[EMAIL PROTECTED]> wrote in message
> news:[EMAIL PROTECTED]
> > A long time ago, in a galaxy far, far away, "Relaxin"
> > <[EMAIL PROTECTED]> wrote:
> > >> Have you changed any of the settings yet in postgresql.conf,
> > >> specifically the shared_buffers setting?
> > >
> > > fsync = false
> > > tcpip_socket = true
> > > shared_buffers = 128
> >
> > Change fsync to true (you want your data to survive, right?) and
> > increase shared buffers to something that represents ~10% of your
> > system memory, in blocks of 8K.
> >
> > So, if you have 512MB of RAM, then the total blocks is 65536, and it
> > would likely be reasonable to increase shared_buffers to 1/10 of that,
> > or about 6500.
> >
> > What is the value of effective_cache_size?  That should probably be
> > increased a whole lot, too.  If you are mainly just running the
> > database on your system, then it would be reasonable to set it to most
> > of memory, or
> >   (* 1/2 (/ (* 512 1024 1024) 8192))
> > 32768.
> >
> > None of this is likely to substantially change the result of that one
> > query, however, and it seems quite likely that it is because
> > PostgreSQL is honestly returning the whole result set of ~100K rows at
> > once, whereas the other DBMSes are probably using cursors to return
> > only the few rows of the result that you actually looked at.
> > --
> > "cbbrowne","@","cbbrowne.com"
> > http://www3.sympatico.ca/cbbrowne/linuxdistributions.html
> > Rules of  the Evil Overlord #14. "The  hero is not entitled  to a last
> > kiss, a last cigarette, or any other form of last request."
> > 





Re: [PERFORM] [GENERAL] Seq scan of table?

2003-09-05 Thread Neil Conway
On Fri, 2003-09-05 at 06:07, Richard Huxton wrote:
> PG's parser will assume an explicit number is an int4 - if you need an int8 
> etc you'll need to cast it, yes.

Or enclose the integer literal in single quotes.

> You should find plenty of discussion of why in the archives, but the short 
> reason is that PG's type structure is quite flexible which means it can't 
> afford to make too many assumptions.

Well, it's definitely a bug in PG, its "quite flexible" type structure
notwithstanding.
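
For example (table and column names are hypothetical), either workaround
lets the planner use an int8 index in releases of this era, where the
bare literal does not:

-- Assume "big_id" is an int8 column with a btree index on it.
SELECT * FROM t WHERE big_id = 42;         -- bare literal is int4: seq scan
SELECT * FROM t WHERE big_id = '42';       -- quoted literal: index scan
SELECT * FROM t WHERE big_id = 42::int8;   -- explicit cast: index scan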

-Neil





Re: [PERFORM] [GENERAL] Seq scan of table?

2003-09-05 Thread Richard Huxton
On Friday 05 September 2003 19:20, Neil Conway wrote:
> On Fri, 2003-09-05 at 06:07, Richard Huxton wrote:
> > PG's parser will assume an explicit number is an int4 - if you need an
> > int8 etc you'll need to cast it, yes.
>
> Or enclose the integer literal in single quotes.
>
> > You should find plenty of discussion of why in the archives, but the
> > short reason is that PG's type structure is quite flexible which means it
> > can't afford to make too many assumptions.
>
> Well, it's definitely a bug in PG, its "quite flexible" type structure
> notwithstanding.

It certainly catches out a lot of people. I'd guess it's in the top three
issues on the general/sql lists, and part of the problem is that it's so
silent. In some ways it would be better to issue a NOTICE every time a
typecast is forced in a comparison - irritating as that would be.

-- 
  Richard Huxton
  Archonet Ltd



Re: [PERFORM] [GENERAL] how to get accurate values in pg_statistic

2003-09-05 Thread Mary Edie Meredith
I certainly don't claim that it is appropriate to force customers into a
full analysis, particularly if random sampling versus a full scan of the
data reveals little to no performance differences in the plans.  Being
able to sample accurately is _very_ nice for large tables.

For our testing purposes, however, consistent results are extremely
important. We have observed that a small difference in one plan for one
of 22 queries can cause a difference in the DBT-3 results.  If this
happens, a small change in performance runs between two Linux kernels
may appear to be due to the kernels, when in fact it is due to the plan
change.

We know that the plans are _exactly_ the same if the data in the
pg_statistic table is the same from run to run (all other things being
equal).  So what we need is identical optimizer costs (pg_statistic)
for the same table data on each run.

I feel certain that the pg_statistic table will be identical from run
to run if ANALYZE looks at every row.  Thus our hope to find a way to
get that.

We did runs overnight.  We can confirm that VACUUM FULL ANALYZE does
not produce the same pg_statistic contents from run to run.  With the
default default_statistics_target (10), the plans are also different.

We ran additional tests with default_statistics_target set to 1000 (the
max, I believe).  The plans are the same over the different runs, but
the pg_statistic table has different cost values.  The performance
results of the runs are consistent (we would expect this with the same
plans).  The resulting performance metrics are similar to the best plans
we see using the default histogram size (good news).

However, we worry that one day the costs will change enough, for
whatever reason, to cause a plan change, especially at a larger database
scale factor (database size/row size).

I know we appear to be an isolated case, but customers also do testing
and may have the same consistency issues we have.  I can also imagine
cases where customers want to guarantee that plans stay the same
(between replicated sites, for example).  If two developers are
analyzing changes to the optimizer, don't you want the costs used for
testing on their two systems to be identical for comparison purposes?

Anyway, IMHO an option for an ANALYZE FULL ("sampling" all rows) would
be valuable.  Any other ideas for how to force this without a code
change are very welcome.
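
For reference, the closest existing knob is the per-column statistics
target that Stephan pointed us at (a sketch; the table and column names
are illustrative only):

-- Raise the histogram size for one column, then re-sample.
ALTER TABLE lineitem ALTER COLUMN l_shipdate SET STATISTICS 1000;
ANALYZE lineitem;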

Thanks for your info!



On Thu, 2003-09-04 at 16:16, Tom Lane wrote:
> Mary Edie Meredith <[EMAIL PROTECTED]> writes:
> > Stephan Szabo kindly responded to our earlier queries suggesting we look
> > at default_statistics_target and ALTER TABLE ALTER COLUMN SET
> > STATISTICS.  
> 
> > These determine the number of bins in the histogram for a given column. 
> > But for a large number of rows (for example 6 million) the maximum value
> > (1000) does not guarantee that ANALYZE will do a full scan of the table.
> > We do not see a way to guarantee the same statistics run to run without
> > forcing ANALYZE to examine every row of every table.  
> 
> Do you actually still have a problem with the plans changing when the
> stats target is above 100 or so?  I think the notion of "force ANALYZE
> to do a full scan" is inherently wrongheaded ... it certainly would not
> produce numbers that have anything to do with ordinary practice.
> 
> If you have data statistics that are so bizarre that the planner still
> gets things wrong with a target of 1000, then I'd like to know more
> about why.
> 
>   regards, tom lane
-- 
Mary Edie Meredith <[EMAIL PROTECTED]>
Open Source Development Lab




Re: [PERFORM] [GENERAL] how to get accurate values in pg_statistic (continued)

2003-09-05 Thread Tom Lane
Mary Edie Meredith <[EMAIL PROTECTED]> writes:
> For our testing purposes, however, consistent results are extremely
> important. We have observed that a small difference in one plan for one
> of 22 queries can cause a difference in the DBT-3 results.  If this
> happens, a small change in performance runs between two Linux kernels
> may appear to be due to the kernels, when in fact it is due to the plan
> change.

Fair enough.  If you are trying to force exactly repeatable results,
why don't you just "set seed = 0" before you ANALYZE?  There's only
one random-number generator, so that should force ANALYZE to make the
same random sampling every time.

Also, it'd be a good idea to ANALYZE the needed tables by name,
explicitly, to ensure that they are analyzed in a known order
rather than whatever order ANALYZE happens to find them in pg_class.
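
Concretely, something like this (table names hypothetical) should make
the sampling repeatable:

-- Pin the backend's random sequence to a known state, then analyze
-- the tables in a fixed order so each run draws the same sample.
SET SEED = 0;
ANALYZE lineitem;
ANALYZE orders;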

regards, tom lane



Re: [PERFORM] Performance problems on a fairly big table with two key columns.

2003-09-05 Thread Richard Huxton
On Friday 05 September 2003 16:36, Rasmus Aveskogh wrote:
> Richard,
>
> Thanks a lot! You were right - the query parser "misunderstood"
> now() - '1 day'::interval and only used one of the indexes (as I had
> already noticed).
>
> Actually all I had to do was to cast the result like this:
>
> (now() - '1 day'::interval)::date
>
> What took 75s is now between 10ms and 200ms.
>
> Thanks again!

Ah - good. You also want to be careful with differences between timestamp 
with/without time zone etc.
-- 
  Richard Huxton
  Archonet Ltd



Re: [PERFORM] [GENERAL] Seq scan of table?

2003-09-05 Thread Ron Johnson
On Fri, 2003-09-05 at 09:39, Jonathan Bartlett wrote:
> > I think I have found out why... I have a WHERE clause on an ID field,
> > but it seems I need to cast this integer to the same type as the field
> > is defined with in the table, or else it will do a table scan.
> 
> Yes, this is correct
> 
> > Is this assumption correct? And if it is, do I then need to change all
> > my SQL statements to cast the WHERE clause where I just have a number
> > (e.g. WHERE field = 1) to force the planner to use an index scan
> > instead of a seq scan?
> 
> Someone correct me if I'm wrong, but I believe numbers are int4's, so
> they need to be cast if your column is not an int4.

You mean "constant" scalars?  Yes, constant scalars are interpreted
as int4.

-- 
-
Ron Johnson, Jr. [EMAIL PROTECTED]
Jefferson, LA USA

"Millions of Chinese speak Chinese, and it's not hereditary..."
Dr. Dean Edell




Re: [PERFORM] Performance problems on a fairly big table with two key columns.

2003-09-05 Thread Rasmus Aveskogh

Richard,

Thanks a lot! You were right - the query parser "misunderstood"
now() - '1 day'::interval and only used one of the indexes (as I had
already noticed).

Actually all I had to do was to cast the result like this:

(now() - '1 day'::interval)::date

What took 75s is now between 10ms and 200ms.

Thanks again!

-ra


> On Thursday 04 September 2003 23:53, Rasmus Aveskogh wrote:
>> Hi,
>>
>> I have a table that looks like this:
>>
>>   DATA   ID   TIME
>>
>> |--||--|
>>
>> The table holds app. 14M rows now and grows by app. 350k rows a day.
>>
>> The ID-column holds about 1500 unique values (integer).
>> The TIME-columns is of type timestamp without timezone.
>>
>> I have one index (b-tree) on the ID-column and one index (b-tree) on the
>> time-column.
>>
>> My queries most often look like this:
>>
>> SELECT DATA FROM  WHERE ID = 1 AND TIME > now() - '1 day'::interval;
> [snip]
>> I tried applying a multicolumn index on ID and TIME, but that one won't
>> even be used (after ANALYZE).
>
> The problem is likely to be that the parser isn't spotting that
> now() - '1 day' is constant. Try an explicit time and see if the index
> is used. If so, you can write a wrapper function for your expression
> (mark it STABLE so the planner knows it won't change during the
> statement).
>
> Alternatively, you can do the calculation in the application and use an
> explicit time.
>
> HTH
> --
>   Richard Huxton
>   Archonet Ltd
>




Re: [PERFORM] SELECT's take a long time compared to other DBMS

2003-09-05 Thread Patrick Hatcher

Relaxin,
I can't remember during this thread if you said you were using ODBC or
not.  If you are, then your problem is with the ODBC driver.  You will
need to check the Declare/Fetch box or you will definitely bring back
the entire recordset.  For a small recordset this is not a problem, but
the larger the recordset, the slower the data is returned to the client.
I played around with the cache size on the driver and found a value
between 100 and 200 provided good results.
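
If it helps, here is a sketch of the matching odbc.ini entries, assuming
the psqlODBC driver's UseDeclareFetch and Fetch options (check the
option names and driver path against your driver version):

[PostgreSQL]
Driver          = /usr/local/lib/psqlodbc.so
UseDeclareFetch = 1
Fetch           = 150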

HTH
Patrick Hatcher




   
  
"Relaxin" <[EMAIL PROTECTED]>  

Sent by:   To: [EMAIL PROTECTED]   
   
[EMAIL PROTECTED]   cc:
   
gresql.org Subject: Re: [PERFORM] 
SELECT's take a long time compared to other DBMS   
   
  
   
  
09/04/2003 07:13 PM
  
   
  




Thank you Christopher.

> Change fsync to true (you want your data to survive, right?) and
> increase shared buffers to something that represents ~10% of your
> system memory, in blocks of 8K.

I turned it off just in the hope that things would run faster.

> None of this is likely to substantially change the result of that one
> query, however, and it seems quite likely that it is because
> PostgreSQL is honestly returning the whole result set of ~100K rows at
> once, whereas the other DBMSes are probably using cursors to return
> only the few rows of the result that you actually looked at.

Finally, someone who will actually admit that it is returning the
entire result set to the client, whereas other DBMSes manage the
records at the server.

I hope PG can fix/enhance this.

There are several issues stopping our company from going with PG (with
paid support, if available), but this seems to be the one at the top of
the list.

The next one is the handling of BLOBs.  PG handles them like no other
system I have ever come across.

After that is a native Windows port, but we would deal with cygwin (for
a little while) if these other issues were handled.

Thanks





"Christopher Browne" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
> A long time ago, in a galaxy far, far away, "Relaxin"
> <[EMAIL PROTECTED]> wrote:
> >> Have you changed any of the settings yet in postgresql.conf,
> >> specifically the shared_buffers setting?
> >
> > fsync = false
> > tcpip_socket = true
> > shared_buffers = 128
>
> Change fsync to true (you want your data to survive, right?) and
> increase shared buffers to something that represents ~10% of your
> system memory, in blocks of 8K.
>
> So, if you have 512MB of RAM, then the total blocks is 65536, and it
> would likely be reasonable to increase shared_buffers to 1/10 of that,
> or about 6500.
>
> What is the value of effective_cache_size?  That should probably be
> increased a whole lot, too.  If you are mainly just running the
> database on your system, then it would be reasonable to set it to most
> of memory, or
>   (* 1/2 (/ (* 512 1024 1024) 8192))
> 32768.
>
> None of this is likely to substantially change the result of that one
> query, however, and it seems quite likely that it is because
> PostgreSQL is honestly returning the whole result set of ~100K rows at
> once, whereas the other DBMSes are probably using cursors to return
> only the few rows of the result that you actually looked at.
> --
> "cbbrowne","@","cbbrowne.com"
> http://www3.sympatico.ca/cbbrowne/linuxdistributions.html
> Rules of  the Evil Overlord #14. "The  hero is not entitled  to a last
> kiss, a last cigarette, or any other form of last request."
> 









Re: [PERFORM] [GENERAL] Seq scan of table?

2003-09-05 Thread Jonathan Bartlett
> I think I have found out why... I have a WHERE clause on an ID field,
> but it seems I need to cast this integer to the same type as the field
> is defined with in the table, or else it will do a table scan.

Yes, this is correct.

> Is this assumption correct? And if it is, do I then need to change all
> my SQL statements to cast the WHERE clause where I just have a number
> (e.g. WHERE field = 1) to force the planner to use an index scan
> instead of a seq scan?

Someone correct me if I'm wrong, but I believe numbers are int4's, so
they need to be cast if your column is not an int4.

Jon



>
>
> BTJ
>
> > I am trying to tune my database and I discovered one select that does
> > a seq scan on a table, but I can't see why... All the join fields are
> > indexed and I am returning just one record, so no sort is done.
> > Does it just pick seq scan for the heck of it, or is there a reason?
> >
> > Regards,
> >
> > BTJ
> >
> > ---
> > Bjørn T Johansen (BSc,MNIF)
> > Executive Manager
> > [EMAIL PROTECTED]  Havleik Consulting
> > Phone : +47 67 54 15 17 Conradisvei 4
> > Fax : +47 67 54 13 91   N-1338 Sandvika
> > Cellular : +47 926 93 298   http://www.havleik.no
> > ---
> > "The stickers on the side of the box said "Supported Platforms: Windows
> > 98, Windows NT 4.0,
> > Windows 2000 or better", so clearly Linux was a supported platform."
> > ---




Re: [PERFORM] [GENERAL] Seq scan of table?

2003-09-05 Thread Bjørn T Johansen
On Fri, 2003-09-05 at 12:07, Richard Huxton wrote:
> On Friday 05 September 2003 09:47, Bjorn T Johansen wrote:
> > I think I have found out why... I have a WHERE clause on an ID field,
> > but it seems I need to cast this integer to the same type as the field
> > is defined with in the table, or else it will do a table scan.
> >
> > Is this assumption correct? And if it is, do I then need to change all
> > my SQL statements to cast the WHERE clause where I just have a number
> > (e.g. WHERE field = 1) to force the planner to use an index scan
> > instead of a seq scan?
> 
> PG's parser will assume an explicit number is an int4 - if you need an int8 
> etc you'll need to cast it, yes.
> You should find plenty of discussion of why in the archives, but the short 
> reason is that PG's type structure is quite flexible which means it can't 
> afford to make too many assumptions.

OK, I am using both int2 and int8 as well, so that explains it...
Thanks!


BTJ






Re: [PERFORM] [GENERAL] Seq scan of table?

2003-09-05 Thread Richard Huxton
On Friday 05 September 2003 09:47, Bjorn T Johansen wrote:
> I think I have found out why... I have a WHERE clause on an ID field,
> but it seems I need to cast this integer to the same type as the field
> is defined with in the table, or else it will do a table scan.
>
> Is this assumption correct? And if it is, do I then need to change all
> my SQL statements to cast the WHERE clause where I just have a number
> (e.g. WHERE field = 1) to force the planner to use an index scan
> instead of a seq scan?

PG's parser will assume an explicit number is an int4 - if you need an int8 
etc you'll need to cast it, yes.
You should find plenty of discussion of why in the archives, but the short 
reason is that PG's type structure is quite flexible which means it can't 
afford to make too many assumptions.

-- 
  Richard Huxton
  Archonet Ltd



Re: [PERFORM] [GENERAL] Seq scan of table?

2003-09-05 Thread Bjorn T Johansen
I think I have found out why... I have a WHERE clause on an ID field,
but it seems I need to cast this integer to the same type as the field
is defined with in the table, or else it will do a table scan.

Is this assumption correct? And if it is, do I then need to change all
my SQL statements to cast the WHERE clause where I just have a number
(e.g. WHERE field = 1) to force the planner to use an index scan
instead of a seq scan?


BTJ

> I am trying to tune my database and I discovered one select that does a
> seq scan on a table, but I can't see why... All the join fields are
> indexed and I am returning just one record, so no sort is done.
> Does it just pick seq scan for the heck of it, or is there a reason?
>
> Regards,
>
> BTJ
>
> ---
> Bjørn T Johansen (BSc,MNIF)
> Executive Manager
> [EMAIL PROTECTED]  Havleik Consulting
> Phone : +47 67 54 15 17 Conradisvei 4
> Fax : +47 67 54 13 91   N-1338 Sandvika
> Cellular : +47 926 93 298   http://www.havleik.no
> ---
> "The stickers on the side of the box said "Supported Platforms: Windows
> 98, Windows NT 4.0,
> Windows 2000 or better", so clearly Linux was a supported platform."
> ---





Re: [PERFORM] Performance problems on a fairly big table with two key columns.

2003-09-05 Thread Richard Huxton
On Thursday 04 September 2003 23:53, Rasmus Aveskogh wrote:
> Hi,
>
> I have a table that looks like this:
>
>   DATA   ID   TIME
>
> |--||--|
>
> The table holds app. 14M rows now and grows by app. 350k rows a day.
>
> The ID-column holds about 1500 unique values (integer).
> The TIME-columns is of type timestamp without timezone.
>
> I have one index (b-tree) on the ID-column and one index (b-tree) on the
> time-column.
>
> My queries most often look like this:
>
> SELECT DATA FROM  WHERE ID = 1 AND TIME > now() - '1 day'::interval;
[snip]
> I tried applying a multicolumn index on ID and TIME, but that one won't
> even be used (after ANALYZE).

The problem is likely to be that the parser isn't spotting that now()-'1 day' 
is constant. Try an explicit time and see if the index is used. If so, you 
can write a wrapper function for your expression (mark it STABLE so the 
planner knows it won't change during the statement).

Alternatively, you can do the calculation in the application and use an 
explicit time.
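
A sketch of such a wrapper (the function and table names are
hypothetical, using 7.3-style string quoting):

-- A STABLE wrapper lets the planner treat the cutoff as fixed for the
-- duration of the statement, so the index on TIME can be considered.
CREATE FUNCTION one_day_ago() RETURNS timestamp AS
  'SELECT (now() - ''1 day''::interval)::timestamp'
LANGUAGE SQL STABLE;

SELECT data FROM mytable WHERE id = 1 AND time > one_day_ago();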

HTH
-- 
  Richard Huxton
  Archonet Ltd
