[PERFORM] Autocommit

2005-02-14 Thread Hasnul Fadhly bin Hasan
Hi,
I am just wondering: by default, autocommit is enabled for every client
connection.  The documentation states that we should use BEGIN and END
(or COMMIT) to increase performance by not using autocommit.
My question is, when we use the BEGIN and END statements, is autocommit
disabled automatically, or do we have to disable it manually?

Hasnul



Re: [PERFORM] Autocommit

2005-02-14 Thread Michael Fuhr
On Mon, Feb 14, 2005 at 04:01:20PM +0800, Hasnul Fadhly bin Hasan wrote:

> I am just wondering: by default, autocommit is enabled for every client
> connection.  The documentation states that we should use BEGIN and END
> (or COMMIT) to increase performance by not using autocommit.
> My question is, when we use the BEGIN and END statements, is autocommit
> disabled automatically, or do we have to disable it manually?

What version of PostgreSQL is your server running and what client
software are you using?  PostgreSQL 7.3 had a server-side autocommit
setting, but it caused problems with some clients so 7.4 got rid
of it and left autocommit up to the client.  How to enable or disable
client-side autocommit depends on the client software, but if you're
able to execute a BEGIN (or START TRANSACTION) statement then you
should be inside a transaction until you execute COMMIT (or END)
or ROLLBACK.  That is, unless your client intercepts these statements
and does whatever it wants.
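
For example, in psql (the table t here is just a stand-in):

    BEGIN;
    INSERT INTO t VALUES (1);  -- not yet visible to other sessions
    INSERT INTO t VALUES (2);  -- same transaction, no per-statement commit
    COMMIT;                    -- one commit (and one WAL flush) for both rows

Batching statements like this is where the performance gain over
autocommit comes from: you pay the cost of a commit once instead of
once per statement.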

-- 
Michael Fuhr
http://www.fuhr.org/~mfuhr/



Re: [PERFORM] Autocommit

2005-02-14 Thread Michael Fuhr
On Mon, Feb 14, 2005 at 04:58:31PM +0800, Hasnul Fadhly bin Hasan wrote:

> Thanks for the reply.  I am using the postgres 7.4.5 client.  There's
> one that is using the 7.4.1 client; I'm not sure if there would be any
> difference.  When I use psql and check the status of autocommit, it is
> set to enabled.  I'm not sure if libpq and psql use the same defaults.

As far as I can tell, libpq doesn't have an autocommit setting --
it just sends statements on behalf of the application.  Clients
that allow the user to disable autocommit presumably do so by
implicitly sending BEGIN statements to start new transactions.
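
psql, for instance, layers its own autocommit variable on top of libpq;
a sketch (t being a stand-in table, psql 7.4's variable shown):

    \set AUTOCOMMIT off
    INSERT INTO t VALUES (1);  -- psql implicitly issues BEGIN first
    COMMIT;                    -- nothing is permanent until you commit

With AUTOCOMMIT off, psql sends that implicit BEGIN before the first
statement, which is exactly the approach described above.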

-- 
Michael Fuhr
http://www.fuhr.org/~mfuhr/



Re: [PERFORM] Benchmark

2005-02-14 Thread Harald Fuchs
In article [EMAIL PROTECTED],
Greg Stark [EMAIL PROTECTED] writes:

> Christopher Browne [EMAIL PROTECTED] writes:
>> After all, the cost of a computer system to run the transactions is
>> likely to be comprised of some combination of software licenses and
>> hardware costs.  Even if the software is free, the hardware isn't.
>
> And labour costs.

Except that working with PostgreSQL is fun, not labour :-)




Re: [PERFORM] estimated rows vs. actual rows

2005-02-14 Thread Christopher Browne
After takin a swig o' Arrakan spice grog, [EMAIL PROTECTED] (Jaime Casanova)
belched out:
> On Sun, 13 Feb 2005 13:41:09 -0800, Josh Berkus josh@agliodbs.com wrote:
>> Jaime,
>>
>>> Why is this query using a seq scan rather than an index scan?
>>
>> Because it thinks a seq scan will be faster.
>
> I will suggest that he try it with seq scans disabled.
>
> But, IMHO, if the table has 143902 rows and it thinks it will retrieve
> 2610 (almost 1.81% of the total), won't it be faster with an index?

If the 2610 rows are scattered widely enough, it may be cheaper to do
a seq scan.

After all, with a seq scan, you read each of the table's pages exactly
once.

With an index scan, you read index pages _and_ table pages, and may read
and re-read some of those pages.

It sounds as though it's worth forcing the matter and trying it both
ways and comparing them.  Don't be surprised if the seq scan is in
fact faster...
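
For instance (substitute the real query; enable_seqscan doesn't forbid
seq scans, it just makes the planner strongly prefer alternatives):

    SET enable_seqscan = off;
    EXPLAIN ANALYZE SELECT * FROM some_table WHERE some_col = 42;
    SET enable_seqscan = on;
    EXPLAIN ANALYZE SELECT * FROM some_table WHERE some_col = 42;

Comparing the "actual time" totals of the two plans settles the
question on your data.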
-- 
select 'cbbrowne' || '@' || 'gmail.com';
http://cbbrowne.com/info/emacs.html
When aiming for the common denominator, be prepared for the occasional
division by zero.



Re: [PERFORM] String matching

2005-02-14 Thread PFC
	Normally you shouldn't have to do anything; it should just work:

	select field from table where field like 'abc%'
	CREATE INDEX ... ON table( field );

	That's all.  If it does not use the index, I saw on the mailing list
that the locale could be an issue.
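
	A quick check (on servers of this vintage, a plain btree index can
serve LIKE 'abc%' only under the C locale):

	SHOW lc_collate;   -- anything other than 'C' here explains the seq scan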



Re: [PERFORM] String matching

2005-02-14 Thread Ivan Voras
Stephan Szabo wrote:
> You can also create an index using a typename_pattern_ops operator
> class which should be usable even with other collations.

Could you give me an example for this, or point me to the relevant
documentation?



Re: [PERFORM] String matching

2005-02-14 Thread Stephan Szabo

On Mon, 14 Feb 2005, Ivan Voras wrote:

> Stephan Szabo wrote:
>
>> You can also create an index using a typename_pattern_ops operator
>> class which should be usable even with other collations.
>
> Could you give me an example for this, or point me to the relevant
> documentation?

Basically, you could have something like:

create table test_table(a text);
create index test_index on test_table(a text_pattern_ops);
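
With that in place, a prefix LIKE should be able to use the index even
in a non-C locale, e.g.:

EXPLAIN SELECT a FROM test_table WHERE a LIKE 'abc%';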

--

http://www.postgresql.org/docs/8.0/interactive/indexes-opclass.html




[PERFORM] seq scan cache vs. index cache smackdown

2005-02-14 Thread Mark Aufflick
Hi All,
I have boiled my situation down to the following simple case: (postgres 
version 7.3)

* Query 1 is doing a sequential scan over a table (courtesy of field 
ILIKE 'foo%') and index joins to a few others
* Query 2 is doing a functional index scan over the same table 
(lower(field) LIKE 'foo%') and index joins to a few others
* neither query has an order by clause
* for the purpose of testing, both queries are designed to return the 
same result set
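
In sketch form (table and column names are stand-ins for the real
schema):

* Q1:  SELECT * FROM big_table WHERE field ILIKE 'foo%';
* Q2:  CREATE INDEX big_table_lower_field ON big_table (lower(field));
       SELECT * FROM big_table WHERE lower(field) LIKE 'foo%';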

Obviously Q2 is faster than Q1, but if I ever run them both at the same
time (let's say I run two of Q1 and one of Q2 at the same time) then Q2
consistently returns WORSE times than Q1 (explain analyze confirms that
it is using the index).

My assumption is that the sequential scan is blowing the index from any 
cache it might live in, and simultaneously stealing all the disk IO 
that is needed to access the index on disk (the table has 200,000 
rows).

If I simplify the case to not do the index joins (ie. operate on the 
one table only) the situation is not as dramatic, but similar.

My thoughts are:
1) kill the sequential scan - but unfortunately I don't have direct 
control over that code
2) change the way the server allocates/prioritizes different caches - I
don't know enough about how postgres caches work to do this (if it's
possible)
3) try it on postgres 7.4 - possible, but migrating the system to 7.4 
in production will be hard because the above code that I am not 
responsible for has a lot of (slightly wacky) implicit date casts
4) ask the fine people on the mailing list for other suggestions!
--
Mark Aufflick
  e  [EMAIL PROTECTED]
  w  www.pumptheory.com (work)
  w  mark.aufflick.com (personal)
  p  +61 438 700 647
  f  +61 2 9436 4737





[PERFORM] VACUUM FULL ANALYZE PROBLEM

2005-02-14 Thread Michael Ryan S. Puncia

Hi,

I have 3 tables in the database with 80G of data; one of them is almost
40G and the remaining 2 tables have 20G each.

We use this database mainly for queries, and updating is done only
quarterly; the database performs well.  My problem is that after
updating and then running VACUUM FULL ANALYZE, vacuuming the tables
takes days to complete.  I hope someone can help me solve my problem.

Thanks


Re: [PERFORM] VACUUM FULL ANALYZE PROBLEM

2005-02-14 Thread Michael Ryan S. Puncia

But I need to do a full vacuum because I deleted some of the fields
that are no longer used and I also added other fields.  Is there
another way to speed up a full vacuum?

From: Iain [mailto:[EMAIL PROTECTED]
Sent: Tuesday, February 15, 2005 9:52 AM
To: Michael Ryan S. Puncia; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] VACUUM FULL ANALYZE PROBLEM

Hi,

Just make sure that your free space map is big enough and then do a
VACUUM ANALYZE without the FULL option.
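
For reference, a sketch of the relevant postgresql.conf settings (the
numbers are only illustrative; size them from the "free space map"
figures that VACUUM VERBOSE prints at the end of its output):

    max_fsm_pages = 2000000     # pages with reusable space to remember
    max_fsm_relations = 1000    # tables and indexes to track

Both settings take effect only at server start.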

I can imagine that database performance might not be as good as it
would be after a vacuum full, though I expect that it wouldn't make
much difference.

regards
Iain



Re: [PERFORM] VACUUM FULL ANALYZE PROBLEM

2005-02-14 Thread Rod Taylor
On Tue, 2005-02-15 at 09:34 +0800, Michael Ryan S. Puncia wrote:
> Hi,
>
> I have 3 tables in the database with 80G of data; one of them is
> almost 40G and the remaining 2 tables have 20G each.
>
> We use this database mainly for queries, and updating is done only
> quarterly; the database performs well.  My problem is that after
> updating and then running VACUUM FULL ANALYZE, vacuuming the tables
> takes days to complete.  I hope someone
I suspect the VACUUM FULL is the painful part. Try running CLUSTER on
the table or changing a column type (in 8.0) instead.
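
For example (table and index names are hypothetical; CLUSTER rewrites
the table in the order of the given index, leaving no dead space
behind):

    CLUSTER my_table_pkey ON my_table;
    ANALYZE my_table;   -- CLUSTER doesn't refresh planner statistics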
-- 




Re: [PERFORM] seq scan cache vs. index cache smackdown

2005-02-14 Thread Rod Taylor
> My concern is that this kind of testing has very little relevance to the
> real world of multiuser processing where contention for the cache becomes
> an issue.  It may be that, at least in the current situation, postgres is
> giving too much weight to seq scans based on single user, straight line

To be fair, a large index scan can easily throw the buffers out of whack
as well. An index scan on 0.1% of a table with 1 billion tuples will
have a similar impact to buffers as a sequential scan of a table with 1
million tuples.

Any solution fixing buffers should probably not take into consideration
the method being performed (do you really want to skip caching a
sequential scan of a 2-tuple table because it didn't use an index?) but
rather the volume of data involved as compared to the size of the cache.

I've often wondered if a single 1GB toasted tuple could wipe out the
buffers. I would suppose that toast doesn't bypass them.
-- 
Rod Taylor [EMAIL PROTECTED]




Re: [PERFORM] VACUUM FULL ANALYZE PROBLEM

2005-02-14 Thread Tom Lane
Iain [EMAIL PROTECTED] writes:
>> Is there another way to speed up a full vacuum?
>
> Hmmm... a full vacuum may help to re-organize the structure of modified
> tables, but whether this is significant or not is another matter.

Actually, VACUUM FULL is designed to work nicely for the situation where
a table has say 10% wasted space and you want the wasted space all
compressed out.  When there is a lot of wasted space, so that nearly all
the rows have to be moved to complete the compaction operation, VACUUM
FULL is not a very good choice.  And it simply moves rows around, it
doesn't modify the rows internally; so it does nothing at all to reclaim
space that would have been freed up by DROP COLUMN operations.

CLUSTER is actually a better bet if you want to repack a table that's
suffered a lot of updates or deletions.  In PG 8.0 you might also
consider one of the rewriting variants of ALTER TABLE.
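
A sketch of that 8.0 approach (names hypothetical; in 8.0 an ALTER TYPE
always rewrites the whole table, so retyping a column to the type it
already has compacts the table as a side effect):

    ALTER TABLE my_table ALTER COLUMN some_col TYPE text;  -- some_col is already text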

regards, tom lane



Re: [PERFORM] seq scan cache vs. index cache smackdown

2005-02-14 Thread Christopher Browne
The world rejoiced as [EMAIL PROTECTED] (Mark Aufflick) wrote:
> Hi All,
>
> I have boiled my situation down to the following simple case:
> (postgres version 7.3)
>
> * Query 1 is doing a sequential scan over a table (courtesy of field
> ILIKE 'foo%') and index joins to a few others
> * Query 2 is doing a functional index scan over the same table
> (lower(field) LIKE 'foo%') and index joins to a few others
> * neither query has an order by clause
> * for the purpose of testing, both queries are designed to return the
> same result set
>
> Obviously Q2 is faster than Q1, but if I ever run them both at the
> same time (let's say I run two of Q1 and one of Q2 at the same time)
> then Q2 consistently returns WORSE times than Q1 (explain analyze
> confirms that it is using the index).
>
> My assumption is that the sequential scan is blowing the index from
> any cache it might live in, and simultaneously stealing all the disk
> IO that is needed to access the index on disk (the table has 200,000
> rows).

There's something to be said for that...

> If I simplify the case to not do the index joins (ie. operate on the
> one table only) the situation is not as dramatic, but similar.
>
> My thoughts are:
>
> 1) kill the sequential scan - but unfortunately I don't have direct
> control over that code

This is a good choice, if plausible...

> 2) change the way the server allocates/prioritizes different caches -
> I don't know enough about how postgres caches work to do this (if it's
> possible)

That's what the 8.0 cache changes did...  Patent claim issues are
leading to some changes in the prioritization, which is liable to
change in 8.0.something and 8.1.

> 3) try it on postgres 7.4 - possible, but migrating the system to 7.4
> in production will be hard because the above code that I am not
> responsible for has a lot of (slightly wacky) implicit date casts

Moving to 7.4 wouldn't materially change the situation; you'd have to
go all the way to version 8.
-- 
(format nil [EMAIL PROTECTED] cbbrowne gmail.com)
http://linuxdatabases.info/~cbbrowne/postgresql.html
Rules of the Evil Overlord #32. I will not fly into a rage and kill a
messenger who brings me bad news  just to illustrate how evil I really
am. Good messengers are hard to come by.
http://www.eviloverlord.com/



Re: [PERFORM] seq scan cache vs. index cache smackdown

2005-02-14 Thread Iain
Hi Rod,
 
> Any solution fixing buffers should probably not take into consideration
> the method being performed (do you really want to skip caching a
> sequential scan of a 2-tuple table because it didn't use an index?) but
> rather the volume of data involved as compared to the size of the cache.

Yes, in fact indexes aren't so different from tables in that regard.

It sounds like version 8 may help out anyway.

regards
Iain
