[PERFORM] Are piped columns indexable

2008-03-12 Thread Mark Steben
Hi all,

Just upgraded to 8.2.5.  

 

Given table t with columns a, b, c, d

  and an index on t using btree (a, b),

is this indexable?

  SELECT * FROM t WHERE a || b = '124cab'  (or whatever)

Assume a and b are defined as char(3).

 

I have tried various operator classes and so far have gotten only
sequential scans.
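
For concreteness, the setup looks like this (c and d are shown as text and
the index name is arbitrary, just to keep the example self-contained):

  CREATE TABLE t (
      a char(3),
      b char(3),
      c text,
      d text
  );
  CREATE INDEX t_a_b_idx ON t USING btree (a, b);

  -- the query in question; the plan we get today is a sequential scan
  EXPLAIN SELECT * FROM t WHERE a || b = '124cab';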

 

Thanks for your time

 

 

Mark Steben

Senior Database Administrator
@utoRevenue™
A Dominion Enterprises Company
480 Pleasant Street
Suite B200
Lee, MA 01238
413-243-4800 Home Office 
413-243-4809 Corporate Fax

[EMAIL PROTECTED]

Visit our new website at www.autorevenue.com

 


 



Re: [PERFORM] performance tools

2008-03-17 Thread Mark Steben
Toad Data Modeler from Quest Software is an E/R diagram tool that works for
us.

And it has a freeware version.

 

Mark Steben

Senior Database Administrator
@utoRevenue™
A Dominion Enterprises Company
480 Pleasant Street
Suite B200
Lee, MA 01238
413-243-4800 Home Office 
413-243-4809 Corporate Fax

[EMAIL PROTECTED]

Visit our new website at www.autorevenue.com

 



From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of sathiya psql
Sent: Monday, March 17, 2008 3:58 AM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] performance tools

 

Hi all,
I would like this thread to continue as a summary of performance tuning
tools and other Postgres-related tools.

I'll start by mentioning SchemaSpy (which I learned about from this group).
It draws an ER diagram and reports a lot of interesting information about a
Postgres database.

What other open source tools like this are available for inspecting a
Postgres database, and for fine-tuning it?

Please join in and help summarize the names and usage of these tools.

SchemaSpy itself is very easy to install and use.



[PERFORM] question on TRUNCATE vs VACUUM FULL

2008-03-18 Thread Mark Steben
Hi folks,

We are running Postgres 8.2.5.

I have 3 tables, call them A, B, and C

 

Table A houses info on all emails that have ever been created for the
purpose of being delivered to our end customers.

Big table.  About 23 million rows.

  Table B, the 'holding' table is populated with Table A key information via
an after trigger whenever Table A is updated or inserted to.

  Table C, the 'work' table is populated by function D from table B.  It is
configured exactly like table B.

  PLPGSQL Function D inserts a predefined number of rows from table B to
table C. For purposes of discussion, say 500.  

  Function D, after it does its thing, then deletes the 500 rows it
processed from table B, and ALL 500 rows from table C.
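
A bare-bones sketch of the shape of this setup (table and column names here
are hypothetical, and the real function does its processing work between
the insert and the deletes):

  -- holding table (B) and work table (C); both hold key values
  -- pointing back at table A
  CREATE TABLE table_b (a_key integer);
  CREATE TABLE table_c (a_key integer);

  CREATE OR REPLACE FUNCTION function_d(batch_size integer)
  RETURNS void AS $$
  BEGIN
      -- move one batch (say 500 rows) of keys from B to C
      INSERT INTO table_c (a_key)
      SELECT a_key FROM table_b LIMIT batch_size;

      -- ... the real work against table A happens here ...

      -- delete the processed keys from B, and ALL rows from C
      DELETE FROM table_b WHERE a_key IN (SELECT a_key FROM table_c);
      DELETE FROM table_c;
  END;
  $$ LANGUAGE plpgsql;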

 

This entire process, after a sleep period of 10 seconds, repeats itself all
day.

 

After each fifth iteration of function D, we perform a VACUUM FULL on both
tables B and C. 

   Takes less than 5 seconds.

 

In terms of transaction processing:

  Table A is processed by many transactions (some read, some update), 

  Table B is processed by

- any transaction updating or inserting to Table A via the after
trigger (insert, update)

- Function D (insert, update, delete)

  Table C is processed ONLY by function D (insert, update, delete).  Nothing
else touches it; the pg_locks table verifies that this table is completely
free of any transaction between iterations of function D.

 

So my question is this:  Shouldn't VACUUM FULL clean Table C and reclaim all
its space?

It doesn't.  It usually reports the same number of pages before and after
the Vacuum.

We have to resort to TRUNCATE to clean and reclaim this table, which must be
empty at the beginning of function D.

 

Any insights appreciated. Thanks,

 

Mark Steben

Senior Database Administrator
@utoRevenue™
A Dominion Enterprises Company
480 Pleasant Street
Suite B200
Lee, MA 01238
413-243-4800 Home Office 
413-243-4809 Corporate Fax

[EMAIL PROTECTED]

Visit our new website at www.autorevenue.com

 


 



Re: [PERFORM] question on TRUNCATE vs VACUUM FULL

2008-03-19 Thread Mark Steben

I know what VACUUM FULL and TRUNCATE are supposed to do.

My confusion lies in the fact that we empty table C after function D
finishes.  There isn't any current data left to touch in the table; the
MVCC leftovers are purely dead rows that should be removed.  Given this,
I thought that VACUUM FULL and TRUNCATE should produce exactly the same
result.
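
To illustrate with a stand-in name for table C (call it table_c), the
before/after comparison is roughly:

  -- size as reported in the catalog (as of the last VACUUM / ANALYZE)
  SELECT relname, relpages, reltuples
    FROM pg_class
   WHERE relname = 'table_c';

  -- what we observe: VACUUM FULL VERBOSE reports essentially the same
  -- number of pages before and after
  VACUUM FULL VERBOSE table_c;

  -- what we have to do instead to actually reclaim the space
  TRUNCATE table_c;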

I've attached my original memo to the bottom.



-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Chris
Sent: Tuesday, March 18, 2008 9:11 PM
To: Mark Steben
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] question on TRUNCATE vs VACUUM FULL


> 
> So my question is this:  Shouldn't VACUUM FULL clean Table C and reclaim 
> all its space?

You've got concepts mixed up.

TRUNCATE deletes all of the data from a particular table (and works in 
all dbms's).

http://www.postgresql.org/docs/8.3/interactive/sql-truncate.html



VACUUM FULL is a postgres-specific thing which does work behind the 
scenes to clean up MVCC left-overs. It does not touch any current data 
or records in the table, it's purely behind the scenes work.

http://www.postgresql.org/docs/current/interactive/sql-vacuum.html


The two have completely different uses and nothing to do with each other 
what-so-ever.

-- 
Postgresql & php tutorials
http://www.designmagick.com/

[Mark Steben] 

Table A houses info on all emails that have ever been created for the
purpose of being delivered to our end customers.

Big table.  About 23 million rows.

  Table B, the 'holding' table is populated with Table A key information via
an after trigger whenever Table A is updated or inserted to.

  Table C, the 'work' table is populated by function D from table B.  It is
configured exactly like table B.

  PLPGSQL Function D inserts a predefined number of rows from table B to
table C. For purposes of discussion, say 500.  

  Function D, after it does its thing, then deletes the 500 rows it
processed from table B, and ALL 500 rows from table C.

 

This entire process, after a sleep period of 10 seconds, repeats itself all
day.

 

After each fifth iteration of function D, we perform a VACUUM FULL on both
tables B and C. 

   Takes less than 5 seconds.

 

In terms of transaction processing:

  Table A is processed by many transactions (some read, some update), 

  Table B is processed by

- any transaction updating or inserting to Table A via the after
trigger (insert, update)

- Function D (insert, update, delete)

  Table C is processed ONLY by function D (insert, update, delete).  Nothing
else touches it; the pg_locks table verifies that this table is completely
free of any transaction between iterations of function D.






Re: [PERFORM] question on TRUNCATE vs VACUUM FULL

2008-03-19 Thread Mark Steben
Bill,
Thanks for your quick response.
We are at version 8.2.5 - we just recently upgraded from 7.4.5.
The strategy using TRUNCATE was implemented just yesterday.
Now I will revisit the VACUUM FULL strategy; it does seem to be redundant.
Is there a procedure for reporting a bug?  Is there someone, or an email
address, that I could bring evidence to?


Mark Steben
Senior Database Administrator
@utoRevenue™
A Dominion Enterprises Company
480 Pleasant Street
Suite B200
Lee, MA 01238
413-243-4800 Home Office 
413-243-4809 Corporate Fax
[EMAIL PROTECTED]

Visit our new website at 
www.autorevenue.com
 


-Original Message-
From: Bill Moran [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, March 19, 2008 9:35 AM
To: Mark Steben
Cc: 'Chris'; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] question on TRUNCATE vs VACUUM FULL

In response to "Mark Steben" <[EMAIL PROTECTED]>:
> 
> I know what VACUUM FULL and TRUNCATE are supposed to do.

Then why do you keep doing the vacuum full?  Doesn't really make
sense as a maintenance strategy.

> My confusion lies in the fact that we empty table C after function D
> finishes.  There isn't any current data left to touch in the table; the
> MVCC leftovers are purely dead rows that should be removed.  Given this,
> I thought that VACUUM FULL and TRUNCATE should produce exactly the same
> result.

I would expect so as well.  You may want to mention which version
of PostgreSQL you are using, because it sounds like a bug.  If it's
an old version, you probably need to upgrade.  If it's a recent
version and you can reproduce this behaviour, you probably need
to approach this like a bug report.

[PERFORM] Performance of archive logging in a PITR restore

2009-03-16 Thread Mark Steben
First of all, I did pose this question first on the pgsql-admin mailing
list, and I know it is not appreciated to post across multiple mailing
lists, so I apologize in advance.  I do not make it a practice to do so,
but this being a performance issue I think I should have inquired on this
list first.  Rest assured I won't double post again.

 

The issue is that during a restore on a remote site (Postgres 8.2.5),
archived logs are taking an average of 35 - 40 seconds apiece to restore.
This is roughly the same speed at which they are being archived on the
production site.  I compress the logs when I copy them over, then
uncompress them during the restore using a cat | gzip -dc command.  I don't
think the bottleneck is in that command - a log is typically uncompressed
and copied in less than 2 seconds when I do this manually.  Also, when I
pass a log that is already uncompressed, the performance improves by only
about 10 percent.
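
For reference, the relevant line in recovery.conf looks roughly like this
(the archive directory and the .gz suffix here are placeholders; it follows
the cat | gzip -dc pattern described above):

  # recovery.conf on the remote site
  restore_command = 'cat /pg_archive/%f.gz | gzip -dc > %p'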

 

A log compresses with gzip down to between 5.5 and 6.5 MB.

I have attempted increases in shared_buffers (250MB to 1500MB).

Other relevant (I think) config parameters include:

   maintenance_work_mem (300MB)
   work_mem (75MB)
   wal_buffers (48)
   checkpoint_segments (32)
   autovacuum (off)

 

  

ipcs -l

-- Shared Memory Limits --
max number of segments = 4096
max seg size (kbytes) = 4194303
max total shared memory (kbytes) = 1073741824
min seg size (bytes) = 1

-- Semaphore Limits --
max number of arrays = 128
max semaphores per array = 250
max semaphores system wide = 32000
max ops per semop call = 32
semaphore max value = 32767

-- Messages: Limits --
max queues system wide = 16
max size of message (bytes) = 65536
default max size of queue (bytes) = 65536

 

Our database size is about 130 GB.  We use tar to back up the file
structure.  It takes roughly an hour to extract the tarball before PITR log
recovery begins.  The tarball itself is 31 GB compressed.

 

Again, I apologize for the annoying double posting, but I am pretty much
out of ideas to make this work.

 

 

 

Mark Steben│Database Administrator│ 

@utoRevenue® - "Join the Revenue-tion"
95 Ashley Ave. West Springfield, MA., 01089 
413-243-4800 x1512 (Phone) │ 413-732-1824 (Fax)

@utoRevenue is a registered trademark and a division of Dominion Enterprises

 



[PERFORM] Performance discrepancy

2009-06-16 Thread Mark Steben





[PERFORM] Performance issue - 2 linux machines, identical configs, different performance

2009-06-17 Thread Mark Steben
Hi, sorry about the blank post yesterday - let's try again

 

We have two machines, both running Red Hat Linux and both running Postgres
8.2.5.  Both have nearly identical 125 GB databases; in fact we use PITR
recovery to replicate from one to the other.  The machine we replicate to
runs a query with about 10 inner and left joins about 5 times slower than
the original machine.

I ran an EXPLAIN on both.  The Machine1 (original) planner favors hash
joins about 3 to 1 over nested loop joins.  Machine2 (replicated) uses only
nested loop joins - no hash joins at all.

 

A few details - I can always provide more

 

 MACHINE1 - original:

   TOTAL RAW MEMORY - 30 GB
   TOTAL SHARED MEMORY (shmmax value) - 4 GB

   Database configs:
     shared_buffers - 1525 MB
     max_prepared_transactions - 5
     work_mem - 300 MB
     maintenance_work_mem - 512 MB
     max_fsm_pages - 3,000,000
     checkpoint_segments - 64
     wal_buffers - 768
     effective_cache_size - 2 GB
     Planner method configs all turned on by default, including
     enable_hashjoin

 MACHINE2 - we run 2 postgres instances.  Port 5433 runs continuous PITR
 recoveries; port 5432 receives the 'latest and greatest' database when
 port 5433 finishes a recovery.

   TOTAL RAW MEMORY - 16 GB (this is a VMWARE setup on a netapp)
   TOTAL SHARED MEMORY (shmmax value) - 4 GB

   Database configs - port 5432 instance:
     shared_buffers - 1500 MB
     max_prepared_transactions - 1 (we don't run prepared transactions here)
     work_mem - 300 MB
     maintenance_work_mem - 100 MB (don't think this comes into play in
       this conversation)
     max_fsm_pages - 1,000,000
     checkpoint_segments - 32
     wal_buffers - 768
     effective_cache_size - 2 GB
     Planner method configs all turned on by default, including
     enable_hashjoin

   Database configs - port 5433 instance:
     shared_buffers - 1500 MB
     max_prepared_transactions - 1 (we don't run prepared transactions here)
     work_mem - 250 MB
     maintenance_work_mem - 100 MB (don't think this comes into play in
       this conversation)
     max_fsm_pages - 1,000,000
     checkpoint_segments - 32
     wal_buffers - 768
     effective_cache_size - 2 GB
     Planner method configs all turned on by default, including
     enable_hashjoin

 

   Now some size details about the 11 tables involved in the join.

   All join fields are indexed unless otherwise noted, and are of type
   integer unless otherwise noted.

   TABLE1  -       398 pages
   TABLE2  -     5,014 pages  INNER JOIN on TABLE1
   TABLE3  -    34,729 pages  INNER JOIN on TABLE2
   TABLE4  - 1,828,000 pages  INNER JOIN on TABLE2
   TABLE5  - 1,838,000 pages  INNER JOIN on TABLE4
   TABLE6  -   122,500 pages  INNER JOIN on TABLE4
   TABLE7  -       621 pages  INNER JOIN on TABLE6
   TABLE8  -         4 pages  INNER JOIN on TABLE7 (TABLE7 column not indexed)
   TABLE9  -         2 pages  INNER JOIN on TABLE8 (TABLE8 column not indexed)
   TABLE10 -        13 pages  LEFT JOIN on TABLE6 (columns on both tables are
                              text, neither column indexed)
   TABLE11 - 1,976,430 pages  LEFT JOIN on TABLE5 AND explicit join on TABLE6

   The WHERE clause filters the primary key values from TABLE1 to 1 value
   and a 1 month range of indexed dates from TABLE4.

 

 So, my guess is that the disparity in performance (40 seconds vs 180
 seconds) has to do with MACHINE2 not availing itself of hash joins, which
 by my understanding are much faster.
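
For what it's worth, a quick way to double-check that the planner-related
settings really do match on the two instances (just a sample of settings):

  -- run on both machines and compare
  SELECT name, setting
    FROM pg_settings
   WHERE name IN ('enable_hashjoin', 'enable_nestloop', 'work_mem',
                  'effective_cache_size', 'random_page_cost',
                  'default_statistics_target');

  -- then run EXPLAIN ANALYZE for the 11-table join on each machine and
  -- compare the join methods the planner chooses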

 

Any help / insight appreciated.  Thank you

 

  

   

 

 

 

Mark Steben│Database Administrator│ 

@utoRevenue® - "Join the Revenue-tion"
95 Ashley Ave. West Springfield, MA., 01089 
413-243-4800 x1512 (Phone) │ 413-732-1824 (Fax)

@utoRevenue is a registered trademark and a division of Dominion Enterprises

 



Re: [PERFORM] Performance issue - 2 linux machines, identical configs, different performance

2009-06-17 Thread Mark Steben
Yes I analyze after each replication.

Mark Steben│Database Administrator│ 

@utoRevenue® - "Join the Revenue-tion"
95 Ashley Ave. West Springfield, MA., 01089 
413-243-4800 x1512 (Phone) │ 413-732-1824 (Fax)

@utoRevenue is a registered trademark and a division of Dominion Enterprises

-Original Message-
From: pgsql-performance-ow...@postgresql.org
[mailto:pgsql-performance-ow...@postgresql.org] On Behalf Of Dave Dutcher
Sent: Wednesday, June 17, 2009 1:39 PM
To: 'Mark Steben'; pgsql-performance@postgresql.org
Cc: 'Rich Garabedian'
Subject: Re: [PERFORM] Performance issue - 2 linux machines, identical
configs, different performance


>We have two machines, both running Red Hat Linux and both running Postgres
>8.2.5.  Both have nearly identical 125 GB databases; in fact we use PITR
>recovery to replicate from one to the other.

I have to ask the obvious question.  Do you regularly analyze the machine
you replicate to?


Dave





[PERFORM] test send (recommended by Dave Page)

2010-01-27 Thread Mark Steben
Hi all - sorry to create additional email 'noise', but I've been trying to
post a rather long query to the pgsql-performance user list.  Dave thought
it might have been bounced due to its length and suggested I send a short
'blast'.

If this works I'll send a shortened version of my query later.

 

Thank you,

 

Mark Steben | Database Administrator 
@utoRevenue® - "Keeping Customers Close" (www.autorevenue.com)
95D Ashley Ave, West Springfield, MA 01089 
413.243.4800 x1512 (Phone) |413.732-1824 (Fax) 
@utoRevenue is a registered trademark and a division of Dominion
Enterprises (www.dominionenterprises.com)



 



[PERFORM] FUSION-IO io cards

2011-04-29 Thread Mark Steben
Hi,
Had a recent conversation with a tech from a company called FUSION-IO.
They sell I/O cards designed to replace conventional disks.  The cards can
be up to 3 TB in size and apparently are installed in closer proximity to
the CPU than the disks are.  They claim performance boosts several times
better than spinning disks.

Just wondering if anyone has had any experience with this company and these
cards.  We're currently at postgres 8.3.11.

Any insights / recommendations appreciated.  Thank you,

-- 

Mark Steben
Database Administrator
@utoRevenue | Autobase | AVV
The CRM division of Dominion Dealer Solutions
95D Ashley Avenue
West Springfield,  MA  01089
t: 413.327.3045
f: 413.732.1824
w: www.autorevenue.com