Re: [PERFORM] Delete query takes exorbitant amount of time

2005-03-25 Thread Josh Berkus
Karim,

> I guess I was wondering if there is other general tuning advice for such
> large table indexes such as increasing statistics, etc.

Well, your index use problem is being explained by Tom, Stephan and Simon; 
basically your FKed data types are incompatible for index use purposes so the 
system *can't* use an index while loading.

If you're going with the drop/load/recreate option, then I'd suggest 
increasing work_mem for the duration.  Hmmm ... or maintenance_work_mem?  
What gets used for FK checks?  Simon?
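
For example, a sketch of bumping both for the session before re-adding the 
constraint, since per the question above it isn't clear which one the FK 
re-check uses (table and constraint names are hypothetical, values 
illustrative):

set maintenance_work_mem = 524288;   -- 512MB, value is in KB
set work_mem = 262144;               -- 256MB, value is in KB
alter table child_table
  add constraint child_parent_fk
  foreign key (parent_id) references parent_table (id);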

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] Script for getting a table of reponse-time breakdown

2005-03-25 Thread Josh Berkus
Jack,

> I am thinking about how to continuously monitor the performance of a
> PostgreSQL 8 database. I am interested in two things: (1) the growth of
> tables with TOAST and indexes;

This is queryable from the system tables, if you don't mind an approximation.  
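
For example, a rough sketch against pg_class ('mytable' is a stand-in name; 
relpages times the 8K block size gives approximate bytes):

select c.relname, c.relkind, c.relpages * 8192 as approx_bytes
from pg_class c
where c.relname = 'mytable'
   or c.oid in (select reltoastrelid from pg_class where relname = 'mytable')
   or c.oid in (select indexrelid
                from pg_index i join pg_class t on t.oid = i.indrelid
                where t.relname = 'mytable');

Run it periodically and diff the output to watch growth.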

> and (2) the respond time breakdown for a 
> query.

The what?  You mean EXPLAIN ANALYZE?

> In Chapters 23 and 24 of the big manual, I found enough materials to
> teach me how to do the 1st job. But I have difficulty with the 2nd one.
> I found some script for Oracle
> (http://www.ixora.com.au/scripts/waits.htm).

Life's too short for reading Oracle docs.   Can you just explain, in 
step-by-step detail, what you want?

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Script for getting a table of reponse-time breakdown

2005-03-25 Thread Josh Berkus
Jack,

> This script can be used to focus tuning attention on the most important
> issues. It reports a breakdown of total foreground response time into
> four major categories: CPU usage, disk I/O, resource waits, and routine
> latencies. These categories are broken down further into sub-categories,
> and the component wait events are shown.

This would be very nice.  And very, very hard to build.

No, we don't have anything similar.  You can, of course, use profiling tools.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] How to improve db performance with $7K?

2005-03-26 Thread Josh Berkus
Bjoern, Josh, Steve,

> Get 12 or 16 x 74GB Western Digital Raptor S-ATA drives, one 3ware
> 9500S-12 or two 3ware 9500S-8 raid controllers with a battery backup
> unit (in case of power loss the controller saves unflushed data), a
> decent tyan board for the existing dual xeon with 2 pci-x slots and a
> matching 3U case for 12 drives (12 drives internal).

Based on both my testing and feedback from one of the WD Raptor engineers, 
Raptors are still only optimal for 90% read applications.  This makes them a 
great buy for web applications (which are 95% read usually) but a bad choice 
for OLTP applications, which sounds more like what Steve's describing.  For 
those, it would be better to get 6 quality SCSI drives than 12 Raptors.

The reason for this is that SATA still doesn't do bi-directional traffic very 
well (simultaneous read and write) and OSes and controllers simply haven't 
caught up with the drive spec and features.  WD hopes that in a year they 
will be able to offer a Raptor that performs all operations as well as a 10K 
SCSI drive, for 25% less ... but that's in the next generation of drives, 
controllers and drivers.

Steve, can we clarify that you are not currently having any performance 
issues, you're just worried about failure?   Recommendations should be based 
on whether improving application speed is a requirement ...

> Here in Germany chassis by Chenbro are quite popular, a matching one for
> your needs would be the chenbro RM312 or RM414
> (http://61.30.15.60/product/product_preview.php?pid=90 and
> http://61.30.15.60/product/product_preview.php?pid=95 respectively).

The Chenbros are nice, but kinda pricey ($800) if Steve doesn't need the 
machine to be rackable.

If your primary goal is redundancy, you may wish to consider the possibility 
of building a brand-new machine for $7k (you can do a lot of machine for 
$7000 if it doesn't have to be rackable) and re-configuring the old machine 
and using it as a replication or PITR backup.   This would allow you to 
configure the new machine with only a moderate amount of hardware redundancy 
while still having 100% confidence of staying up and running.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] Query Optimizer Failure / Possible Bug

2005-03-28 Thread Josh Berkus
Hannes,

> The query and the corresponding EXPLAIN is at
>
> http://hannes.imos.net/query.txt

The problem is that you're using a complex correlated sub-select in the SELECT 
clause:

SELECT
  d.delivery_id,
  da.article_no,
  da.amount,
  (
  SELECT
COUNT(*)
  FROM
serials s
INNER JOIN rma_ticket_serials rts ON (
s.serial_id = rts.serial_id
)
  WHERE
s.article_no  = da.article_no AND
s.delivery_id = d.delivery_id AND
rts.replace   = FALSE
  ) AS replaced_serials

This means that the planner pretty much has to iterate over the subquery, 
running it once for each row in the result set.   If you want the optimizer 
to use a JOIN structure instead, put the subselect in the FROM clause.
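
For example, a sketch of the FROM-clause form (the outer tables "delivery" and 
"delivery_article" and their join are guesses on my part, since the full query 
is only in the linked file; the subquery tables are the ones shown above):

SELECT
  d.delivery_id,
  da.article_no,
  da.amount,
  COALESCE(r.replaced, 0) AS replaced_serials
FROM delivery d
  INNER JOIN delivery_article da ON (da.delivery_id = d.delivery_id)
  LEFT JOIN (
  SELECT
    s.article_no,
    s.delivery_id,
    COUNT(*) AS replaced
  FROM
    serials s
    INNER JOIN rma_ticket_serials rts ON (s.serial_id = rts.serial_id)
  WHERE
    rts.replace = FALSE
  GROUP BY s.article_no, s.delivery_id
  ) r ON (r.article_no = da.article_no AND r.delivery_id = d.delivery_id);

That way the planner can aggregate the subquery once and hash or merge join it 
against the outer result, instead of re-running it per row.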

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] How to speed up word count in tsearch2?

2005-03-31 Thread Josh Berkus
Yudie,

> (It is the 2nd posting, maybe the 1st one didn't goes thru)
> I've tested several keyword count from 2 millions record book
> description table that indexed with tseach2 indexing.
> The result is always slow for first query attempt.

Yes, this is because your tsearch2 index is getting pushed out of RAM.   When 
the index is cached it's very, very fast but takes a long time to get loaded 
from disk.

You need to look at what else is using RAM on that machine.  And maybe buy 
more.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] [sfpug] DATA directory on network attached storage

2005-04-08 Thread Josh Berkus
Jeff,

>  Specifically is the performance of
> gigE good enough to allow postgres to perform under load with an NFS
> mounted DATA dir?  Are there other problems I haven't thought about?  Any
> input would be greatly appreciated.

The big problem with NFS-mounted data is that NFS is designed to be a lossy 
protocol; that is, sometimes bits get dropped and you just re-request the 
file.  This isn't a great idea with databases.

If we were talking SAN, then I don't see any reason why your plan wouldn't 
work.  However, what type of failure exactly are you guarding against?  How 
likely is a machine failure if its hard drives are external?

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


[PERFORM] Functionscan estimates

2005-04-08 Thread Josh Berkus
Folks,

I'm wondering if it might be useful to be able to add estimated selectivity to 
a function definition for purposes of query estimation.  Currently function 
scans automatically return a flat default 1000 estimated rows.   It seems 
like the DBA ought to be able to ALTER FUNCTION and give it a row estimate 
for planning purposes.   
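
For illustration, this is what the flat default looks like today (plan line 
abbreviated, exact costs will vary):

explain select * from generate_series(1, 5) as g(i);
 Function Scan on generate_series g  (cost=0.00..12.50 rows=1000 width=4)

... 1000 rows, no matter what the function actually returns.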

Thoughts?

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Functionscan estimates

2005-04-08 Thread Josh Berkus
Alvaro, Michael,

> > About a month ago I mentioned that I'd find that useful.  In a
> > followup, Christopher Kings-Lynne brought up the idea of a GUC
> > variable that could give hints about the expected row count.
>
> That seems pretty limited ... what happens if the query contains more
> that one SRF?

Yeah, I'd see that as a pretty bad idea too.  I don't want to tell the planner 
how many rows I expect "all functions" to return, I want to tell it how many 
*one particular* function will return.

> Maybe issuing some sort of special call to the function (say, with
> some boolean in the call info struct) on which it returns planning data;
> thus the planner can call the function itself.  The hard part would be
> figuring out how to do it without breaking backwards compatibility with
> functions that don't know how to handle that.  (And how to do it in
> plpgsql).

Or in pl/perl, or pl/python, or plsh ... doesn't sound feasible.   

My solution would be a lot simpler, since we could simply populate 
pg_proc.proestrows with "1000" by default if not changed by the DBA.  In an 
even better world, we could tie it to a table, saying that, for example, 
proestrows = my_table*0.02.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] Functionscan estimates

2005-04-10 Thread Josh Berkus
People:

(HACKERS: Please read this entire thread at 
http://archives.postgresql.org/pgsql-performance/2005-04/msg00179.php 
Sorry for crossing this over.)

> > The larger point is that writing an estimator for an SRF is frequently a
> > task about as difficult as writing the SRF itself
>
> True, although I think this doesn't necessarily kill the idea. If
> writing an estimator for a given SRF is too difficult, the user is no
> worse off than they are today. Hopefully there would be a fairly large
> class of SRFs for which writing an estimator would be relatively simple,
> and result in improved planner behavior.

For that matter, even supplying an estimate constant would be a vast 
improvement over current functionality.  I would suggest, in fact, that we 
allow the use of either a constant number, or an estimator function, in that 
column.  Among other things, this would allow implementing the constant 
number right now and the use of an estimating function later, in case we can 
do the one but not the other for 8.1.

To be more sophisticated about the estimator function, it could take a subset 
of the main function's arguments, based on $1 numbering, for example:
CREATE FUNCTION some_func ( INT, TEXT, TEXT, INT, INT ) ...
ALTER FUNCTION some_func WITH ESTIMATOR some_func_est( $4, $5 )

This would make writing estimators which would work for several functions 
easier.   Estimators would be a special type of function which would take 
any params and RETURN ESTIMATOR, which would be implicitly castable from some 
general numeric type (like INT or FLOAT).

> > I don't foresee a whole lot of use of an estimator hook designed as
> > proposed here.  In particular, if the API is such that we can only
> > use the estimator when all the function arguments are plan-time
> > constants, it's not going to be very helpful.

Actually, 95% of the time I use SRFs they are accepting constants and not row 
references.  And I use a lot of SRFs.

>
> Yes :( One approach might be to break the function's domain into pieces
> and have the estimator function calculate the estimated result set size
> for each piece. So, given a trivial function like:
>
> foo(int):
> if $1 < 10 then produce 100 rows
> else produce 1 rows
>
> If the planner has encoded the distribution of input tuples to the
> function as a histogram, it could invoke the SRF's estimator function
> for the boundary values of each histogram bucket, and use that to get an
> idea of the function's likely result set size at runtime.
>
> And yes, the idea as sketched is totally unworkable :) For one thing,
> the difficulty of doing this grows rapidly as the number of arguments to
> the function increases. But perhaps there is some variant of this idea
> that might work...
>
> Another thought is that the estimator could provide information on the
> cost of evaluating the function, the number of tuples produced by the
> function, and even the distribution of those tuples.

Another possibility would be to support default values for all estimator 
functions and have functions called in row context passed DEFAULT, thus 
leaving it up to the estimator writer to supply median values for context 
cases.  Or to simply take the "first" values and use those. 

While none of these possibilities is ideal, they are an improvement over the 
current "flat 1000" estimate.   As I said, even the ability to set a 
per-function flat constant estimate would be an improvement.

> BTW, why is this on -performance? It should be on -hackers.

'cause I spend more time reading -performance, and I started the thread.  
Crossed over now.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] 4 way JOIN using aliases

2005-04-10 Thread Josh Berkus
Keith,

> Thanks to all on the NOVICE list that gave me help I now have a query
> running that returns the results I am after. :-)  Now of course I want it
> to run faster.  Currently it clocks in at ~160ms.  I have checked over the
> indexes and I belive that the tables are indexed properly.  The largest
> table, tbl_item, only has 2000 rows.  Is it possible to reduce the time of
> this query further?  

Probably not, no.   For a 7-way join including 2 LEFT JOINs on the 
unrestricted contents of all tables, 160ms is pretty darned good.   If these 
tables were large, you'd be looking at a much longer execution time.   The 
only real way to speed it up would be to find a way to eliminate the left 
joins.  Also, PostgreSQL 8.0 might optimize this query a little better.

The only thing I can see to tweak is that the estimate on the number of rows 
in tbl_item is wrong; probably you need to ANALYZE tbl_item.   But I doubt 
that will make a difference in execution time.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] PLM pulling from CVS nightly for testing in STP

2005-04-13 Thread Josh Berkus
Mark,

> Just wanted everyone to know what we're pulling CVS HEAD nightly so it
> can be tested in STP now.  Let me know if you have any questions.

Way cool.   How do I find the PLM number?   How are you naming these?

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] Question on REINDEX

2005-04-18 Thread Josh Berkus
Bill,

> 1) When is it necessary to run REINDEX or drop/create
> an index?  All I could really find in the docs is:

If you need to VACUUM FULL, you need to REINDEX as well.  For example, if you 
drop millions of rows from a table.
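
That is, something along these lines (table name hypothetical):

vacuum full analyze big_table;   -- reclaims heap space, but tends to leave
                                 -- the indexes bloated
reindex table big_table;         -- rebuilds all of big_table's indexes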

> 2) If reindexing is necessary, how can this be done in
> a non-obtrusive way in a production environment.  Our
> database is being updated constantly.  REINDEX locks
> client apps out while in progress.  Same with "CREATE
> INDEX" when we drop/create.  The table can have over
> 10 million row.  Recreating the indexes seems to take
> hours.  This is too long to lock the client apps out.
> Is there any other solution?

Better to up your max_fsm_pages and run VACUUM regularly and 
frequently so that you don't have to REINDEX at all.
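
Roughly (numbers are illustrative only; size them from the page counts that 
VACUUM VERBOSE reports for your own databases):

# postgresql.conf, requires a postmaster restart
max_fsm_pages = 500000
max_fsm_relations = 1000

... plus a plain VACUUM ANALYZE run frequently from cron.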


-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] Question on REINDEX

2005-04-19 Thread Josh Berkus
Tambet,

> Hmm, thanks for a tip. BTW, is output of
>
> select count(1), sum(relpages) from pg_class where relkind in
> ('r','i','t')

Well, if you do that for all databases in the cluster, it's the number you 
start with.  However, setting FSM_pages to that would be assuming that you 
expected 100% of the rows to be replaced by UPDATEs or DELETEs before you 
ran VACUUM.   I generally run VACUUM a little sooner than that.
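
As a sketch, something like this per database (the 0.5 factor is arbitrary, 
just to illustrate "sooner than 100%"; add the results across databases):

select round(sum(relpages) * 0.5) as fsm_pages_guess
from pg_class
where relkind in ('r','i','t');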

See the end portion of:
http://www.powerpostgresql.com/PerfList

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] Question on REINDEX

2005-04-19 Thread Josh Berkus
Tom,

> Not at all.  What it says is that you expect 100% of the pages to have
> useful amounts of free space, which is a *much* weaker criterion.

Hmmm.  Good point.   

This seems to be another instance where my rule-of-thumb was based on false 
logic but nevertheless arrived at correct numbers.  I've seldom, if ever, set 
FSM_pages above 50% of the pages in the active database ... and never run 
out.

Hmmm ... actually, it seems like, if you are vacuuming regularly, you really 
only need to track pages that have been touched by DELETE or UPDATE.   Other 
pages would have already been vacuumed and not have any useful free space 
left.   Yes?

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[PERFORM] Bad n_distinct estimation; hacks suggested?

2005-04-19 Thread Josh Berkus
Folks,

Params:  PostgreSQL 8.0.1 on Solaris 10
Statistics = 500
(tablenames have been changed to protect NDA)

e1=# select tablename, null_frac, correlation, n_distinct from pg_stats where 
tablename = 'clickstream1' and attname = 'session_id';
  tablename   | null_frac | correlation | n_distinct
--------------+-----------+-------------+------------
 clickstream1 |         0 |    0.412034 |     378174
(2 rows)

e1=# select count(distinct session_id) from clickstream1;
  count
---------
 3174813

As you can see, n_distinct estimation is off by a factor of 10x and it's 
causing query planning problems.   Any suggested hacks to improve the 
histogram on this?

(BTW, increasing the stats to 1000 only doubles n_distinct, and doesn't solve 
the problem)

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Bad n_distinct estimation; hacks suggested?

2005-04-19 Thread Josh Berkus
333,0.00023,0.00023,0.00023,0.00023,0.00023,0.00023,0.00023,0.00023,0.0002,0.0002,0.0002,0.0002,0.0002,0.0002,0.0002,0.0002,0.0002,0.0002,0.0002,0.0002,0.0002,0.0002,0.0002,0.0002,0.0002,0.0002}
 
| 
{230,58907,88648,156764,216759,240405,264601,289047,312630,339947,364452,386486,409427,434075,455140,475759,500086,521530,544703,680376,981066,1313419,1712592,1860151,1882452,1905328,1927504,1948159,1970054,1990408,2014501,2038573,2062786,2087163,2110129,2132196,2155657,2181058,2204976,2228575,2256229,2283897,2352453,2407153,2457716,2542081,2572119,2624133,2699592,2771254,2832224,2908151,2951500,3005088,3032889,3137244,3158685,3179395,3203681,3261587,3304359,3325577,3566688,3621357,3645094,3718667,3740821,3762386,3783169,3804593,3826503,3904589,3931012,3957675,4141934,4265118,4288568,4316898,4365625,4473965,4535752,4559700,4691802,4749478,5977208,6000272,6021416,6045939,6078912,6111900,6145155,6176422,6206627,6238291,6271270,6303067,6334117,6365200,6395250,6424719,6888329}
 
| 0.41744


-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] What to do with 6 disks?

2005-04-19 Thread Josh Berkus
Jeff,

> Let's say you had a server with 6 separate 15k RPM SCSI disks, what raid
> option would you use for a standalone postgres server?
>
> a) 3xRAID1 - 1 for data, 1 for xlog, 1 for os?
> b) 1xRAID1 for OS/xlog, 1xRAID5 for data
> c) 1xRAID10 for OS/xlong/data
> d) 1xRAID1 for OS, 1xRAID10 for data
> e) .
>
> I was initially leaning towards b, but after talking to Josh a bit, I
> suspect that with only 4 disks the raid5 might be a performance detriment
> vs 3 raid 1s or some sort of split raid10 setup.

Knowing that your installation is read-heavy, I'd recommend (d), with the WAL 
on the same disk as the OS, i.e.

RAID1 2 disks OS, pg_xlog
RAID 1+0 4 disks pgdata

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Opteron vs Xeon (Was: What to do with 6 disks?)

2005-04-20 Thread Josh Berkus
Anjan,

> Does anyone have any good/bad experiences/recommendations for a 4-way
> Opteron from Sun (v40z, 6 internal drives) or HP (DL585 5 internal
> drives) models?

Last I checked, the v40z only takes 5 drives, unless you yank the cd-rom and 
get an extra disk tray.   That's the main defect of the model, the second 
being its truly phenomenal noise level.   Other than that (and price) it's an 
excellent Opteron machine.

The HPs are at root pretty good machines -- and take 6 drives, so I expect 
you're mixed up there.  However, they use HP's proprietary RAID controller 
which is seriously defective.   So you need to factor replacing the RAID 
controller into the cost.

> This is in comparison with the new Dell 6850 (it has PCIexpress, faster
> FSB 667MHz, which doesn't match up with AMD's total IO bandwidth, but
> much better than previous 6650s).

Yes, but you can still expect the 6650 to have 1/2 the performance ... or 
less ... of the above-named models.   It:
1) is Xeon 32-bit
2) uses a cheap northbridge which makes the Xeon's cache contention even worse
3) depending on the model and options, may ship with a cheap Adaptec raid card 
instead of an LSI or other good card

If all you *need* is 1/2 the performance of an Opteron box, and you can get a 
good deal, then go for it.  But don't be under the illusion that Dell is 
competitive with Sun, IBM, HP, Penguin or Microway on servers.


-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Bad n_distinct estimation; hacks suggested?

2005-04-20 Thread Josh Berkus
Tom,

Any thoughts?   This is really messing up query execution all across the 
database ...

--Josh

> Here is the stats = 100 version.   Notice that n_distinct has gone down.
>
>  schemaname |  tablename   |  attname   | null_frac | avg_width |
> n_distinct |   most_common_vals
>
> |most_common_freqs
> | histogram_bounds  |
>
> correlation

>---+- public | web_site_activity_fa |
> session_id | 0 | 8 | 96107 |
> {4393922,6049228,6026260,4394034,60341,4393810,2562999,2573850,3006299,4705
>488,2561499,4705258,3007378,4705490,60327,60352,2560950,2567640,2569852,3006
>604,4394329,2570739,2406633,2407292,3006356,4393603,4394121,6449083,2565815,
>4387881,2406770,2407081,2564340,3007328,2406578,2407295,2562813,2567603,4387
>835,71014,2566253,2566900,6103079,2289424,2407597,2567627,2568333,3457448,23
>450,23670,60743,70739,2406818,2406852,2407511,2562816,3007446,6306095,60506,
>71902,591543,1169136,1447077,2285047,2406830,2573964,6222758,61393,70955,709
>86,71207,71530,262368,2289213,2406899,2567361,2775952,3006824,4387864,623982
>5,6244853,6422152,1739,58600,179293,278473,488407,1896390,2286976,2407020,25
>46720,2677019,2984333,3006133,3007497,3310286,3631413,3801909,4366116,438802
>5}
>
> {0.0017,0.00146667,0.0013,0.0011,0.00093,0.0009,0.0008,0.0008,0.000
>73,0.00073,0.0007,0.00063,0.0006,0.0006,0.00057,0.00057,
>0.00057,0.00057,0.00057,0.00057,0.00057,0.00053,0.00
>05,0.0005,0.0005,0.0005,0.0005,0.0005,0.00047,0.00047,0.00043,0.
>00043,0.00043,0.00043,0.0004,0.0004,0.0004,0.0004,0.0004,0.00036
>6667,0.00037,0.00037,0.00037,0.00033,0.00033,0.00033
>,0.00033,0.00033,0.0003,0.0003,0.0003,0.0003,0.0003,0.0003,0.0003,0.
>0003,0.0003,0.0003,0.00027,0.00027,0.00027,0.00027,0.0002666
>67,0.00027,0.00027,0.00027,0.00027,0.00023,0.00023,0
>.00023,0.00023,0.00023,0.00023,0.00023,0.00023,0.000
>23,0.00023,0.00023,0.00023,0.00023,0.00023,0.0002333
>33,0.0002,0.0002,0.0002,0.0002,0.0002,0.0002,0.0002,0.0002,0.0002,0.0002,0.0
>002,0.0002,0.0002,0.0002,0.0002,0.0002,0.0002,0.0002}
>
> {230,58907,88648,156764,216759,240405,264601,289047,312630,339947,364452,38
>6486,409427,434075,455140,475759,500086,521530,544703,680376,981066,1313419,
>1712592,1860151,1882452,1905328,1927504,1948159,1970054,1990408,2014501,2038
>573,2062786,2087163,2110129,2132196,2155657,2181058,2204976,2228575,2256229,
>2283897,2352453,2407153,2457716,2542081,2572119,2624133,2699592,2771254,2832
>224,2908151,2951500,3005088,3032889,3137244,3158685,3179395,3203681,3261587,
>3304359,3325577,3566688,3621357,3645094,3718667,3740821,3762386,3783169,3804
>593,3826503,3904589,3931012,3957675,4141934,4265118,4288568,4316898,4365625,
>4473965,4535752,4559700,4691802,4749478,5977208,6000272,6021416,6045939,6078
>912,6111900,6145155,6176422,6206627,6238291,6271270,6303067,6334117,6365200,
>6395250,6424719,6888329}
>
> | 0.41744

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon

2005-04-20 Thread Josh Berkus
Joel,

> I did not see any marked improvement, but I don’t think my issues are
> related to the hardware.

If you won't believe it, then we certainly can't convince you.  AFAIK your bad 
view is a bad query plan made worse by the Dell's hardware problems.

> I am giving up on postgres and three developers two months of work and
> trying MYSQL.

I'd suggest testing your *whole* application and not just this one query.  And 
remember that you need to test InnoDB tables if you want transactions.

>
> I have posted several items and not got a response (not that I expect folks
> to drop everything). I want to thank everyone who has been of help and
> there are several.

Hmmm ... I see about 25 responses to some of your posts on this list.   
Including ones by some of our head developers.   That's more than you'd get 
out of a paid MSSQL support contract, I know from experience.

If you want anything more, then you'll need a "do-or-die" contract with a 
support company. If your frustration is because you can't find this kind of 
help than I completely understand ... I have a waiting list for performance 
contracts myself.  (and, if you hired me the first thing I'd tell you is to 
junk the Dell)

> I really like the environment and feel I have learned a lot in the past few
> months, but bottom line for me is speed. We bought a 30K Dell 6650 to get
> better performance. 

Would have been smart to ask on this list *before* buying the Dell, hey?  Even 
a Google of this mailing list would have been informative.

> I chose PG because MSSQL was 70K to license. I believe 
> the MYSQL will be 250.00 to license for us, but I may choose the 4k
> platinum support just to feel safe about having some one to touch base with
> in the event of an issue.

Hmmm ... you're willing to pay MySQL $4k but expect the PG community to solve 
all your problems with free advice and a couple $100 with CMD?   I sense an 
apples vs. barca loungers comparison here ...

> I am not sure I am walking away feeling real good about
> postgres, because it just should not take a rocket scientist to get it to
> work, and I used to think I was fairly smart and could figure stuff out and
> I hate admitting defeat (especially since we have everything working with
> postgres now).

While I understand your frustration (I've been frustrated more than a few 
times with issues that stump me on Linux, for example) it's extremely unfair 
to lash out at a community that has provided you a lot of free advice because 
the advice hasn't fixed everything yet.  By my reading, you first raised your 
query issue 6 days ago.  6 days is not a lot of time for getting *free* 
troubleshooting help by e-mail. Certainly it's going to take more than 6 days 
to port to MySQL.  

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon

2005-04-20 Thread Josh Berkus
Joel,

> I have MSSQL running on a 2 proc dell which until my load has increased
> (over aprx 2 years) it was just fine. I totally agree that there are better
> solutions based on this lists comments, but I have all Dell hardware now
> and resist trying different vendors just to suit Postgres. I was under the
> impression there were still issues with 64bit postgres and Linux (or at
> least were when I purchased). I believed I could make my next aquistion a
> opteron based hardware.

Yeah, sorry, the Dell stuff is a sore point with me.   You can't imagine the 
number of conversations I have that go like this:
"We're having a severe performance problem with PostgreSQL"
"What hardware/OS are you using?"
"Dell *650 with RHAS 3.0 "

BTW, which Update version is your RHAS?   If you're on Update3, you can grab 
more performance right there by upgrading to Update4.

> Again I am not at all trying to critasize any one, so please except my
> apology if I some how came across with that attitude. I am very
> disappointed at this point. My views may not be that great (although I am
> not saying that either), but they run ok on MSSQL and appear to run ok on
> MYSQL.

Yeah.  I think you'll find a few things that are vice-versa.   For that 
matter, I can point to a number of queries we run better than Oracle, and a 
number we don't.

Your particular query problem seems to stem from some bad estimates.   Can you 
post an EXPLAIN ANALYZE based on all the advice people have given you so far?

> I wish I did understand what I am doing wrong because I do not wish to
> revisit engineering our application for MYSQL.

I can imagine.  

> I would of spent more $ with Command, but he does need my data base to help
> me and I am not able to do that.

Yes.  For that matter, it'll take longer to troubleshoot on this list because 
of your security concerns.

> I agree testing the whole app is the only way to see and unfortunately it
> is a time consuming bit. I do not have to spend 4k on MYSQL, that is if I
> want to have their premium support. I can spend $250.00 a server for the
> commercial license if I find the whole app does run well. I just loaded the
> data last night and only had time to convert one view this morning. I am
> sure it is something I do not understand and not a problem with postgres. I
> also am willing to take time to get more knowledgeable, but my time is
> running out and I feel honestly stupid.

You're not.  You have a real query problem and it will require further 
troubleshooting to solve.  Some of us make a pretty handsome living solving 
these kinds of problems; it takes a lot of expert knowledge.

> It was never my intention to make you feel like I was flaming anyone
> involved. On the contrary, I feel many have taken time to look at my
> questions and given excellent advice. I know I check the archives so
> hopefully that time will help others after me.

Well, I overreacted too.   Sorry!

> I may find that revisiting the datasets is a way to make PG work, or as you
> mentioned maybe I can get some one with more knowledge to step in locally.
> I did ask Tom if he knew of anyone, maybe some one else on the list is
> aware of a professional in the Tampa FL area.

Well, Robert Treat is in Florida but I'm pretty sure he's busy full-time.

> Realistically I don't think a 30k$ Dell is a something that needs to be
> junked. I am pretty sure if I got MSSQL running on it, it would outperform
> my two proc box. I can agree it may not have been the optimal platform. My
> decision is not based solely on the performance on the 4 proc box.

Oh, certainly it's too late to buy a Sunfire or eServer instead.   You just 
could have gotten far more bang for the buck with some expert advice, that's 
all.   But don't bother with Dell support any further, they don't really have 
the knowledge to help you.

So ... new EXPLAIN ANALYZE ?

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon

2005-04-20 Thread Josh Berkus
Joel,

Ok, please try this:

ALTER TABLE tblresponseheader ALTER COLUMN clientnum SET STATISTICS 1000;
ALTER TABLE tblresponseheader ALTER COLUMN locationid SET STATISTICS 1000;
ALTER TABLE tbllocation ALTER COLUMN clientnum SET STATISTICS 1000;
ALTER TABLE tbllocation ALTER COLUMN divisionid SET STATISTICS 1000;
ALTER TABLE tbllocation ALTER COLUMN regionid SET STATISTICS 1000;
ANALYZE tblresponseheader;
ANALYZE tbllocation;

Then run the EXPLAIN ANALYZE again.   (on Linux)


-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Index bloat problem?

2005-04-21 Thread Josh Berkus
Bill,

> What about if an out-of-the-ordinary number of rows
> were deleted (say 75% of rows in the table, as opposed
> to normal 5%) followed by a 'VACUUM ANALYZE'?  Could
> things get out of whack because of that situation?

Yes.  You'd want to run REINDEX after an event like that.  As you should now.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] How can an index be larger than a table

2005-04-21 Thread Josh Berkus
David,

> What also seems weird to me is that the control table has some unique
> indexes created on it, but the data_upate_events table just has a unique
> constraint.  Will postgres use an index in the background to enforce
> this constraint?

If you somehow have a unique constraint without a unique index, something is 
seriously broken.   I suspect hacking of system tables.

Otherwise, it sounds like you have index bloat due to mass deletions.  Run 
REINDEX, or, preferably, VACUUM FULL and then REINDEX.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [PERFORM] Index bloat problem?

2005-04-21 Thread Josh Berkus
Bill,

> Honestly, this seems like an inordinate amount of
> babysitting for a production application.  I'm not
> sure if the client will be willing to accept it.

Well, then, tell them not to delete 75% of the rows in a table at once.  I 
imagine that operation brought processing to a halt, too.

If the client isn't willing to accept the consequences of their own bad data 
management, I'm not really sure what you expect us to do about it.

> Admittedly my knowledge of the inner workings of an
> RDBMS is limited, but could somebody explain to me why
> this would be so?  If you delete a bunch of rows why
> doesn't the index get updated at the same time?  

It does get updated.  What doesn't happen is the space getting reclaimed.  In 
a *normal* data situation, the dead nodes are recycled for new rows.   But 
doing a massive delete operation upsets that, and generally needs to be 
followed by a REINDEX.
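
A quick way to see whether an index needs it (names hypothetical):

select relname, relpages * 8 as approx_kb
from pg_class
where relname in ('big_table', 'big_table_pkey');

If the index is still huge relative to the table after the mass delete and a 
VACUUM, rebuild it:

reindex index big_table_pkey;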

> Is 
> this a common issue among all RDBMSs or is it
> something that is PostgreSQL specific?  

Speaking from experience, this sort of thing affects MSSQL as well, although 
the maintenance routines are different.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Index bloat problem?

2005-04-21 Thread Josh Berkus
Alex,

> REINDEX DATABASE blah
>
> supposed to rebuild all indices in the database, or must you specify
> each table individualy? (I'm asking because I just tried it and it
> only did system tables)

"DATABASE

 Recreate all system indexes of a specified database. Indexes on user tables 
are not processed. Also, indexes on shared system catalogs are skipped except 
in stand-alone mode (see below). "

http://www.postgresql.org/docs/8.0/static/sql-reindex.html
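
So for user tables you have to name them, e.g. (table name hypothetical):

REINDEX TABLE my_big_table;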

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] Index bloat problem?

2005-04-21 Thread Josh Berkus
Michael,

> Every five minutes, DBCC INDEXDEFRAG will report to the user an
> estimated percentage completed. DBCC INDEXDEFRAG can be terminated at
> any point in the process, and *any completed work is retained.*"

Keen.  Sounds like something for our TODO list.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Index bloat problem?

2005-04-21 Thread Josh Berkus
Dave,

> See http://archives.postgresql.org/pgsql-general/2005-03/msg01465.php
> for my thoughts on a non-blocking alternative to REINDEX.  I got no
> replies to that message. :-(

Well, sometimes you have to be pushy.   Say, "Hey, comments please?"

The hackers list runs about 75 posts a day; it's easy for people to lose track 
of stuff they meant to comment on.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Index bloat problem?

2005-04-21 Thread Josh Berkus
Bill,

> If I deleted 75% of the rows but had a max_fsm_pages
> setting that still exceeded the pages required (as
> indicated in VACUUM output), would that solve my
> indexing problem or would I still need to REINDEX
> after such a purge?

Depends on the performance you're expecting.   The FSM relates to the re-use 
of nodes, not to releasing space.   So after you've deleted 75% of rows, 
the index wouldn't shrink.  It just wouldn't grow when you start adding rows.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Bad n_distinct estimation; hacks suggested?

2005-04-22 Thread Josh Berkus
Marko,

> Sometimes, if the random number generator, that PostgreSQL uses,
> isn't good enough, the randomly selected pages for the statistics
> might not be random enough.
>
> Solaris is unknown to me. Maybe the used random number generator there
> isn't good enough?

Hmmm.  Good point.  Will have to test on Linux.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Bad n_distinct estimation; hacks suggested?

2005-04-22 Thread Josh Berkus

> > Solaris is unknown to me. Maybe the used random number generator there
> > isn't good enough?
>
> Hmmm.  Good point.  Will have to test on Linux.

Nope:

Linux 2.4.20:

test=# select tablename, attname, n_distinct from pg_stats where tablename = 
'web_site_activity_fa';
       tablename       |  attname   | n_distinct
-----------------------+------------+------------
 web_site_activity_fa  | session_id |     626127

test=# select count(distinct session_id) from web_site_activity_fa;
  count
---------
 3174813
(1 row)

... I think the problem is in our heuristic sampling code.  I'm not the first 
person to have this kind of a problem.  Will be following up with tests ...

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Updating table, precautions?

2005-04-22 Thread Josh Berkus
Anjan,

> This is what I used to find the table size, which probably doesn't
> include the index size. Is there a way to find out size of indexes?
>
> select relpages * 8192 as size_in_bytes from pg_class where relnamespace
> = (select oid from pg_namespace where nspname = 'public') and relname =
> 'r_itemcategory';

See the code in CVS in the "newsysviews" project in pgFoundry.   Andrew coded 
up a nice pg_user_table_storage view which gives table, index and TOAST size.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Joel's Performance Issues WAS : Opteron vs Xeon

2005-04-23 Thread Josh Berkus
Jim, Kevin,

> > Hrm... I was about to suggest that for timing just the query (and not
> > output/data transfer time) using explain analyze, but then I remembered
> > that explain analyze can incur some non-trivial overhead with the timing
> > calls. Is there a way to run the query but have psql ignore the output?
> > If so, you could use \timing.
>
> Would timing "SELECT COUNT(*) FROM (query)" work?

Just \timing would work fine; PostgreSQL doesn't return anything until it has 
the whole result set.  That's why MSSQL vs. PostgreSQL timing comparisons are 
deceptive unless you're careful:  MSSQL returns the results one block at a 
time, and reports execution time as the time required to return the *first* 
block, as opposed to Postgres which reports the time required to return the 
whole dataset.
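
For example, in a psql session (view name hypothetical):

\timing
select count(*) from some_big_view;

The elapsed time psql prints covers delivery of the whole result set, not just 
the first rows.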

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Bad n_distinct estimation; hacks suggested?

2005-04-23 Thread Josh Berkus
Greg,

> I looked into this a while back when we were talking about changing the
> sampling method. The conclusions were discouraging. Fundamentally, using
> constant sized samples of data for n_distinct is bogus. Constant sized
> samples only work for things like the histograms that can be analyzed
> through standard statistics population sampling which depends on the law of
> large numbers.

Well, unusual distributions are certainly tough.  But I think the problem 
exists even for relatively well-distributed populations.   Part of it is, I 
believe, the formula we are using:

n*d / (n - f1 + f1*n/N)

This is an estimation formula from Haas and Stokes in IBM Research Report RJ 
10025, and is called the DUJ1 formula. 
(http://www.almaden.ibm.com/cs/people/peterh/jasa3rj.pdf)  It appears to 
suck.   For example, take my table:

rows: 26million (N)
distinct values: 3.4million

I took a random sample of 1000 rows (n) from that table.   It contained:
968 values that occurred only once (f1)
981 distinct values (d)

Any human being looking at that sample would assume a large number of distinct 
values; after all, 96.8% of the values occurred only once.   But the formula 
gives us:

1000*981 / ( 1000 - 968 + ( 968 * 1000/26000000 ) ) = 30620
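
(If you want to play with the arithmetic, it's a one-liner:

select (1000 * 981)::float8
       / (1000 - 968 + 968 * 1000.0 / 26000000) as duj1_estimate;

which comes out around 30620.)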

This is obviously dramatically wrong, by a factor of 100.  The math gets worse 
as the sample size goes down:

Sample 250, 248 distinct values, 246 unique values:

250*248 / ( 250 - 246 + ( 246 * 250 / 26000000 ) ) = 15490

Even in a case with an overwhelming majority of unique values, the formula gets 
it wrong:

999 unique values in sample
998 appearing only once:

1000*999 / ( 1000 - 998 + ( 998 * 1000 / 26000000 ) ) = 490093

This means that, with a sample size of 1000, a table of 26 million rows can 
never get more than about half a million distinct values out of this formula, 
unless the column is a unique column.

Overall, our formula is inherently conservative of n_distinct.   That is, I 
believe that it is actually computing the *smallest* number of distinct 
values which would reasonably produce the given sample, rather than the 
*median* one.  This is contrary to the notes in analyze.c, which seem to 
think that we're *overestimating* n_distinct.  

This formula appears broken everywhere:

Table: 969000 rows
Distinct values: 374000
Sample Size: 1000
Unique values in sample: 938
Values appearing only once: 918

1000*938 / ( 1000 - 918 + ( 918 * 1000 / 969000 ) ) = 11308

Again, too small by a factor of 20x.   

This is so broken, in fact, that I'm wondering if we've read the paper right?  
I've perused the paper on almaden, and the DUJ1 formula appears considerably 
more complex than the formula we're using.  

Can someone whose math is more recent than calculus in 1989 take a look at 
that paper, and look at the formula toward the bottom of page 10, and see if 
we are correctly interpreting it?   I'm particularly confused as to what "q" 
and "d-sub-n" represent.  Thanks!

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Bad n_distinct estimation; hacks suggested?

2005-04-23 Thread Josh Berkus
People,

> Can someone whose math is more recent than calculus in 1989 take a look at
> that paper, and look at the formula toward the bottom of page 10, and see
> if we are correctly interpreting it?    I'm particularly confused as to
> what "q" and "d-sub-n" represent.  Thanks!

Actually, I managed to solve for these and it appears we are using the formula 
correctly.  It's just a bad formula.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [HACKERS] [PERFORM] Bad n_distinct estimation; hacks suggested?

2005-04-24 Thread Josh Berkus
Andrew,

> The math in the paper does not seem to look at very low levels of q (=
> sample to pop ratio).

Yes, I think that's the failing.   Mind you, I did more testing and found out 
that for D/N ratios of 0.1 to 0.3, the formula only works within 5x accuracy 
(which I would consider acceptable) with a sample size of 25% or more (which 
is infeasible in any large table).   The formula does work for populations 
where D/N is much lower, say 0.01.  So overall it seems to only work for 1/4 
of cases; those where n/N is large and D/N is low.   And, annoyingly, that's 
probably the population where accurate estimation is least crucial, as it 
consists mostly of small tables.

I've just developed (not original, probably, but original to *me*) a formula 
that works on populations where n/N is very small and D/N is moderate (i.e. 
0.1 to 0.4):

N * (d/n)^(sqrt(N/n))

However, I've tested it only on (n/N < 0.005 and D/N > 0.1 and D/N < 0.4) 
populations, and only 3 of them to boot.   I'd appreciate other people trying 
it on their own data populations, particularly very different ones, like 
D/N > 0.7 or D/N < 0.01.
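
For convenience, here it is as a one-liner, plugged with the clickstream 
sample from earlier in the thread (N = 26,000,000, n = 1000, d = 981), purely 
for illustration:

select 26000000 * power(981 / 1000.0, sqrt(26000000 / 1000.0))
       as est_n_distinct;

That comes out around 1.2 million, against roughly 3.2 million actual distinct 
values.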

Further, as Andrew points out we presumably do page sampling rather than 
purely random sampling so I should probably read the paper he referenced.  
Working on it now ...

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] [PERFORM] Bad n_distinct estimation; hacks suggested?

2005-04-24 Thread Josh Berkus
Folks,

> I wonder if this paper has anything that might help:
> http://www.stat.washington.edu/www/research/reports/1999/tr355.ps - if I
> were more of a statistician I might be able to answer :-)

Actually, that paper looks *really* promising.   Does anyone here have enough 
math to solve for D(sub)Md on page 6?   I'd like to test it on samples of < 
0.01%.

Tom, how does our heuristic sampling work?   Is it pure random sampling, or 
page sampling?

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] [PERFORM] Bad n_distinct estimation; hacks suggested?

2005-04-25 Thread Josh Berkus
Simon, Tom:

While it's not possible to get accurate estimates from a fixed size sample, I 
think it would be possible from a small but scalable sample: say, 0.1% of all 
data pages on large tables, up to the limit of maintenance_work_mem.  

Setting up these samples as a % of data pages, rather than a pure random sort, 
makes this more feasible; for example, a 70GB table would only need to sample 
about 9000 data pages (or 70MB).  Of course, larger samples would lead to 
better accuracy, and this could be set through a revised GUC (i.e., 
maximum_sample_size, minimum_sample_size).   

I just need a little help doing the math ... please?

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] [PERFORM] Bad n_distinct estimation; hacks suggested?

2005-04-25 Thread Josh Berkus
Guys,

> While it's not possible to get accurate estimates from a fixed size sample,
> I think it would be possible from a small but scalable sample: say, 0.1% of
> all data pages on large tables, up to the limit of maintenance_work_mem.

BTW, when I say "accurate estimates" here, I'm talking about "accurate enough 
for planner purposes" which in my experience is a range between 0.2x to 5x.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [PERFORM] half the query time in an unnecessary(?) sort?

2005-04-25 Thread Josh Berkus
Ron,

> If I have a freshly CLUSTERed table and queries that want to do a
> merge join, it seems to me that quite a bit of time is spent
> unnecessarily sorting the already-sorted table. An example such
> query I found in my log files is shown below. If I read the
> EXPLAIN ANALYZE output correctly, it's saying that roughly half
> the time (570-269 = 300 out of 670 ms) was spent sorting the
> already sorted data.

It still has to sort because the clustering isn't guaranteed to be 100%.   
However, such sorts should be very quick as they have little work to do.

Looking at your analyze, though, I think it's not the sort itself that's taking 
the time so much as the fact that the full sorted entity_id column won't fit in 
work_mem.   Try increasing it?
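
For example (the value is in KB in 8.0, and purely illustrative):

set work_mem = 65536;   -- 64MB, for this session only

... and then re-run the EXPLAIN ANALYZE.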


-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Table Partitioning: Will it be supported in Future?

2005-04-26 Thread Josh Berkus
Richard,

> I believe these are being worked on at the moment. You might want to
> search the archives of the hackers mailing list to see if the plans will
> suit your needs.

Actually, this is being discussed through the Bizgres project: 
www.bizgres.org.

However, I agree that a 1GB table is not in need of partitioning.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] [PERFORM] Bad n_distinct estimation; hacks suggested?

2005-04-26 Thread Josh Berkus
Simon,

> Could it be that we have overlooked this simple explanation and that the
> Haas and Stokes equation is actually quite good, but just not being
> applied?

That's probably part of it, but I've tried Haas and Stokes on a pure random 
sample and it's still bad, or more specifically overly conservative.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] [PERFORM] Bad n_distinct estimation; hacks suggested?

2005-04-27 Thread Josh Berkus
Mischa,

> >Perhaps I can save you some time (yes, I have a degree in Math). If I
> >understand correctly, you're trying extrapolate from the correlation
> >between a tiny sample and a larger sample. Introducing the tiny sample
> >into any decision can only produce a less accurate result than just
> >taking the larger sample on its own; GIGO. Whether they are consistent
> >with one another has no relationship to whether the larger sample
> >correlates with the whole population. You can think of the tiny sample
> >like "anecdotal" evidence for wonderdrugs.

Actually, it's more to characterize how large of a sample we need.  For 
example, if we sample 0.005 of disk pages, and get an estimate, and then 
sample another 0.005 of disk pages and get an estimate which is not even 
close to the first estimate, then we have an idea that this is a table which 
defies analysis based on small samples.   Whereas if the two estimates are < 
1.0 stdev apart, we can have good confidence that the table is easily 
estimated.  Note that this doesn't require progressively larger samples; any 
two samples would work.

> I'm with Tom though in being very wary of solutions that require even
> one-off whole table scans. Maybe we need an additional per-table
> statistics setting which could specify the sample size, either as an
> absolute number or as a percentage of the table. It certainly seems that
> where D/N ~ 0.3, the estimates on very large tables at least are way way
> out.

Oh, I think there are several other cases where estimates are way out.  
Basically the estimation method we have doesn't work for samples smaller than 
0.10.   

> Or maybe we need to support more than one estimation method.

Yes, actually.   We need 3 different estimation methods:
1 for tables where we can sample a large % of pages (say, >= 0.1)
1 for tables where we sample a small % of pages but are "easily estimated"
1 for tables which are not easily estimated but we can't afford to sample a 
large % of pages.

If we're doing sampling-based estimation, I really don't want people to lose 
sight of the fact that page-based random sampling is much less expensive than 
row-based random sampling.   We should really be focusing on methods which 
are page-based.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] Final decision

2005-04-27 Thread Josh Berkus
Joel,

> So I am planning on sticking with postgres fro our production database
> (going live this weekend).

Glad to have you.

> I did not find any resolutions to my issues with Commandprompt.com (we only
> worked together 2.5 hours).

BTW, your performance troubleshooting will continue to be hampered if you 
can't share actual queries and data structure.   I strongly suggest that you 
make a confidentiality contract with  a support provider so that you can give 
them detailed (rather than general) problem reports.

> Most of my application is working about the same speed as MSSQL server
> (unfortunately its twice the speed box, but as many have pointed out it
> could be an issue with the 4 proc dell). I spent considerable time with
> Dell and could see my drives are delivering 40 meg per sec.

FWIW, on a v40z I get 180mb/s.   So your disk array on the Dell is less than 
ideal ... basically, what you have is a more expensive box, not a faster 
one :-(

> Things I still have to make better are my settings in config, I have it set
> to no merge joins and no seq scans.

Yeah, I'm also finding that our estimator underestimates the real cost of 
merge joins on some systems.   Basically we need a sort-cost variable, 
because I've found an up to 2x difference in sort cost depending on 
architecture.

> I am going to have to use flattened history files for reporting (I saw huge
> difference here the view for audit cube took 10 minutes in explain analyze
> and the flattened file took under one second).
> I understand both of these practices are not desirable, but I am at a place
> where I have to get it live and these are items I could not resolve.

Flattening data for reporting is completely reasonable; I do it all the time.

> I believe that was totally IIS not postgres, but I am curious as to if
> using postgres odbc will put more stress on the IIS side then MSSQL did.

Actually, I think the problem may be ODBC.   Our ODBC driver is not the best 
and is currently being re-built from scratch.   Is using npgsql, a much 
higher-performance driver (for .NET), out of the question?  According to one 
company, npgsql performs better than drivers supplied by Microsoft.

> I did have a question if any folks are using two servers one for reporting
> and one for data entry what system should be the beefier?

Depends on the relative # of users.   This is often a good approach, because 
the requirements for DW reporting and OLTP are completely different.  
Basically:
OLTP: Many slow processors, disk array set up for fast writes, moderate shared 
mem, low work_mem.
DW: Few fast processors, disk array set up for fast reads, high shared mem and 
work mem.

If reporting is at least 1/4 of your workload, I'd suggest spinning that off 
to the 2nd machine before putting any clients on that machine.   That way you 
can also use the 2nd machine as a failover back-up.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Final decision

2005-04-27 Thread Josh Berkus
Dave,

> > Actually, I think the problem may be ODBC.   Our ODBC driver
> > is not the best
> > and is currently being re-built from scratch.
>
> It is? No-one told the developers...
>
> Regards, Dave
>
> [and yes, I know Joshua said Command Prompt are rewriting /their/
> driver]

OK.   Well, let's put it this way:  the v3 and v3.5 drivers will not be based 
on the current driver, unless you suddenly have a bunch of free time.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] Why is this system swapping?

2005-04-27 Thread Josh Berkus
Greg,

> In fact I think it's generally superior to having a layer like pgpool
> having to hand off all your database communication. Having to do an extra
> context switch to handle every database communication is crazy.

Although, one of their issues is that their database connection pooling is 
per-server, which means that a safety margin of pre-allocated connections 
(something they need since they get bursts of 1000 new users in a few 
seconds) has to be maintained per server, increasing the total number of 
connections.   

So a pooling system that allowed them to hold 100 free connections centrally 
rather than 10 per server might be a win.

Better would be getting some of this stuff offloaded onto database replication 
slaves.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Final decision

2005-04-27 Thread Josh Berkus
Dave, folks,

> Err, yes. But that's not quite the same as core telling us the current
> driver is being replaced.

Sorry, I spoke off the cuff.   I also was unaware that work on the current 
driver had resumed.   Us Core people are not omniscient, believe it or don't.

Mind you, having 2 different teams working on two different ODBC drivers is a 
problem for another list ...

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Distinct-Sampling (Gibbons paper) for Postgres

2005-04-28 Thread Josh Berkus

> Now, if we can come up with something better than the ARC algorithm ...

Tom already did.  His clock-sweep patch is already in the 8.1 source.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] batch inserts are "slow"

2005-05-03 Thread Josh Berkus
People,

> There are several hacks floating around that add COPY capabilities to
> the pgjdbc driver. As they all are rather simple hacks, they have not
> been included in the cvs yet, but they tend to work fine.

FWIW, Dave Cramer just added beta COPY capability to JDBC.   Contact him on 
the JDBC list for details; I think he needs testers.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] [PERFORM] Bad n_distinct estimation; hacks suggested?

2005-05-03 Thread Josh Berkus
Mischa,

> Okay, although given the track record of page-based sampling for
> n-distinct, it's a bit like looking for your keys under the streetlight,
> rather than in the alley where you dropped them :-)

Bad analogy, but funny.

The issue with page-based vs. pure random sampling is that sampling, for 
example, 10% of rows purely at random would actually mean loading about 50% of 
the pages.  With 20% of rows, you might as well scan the whole table.
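
To put a number on that (illustrative arithmetic, assuming ~7 rows per 8K page): 
a 10% random row sample leaves a given page completely untouched with 
probability 0.9^7 ~= 0.48, so you end up reading roughly half the pages to get a 
tenth of the rows.  With narrower rows (more rows per page) you touch even more 
of them.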

Unless, of course, we use indexes for sampling, which seems like a *really 
good* idea to me ...

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] [PERFORM] Bad n_distinct estimation; hacks suggested?

2005-05-03 Thread Josh Berkus
John,

> But doesn't an index only sample one column at a time, whereas with
> page-based sampling, you can sample all of the columns at once. 

Hmmm.  Yeah, we're not currently doing that though.  Another good idea ...

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Whence the Opterons?

2005-05-07 Thread Josh Berkus
A-

> On Fri, May 06, 2005 at 02:39:11PM -0700, Mischa Sandberg wrote:
> > IBM, Sun and HP have their fairly pricey Opteron systems.
>
> We've had some quite good experiences with the HP boxes.  They're not
> cheap, it's true, but boy are they sweet.

Question, though: is HP still using their proprietary RAID card?   And, if so, 
have they fixed its performance problems?

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [PERFORM] Query tuning help

2005-05-08 Thread Josh Berkus
Dan,

> and (  recordtext like '%RED%' or recordtext like '%CORVETTE%'  )  

It is simply not possible to use B-tree indexes on these kinds of text queries.  
B-trees require you to start at the "left" side of the field, because B-trees 
locate records via less-than/greater-than comparisons.  "Anywhere in the field" text search requires a 
Full Text Index.

> The reason for the redundant LIKE clause is that first, I only want
> those "incidentid"s that contain the words 'RED' and 'CORVETTE'.  BUT,
> those two words may exist across multiple records with the same
> incidentid.  Then, I only want to actually work with the rows that
> contain one of the words.  This query will repeat the same logic for
> however many keywords are entered by the user.  I have investigated
> text searching options and have not found them to be congruous with my
> application.

Sounds like you either need to restructure your application, restructure your 
database (so that you're not doing "anywhere in field" searches), or buy 32GB 
of ram so that you can cache the whole table.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Query tuning help

2005-05-08 Thread Josh Berkus
Dan,

> While I believe you, I'm confused by this line in my original EXPLAIN
>
> ANALYZE:
> >> ->  Index Scan using ea1 on ea  (cost=0.00..2736.43 rows=42 width=47)
> >> (actual time=2.085..2.309 rows=2 loops=473)
> >> Index Cond:
> >> ((ea.incidentid)::text = ("outer".incidentid)::text)
> >> Filter: (((recordtext)::text
> >> ~~ '%RED%'::text) OR ((recordtext)::text ~~ '%CORVETTE%'::text))

The index named is matching based on incidentid -- the join condition.  The 
"filter" is applied against the table rows, i.e. a scan.

> If I were to use tsearch2 for full-text indexing, would I need to
> create another table that merges all of my recordtext rows into a
> single 'text' field type?  

No.   Read the OpenFTS docs; they are fairly clear on how to set up a simple 
FTS index. (TSearch2 ~~ OpenFTS)

> If so, this is where I run into problems, as 
> my logic also needs to match multiple words in their original order.  

You do that by doubling up ... that is, use the FTS index to pick all rows 
that contain "RED" and "CORVETTE", and then check the order.  I'll also note 
that your current query is not checking word order. 

Example:
WHERE recordtext_fti @@ to_tsquery('default', 'RED & CORVETTE')
AND recordtext LIKE '%RED%CORVETTE%'

I'm doing something fairly similar on one of my projects and it works very 
well.

The limitations on TSearch2 indexes are:
1) they are expensive to update, so your data loads would be noticeably slower.  
2) they are only fast when cached in RAM (and when cached, are *very* fast).  
So if you have a variety of other processes that tend to fill up RAM between 
searches, you may find them less useful.
3) You have to create a materialized index column next to recordtext, which 
will increase the size of the table (see the sketch below).
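
For reference, a minimal TSearch2 setup along those lines might look like this 
(table and column names taken from your query; treat it as an outline and check 
the tsearch2 README for your version):

ALTER TABLE ea ADD COLUMN recordtext_fti tsvector;
UPDATE ea SET recordtext_fti = to_tsvector('default', recordtext);
CREATE INDEX ea_fti_idx ON ea USING gist (recordtext_fti);
-- keeping the column current is the part that slows down data loads
CREATE TRIGGER ea_fti_update BEFORE INSERT OR UPDATE ON ea
    FOR EACH ROW EXECUTE PROCEDURE tsearch2(recordtext_fti, recordtext);
VACUUM ANALYZE ea;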

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [PERFORM] Partitioning / Clustering

2005-05-10 Thread Josh Berkus
Alex,

> This is why I mention partitioning. It solves this issue by storing  
> different data sets on different machines under the same schema.  

That's clustering, actually.  Partitioning is simply dividing up a table into 
chunks and using the chunks intelligently.   Putting those chunks on separate 
machines is another thing entirely.  

We're working on partitioning through the Bizgres sub-project:
www.bizgres.org  / http://pgfoundry.org/projects/bizgres/
... and will be pushing it to the main PostgreSQL when we have something.

I invite you to join the mailing list.

> These seperate chunks of the table can then be replicated as well for  
> data redundancy and so on. MySQL are working on these things, 

Don't hold your breath.   MySQL, to judge by their first "clustering" 
implementation, has a *long* way to go before they have anything usable.  In 
fact, at OSCON their engineers were asking Jan Wieck for advice.

If you have $$$ to shell out, my employer (GreenPlum) has a multi-machine 
distributed version of PostgreSQL.  It's proprietary, though.  
www.greenplum.com.

If you have more time than money, I understand that Stanford is working on 
this problem:
http://www-db.stanford.edu/~bawa/

But, overall, some people on this list are very mistaken in thinking it's an 
easy problem.   GP has devoted something like 5 engineers for 3 years to 
develop their system.  Oracle spent over $100 million to develop RAC.  

> but PG   
> just has a bunch of third party extensions, I wonder why these are  
> not being integrated into the main trunk :/ 

Because it represents a host of complex functionality which is not applicable 
to most users?  Because there are 4 types of replication and 3 kinds of 
clustering and not all users want the same kind?

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Partitioning / Clustering

2005-05-10 Thread Josh Berkus
Neil,

> Sure, but that hardly makes it not "usable". Considering the price of
> RAM these days, having enough RAM to hold the database (distributed over
> the entire cluster) is perfectly acceptable for quite a few people.

The other problem, as I was told it at OSCON, was that these were not 
high-availability clusters; it's impossible to add a server to an existing 
cluster, and a server going down is liable to take the whole cluster down.  
Mind you, I've not tried that aspect of it myself; once I saw the ram-only 
rule, we switched to something else.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Partitioning / Clustering

2005-05-11 Thread Josh Berkus
David,

> It's interesting that the solution livejournal have arrived at is quite
> similar in ways to the way google is set up.

Yes, although again, they're using memcached as pseudo-clustering software, 
and as a result are limited to what fits in RAM (RAM on 27 machines, but it's 
still RAM).  And due to limitations on memcached, the whole thing blows 
whenever a server goes out (the memcached project is working on this).  But 
any LJ user could tell you that it's a low-availability system.

However, memcached (and for us, pg_memcached) is an excellent way to improve 
horizontal scalability by taking disposable data (like session information) 
out of the database and putting it in protected RAM.  On some websites, 
adding memcached can result in as much as a 60% decrease in database traffic.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Bad plan after vacuum analyze

2005-05-11 Thread Josh Berkus
Guillaume,

> We reproduced the problem on a 7.4.5 and on a 7.4.7 server.
> * we load the dump in a new database
> * query: it's fast (< 1ms)
> * VACUUM FULL ANALYZE;
> * query: it's really slow (130ms) and it's another plan
> * set enable_seqscan=off;
> * query: it's fast (< 1ms) : it uses the best plan

Looking at this, the planner seems convinced that the merge join is the 
easiest way to do the OUTER JOINS, but it appears to be wrong; a nested loop 
is faster.

This isn't the only place I've encountered our optimizer doing this -- 
underestimating the cost of a merge join.  This seems to be because the 
merge_join vs. nested_loop decision seems to be being made in the planner 
without taking the double-sort and index access costs into account.   This 
query is an excellent example:

"good" plan:
 Nested Loop Left Join  (cost=2.44..17.36 rows=1 width=5532) (actual 
time=0.441..0.466 rows=1 loops=1)
   Join Filter: ("outer".parent_application_id = "inner".application_id)
   ->  Nested Loop Left Join  (cost=2.44..15.73 rows=1 width=5214) (actual 
time=0.378..0.402 rows=1 loops=1)

See, here the planner thinks that the 2 nested loops will cost "35".  

"bad" plan:
 Merge Right Join  (cost=9.27..9.48 rows=1 width=545) (actual 
time=129.364..129.365 rows=1 loops=1)
   Merge Cond: ("outer".application_id = "inner".parent_application_id)
   ->  Index Scan using applicati_applicati_id_p_ogstm on applications t116  
(cost=0.00..5.51 rows=28 width=20) (actual time=0.030..0.073 rows=28 loops=1)
   ->  Sort  (cost=9.27..9.27 rows=1 width=529) (actual time=129.202..129.203 
rows=1 loops=1)
 Sort Key: t22.parent_application_id
 ->  Merge Right Join  (cost=8.92..9.26 rows=1 width=529) (actual 
time=129.100..129.103 rows=1 loops=1)
   Merge Cond: ("outer".object_id = "inner".parent_application_id)
   ->  Index Scan using acs_objects_object_id_p_hhkb1 on 
acs_objects t98  (cost=0.00..2554.07 rows=33510 width=81) (actual 
time=0.043..56.392 rows=33510 loops=1)
   ->  Sort  (cost=8.92..8.93 rows=1 width=452) (actual 
time=0.309..0.310 rows=1 loops=1)
 Sort Key: t22.parent_application_id

Here the planner chooses a merge right join.  This decision seems to have been 
made entirely on the basis of the cost of the join itself (total of 17) 
without taking the cost of the sort and index access (total of 2600+) into 
account.

Tom, is this a possible error in planner logic?


-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] tuning Postgres for large data import (using Copy from)

2005-05-12 Thread Josh Berkus
Marc,

> 1) use 3 different disks for:
>
>   -1: source data
>   -2: index tablespaces
>   -3: data tablespaces

Others have already told you about the importance of relocating WAL.  If you 
are going to be building indexes on the imported data, you might find it 
beneficial to relocate pgsql_tmp for the database in question as well.   
Also, I generally find it more beneficial to separate the few largest tables 
to their own disk resources than to put all tables on one resource and all 
disks on another.  For example, for TPCH-like tests, I do
array0: OS and pgsql_tmp
array1: LINEITEM
array2: LINEITEM Indexes
array3: all other tables and indexes
array4: pg_xlog
array5: source data

This allows me to load a 100G (actually 270G) TPCH-like database in < 2 hours, 
not counting index-building.

> 2) define all foreign keys as initially deferred

It would be better to drop them before import and recreate them afterwards.  
Same for indexes unless those indexes are over 2G in size.

>   max_connections =20
>   shared_buffers =3
>   work_mem = 8192

Not high enough, unless you have very little RAM.  On an 8G machine I'm using 
256MB.  You might want to use 64MB or 128MB.

>   maintenance_work_mem = 32768

REALLY not high enough.  You're going to need to build big indexes and 
possibly vacuum large tables.  I use the maximum of 1.98GB.  Use up to 1/3 of 
your RAM for this.

>   checkpoint_segments = 12

Also way too low.  Put pg_xlog on its own disk, give it 128 to 512 segments 
(up to 8G).

> The server runs RedHat and has 1GB RAM

Make sure you're running a 2.6.10+ kernel.  Make sure ext3 is set noatime, 
data=writeback.  Buy more RAM.  Etc.
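
Pulling those numbers together, a bulk-load postgresql.conf for a 1GB box might 
look something like this (8.0 units: shared_buffers and wal_buffers in 8K pages, 
the *_mem settings in KB; the values are illustrative, not gospel):

shared_buffers = 16384           # ~128MB
work_mem = 65536                 # 64MB per sort; watch concurrent sessions
maintenance_work_mem = 262144    # 256MB for index builds and vacuums
checkpoint_segments = 128        # with pg_xlog on its own disk
wal_buffers = 64
fsync = true                     # don't turn this off for data you care about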

> How does Postgres handle concurrent copy from on: same table / different
> tables ?

Same table is useless; the imports will effectively serialize  (unless you use 
pseudo-partitioning).  You can parallel load on multiple tables up to the 
lower of your number of disk channels or number of processors.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Partitioning / Clustering

2005-05-12 Thread Josh Berkus
People,

> In general I think your point is valid. Just remember that it probably
> also matters how you count page views. Because technically images are a
> separate page (and this thread did discuss serving up images). So if
> there are 20 graphics on a specific page, that is 20 server hits just
> for that one page.

Also, there's bots and screen-scrapers and RSS, web e-mails, and web services 
and many other things which create hits but are not "people".  I'm currently 
working on clickstream for a site which is nowhere in the top 100, and is 
getting 3 million real hits a day ... and we know for a fact that at least 
1/4 of that is bots.

Regardless, the strategy you should be employing for a high traffic site is 
that if your users hit the database for anything other than direct 
interaction (like filling out a webform), then you're lost.   Use memcached, 
squid, lighttpd caching, ASP.NET caching, pools, etc.   Keep the load off the 
database except for the stuff that only the database can do.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] Partitioning / Clustering

2005-05-12 Thread Josh Berkus
Ross,

> Memcached is a PG memory store, I gather,

Nope.  It's a hyperfast resident-in-memory hash that allows you to stash stuff 
like user session information and even materialized query set results.  
Thanks to SeanC, we even have a plugin, pgmemcached.

> but...what is squid, lighttpd? 
> anything directly PG-related?

No.   These are all related to making the web server do more.   The idea is 
NOT to hit the database every time you have to serve up a web page, and 
possibly not to hit the web server either.  For example, you can use squid 3 
for "reverse" caching in front of your web server, and serve far more page 
views than you could with Apache alone.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Recommendations for set statistics

2005-05-13 Thread Josh Berkus
Chris,

> It is widely believed that a somewhat larger default than 10 would be
> a "good thing," as it seems to be fairly common for 10 to be too small
> to allow statistics to be stable.  But nobody has done any formal
> evaluation as to whether it would make sense to jump from 10 to:
>
>  - 15?
>  - 20?
>  - 50?
>  - 100?
>  - More than that?

My anecdotal experience is that if more than 10 is required, you generally 
need to jump to at least 100, and more often 250.   On the other end, I've 
generally not found any difference between 400 and 1000 when it comes to 
"bad" queries.

I have an unfinished patch in the works which goes through and increases the 
stats_target for all *indexed* columns to 100 or so.   However, I still need 
to work up a test case to prove its utility.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Postgresql Performance via the LSI MegaRAID 2x Card

2005-05-13 Thread Josh Berkus
Steve,

> Past recommendations for a good RAID card (for SCSI) have been the LSI
> MegaRAID 2x. This unit comes with 128MB of RAM on-board. Has anyone
> found by increasing the on-board RAM, did Postgresql performed better?

My informal tests showed no difference between 64MB and 256MB.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] ok you all win what is best opteron (I dont want a hosed system again)

2005-05-14 Thread Josh Berkus
Joel,

> The two worst queries (our case and audit applications) I created
> denormalized files and maintain them through code. All reporting comes off
> those and it is lightning fast.

This can often be called for.  I'm working on a 400GB data warehouse right 
now, and almost *all* of our queries run from materialized aggregate tables.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Postgresql Performance via the LSI MegaRAID 2x Card

2005-05-15 Thread Josh Berkus
William,

> I'm sure there's some corner case where more memory helps.

QUite possibly.   These were not scientific tests.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] checkpoint segments

2005-05-15 Thread Josh Berkus
David,

> I've seen these messages in the log before, and am aware of the need to
> increase checkpoint_segments, but I wasn't aware that recycling a
> transaction log could be that damaging to performance. There may have
> been some local hiccup in this case, but I'm wondering if recycling is
> known to be a big hit in general, and if I should strive to tune so that
> it never happens (if that's possible)?

Yes, and yes.   Simply allocating more checkpoint segments (which can eat a 
lot of disk space -- the requirement is 16MB * (2 * checkpoint_segments + 1)) 
will prevent this problem.
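
(For example, checkpoint_segments = 64 works out to 16MB * (2 * 64 + 1), or just 
over 2GB of pg_xlog space in the worst case.)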

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] checkpoint segments

2005-05-16 Thread Josh Berkus
Alvaro,

> > Yes, and yes.   Simply allocating more checkpoint segments (which can eat
> > a lot of disk space -- requirements are 16mb*(2 * segments +1) ) will
> > prevent this problem.
>
> Hmm?  I disagree -- it will only make things worse when the checkpoint
> does occur.

Unless you allocate enough logs that you don't need to checkpoint until the 
load is over with.   In multiple tests involving large quantities of data 
loading, increasing the number of checkpoint segments and the checkpoint 
interval has been an overall benefit to load speed.   It's possible 
that the checkpoints which do occur are worse, but they're not enough worse 
to counterbalance their infrequency.

I have not yet been able to do a full scalability series on bgwriter.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] Tuning planner cost estimates

2005-05-19 Thread Josh Berkus
Jim,

> I've been doing some work to try and identify the actual costs
> associated with an index scan with some limited sucess. What's been run
> so far can be seen at http://stats.distributed.net/~decibel. But there's
> a couple problems. First, I can't use the box exclusively for this
> testing, which results in some result inconsistencies.

I can get you access to boxes.  Chat on IRC?

> Second, I've been 
> using a dataset that I can't make public, which means no one else can
> run these tests on different hardware.

Then use one of the DBT databases.

> In the
> case of testing index scans, we need to be able to vary correlation,
> which so far I've been doing by ordering by different columns. I suspect
> it will also be important to test with different tuple sizes. There's
> also the question of whether or not the cache should be flushed for each
> run or not.
>
> Does this sound like a good way to determine actual costs for index
> scans (and hopefully other access methods in the future)? If so, what
> would be a good way to implement this?

Well, the problem is that what we need for index scans is a formula, rather 
than a graph.   The usefulness of benchmarking index scan cost is so that we 
can test our formula for accuracy and precision.  However, such a formula 
*does* need to take into account concurrent activity, updates, etc ... that 
is, it needs to approximately estimate the relative cost on a live database, 
not a test one.

This is also going to be a moving target because Tom's in-memory bitmapping 
changes the relative cost equations.

I think a first step would be, in fact, to develop a tool that allows us to 
put EXPLAIN ANALYZE results in a database table.  Without that, there is no 
possibility of statistical-scale analysis.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Tuning planner cost estimates

2005-05-19 Thread Josh Berkus
Tom,

>   for rec in explain analyze ... loop
>   insert into table values(rec."QUERY PLAN");
>   end loop;

I need to go further than that and parse the results as well.  And preserve 
relationships and nesting levels.   

Hmmm ... what's the indenting formula for nesting levels?
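
If the text format holds to its apparent convention of adding 6 columns of 
indentation per nesting level (an assumption worth checking against a few real 
plans), a quick sketch would be:

create or replace function plan_depth(line text) returns integer as '
declare
    indent integer;
begin
    -- leading spaces before the node text; each nesting level seems to add 6
    indent := length(line) - length(ltrim(line, '' ''));
    return indent / 6;
end;
' language plpgsql;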

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] PostgreSQL strugling during high load

2005-05-19 Thread Josh Berkus
Anjan,

> As far as disk I/O is concerned for flushing the buffers out, I am not
> ruling out the combination of Dell PERC4 RAID card, and the RH AS 3.0
> Update3 being a problem.

You know that Update4 is out, yes?  
Update3 is currently throttling your I/O by about 50%.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] seqential vs random io

2005-05-23 Thread Josh Berkus
David,

> > I just got a question from one our QA guys who is configuring a RAID 10
> > disk that is destined to hold a postgresql database. The disk
> > configuration procedure is asking him if he wants to optimize for
> > sequential or random access. My first thought is that random is what we
> > would want, but then I started wondering if it's not that simple, and my
> > knowledge of stuff at the hardware level is, well, limited.
> >
> > If it were your QA guy, what would you tell him?

Depends on the type of database.  OLTP or Web == random access.  Data 
Warehouse == sequential access.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] Need help to decide Mysql vs Postgres

2005-05-24 Thread Josh Berkus
Amit,

> - We have lot of foreign keys between the tables

Do you need these keys to be enforced?   Last I checked, MySQL was still 
having trouble with foriegn keys.

> - Most of the DB usage is Selects. We would have some inserts but that
> would be like a nightly or a monthly process

So transaction integrity is not a real concern?   This sounds like a data 
warehouse; wanna try Bizgres?  (www.bizgres.org)

> Our only concern with going with postgres is speed. I haven't done a speed
> test yet so I can't speak. But the major concern is that the selects and
> inserts are going to be much much slower on postgres than on mysql. I dont
> know how true this is. I know this is a postgres forum so everyone will say
> postgres is better but I am just looking for some help and advise I guess

Well, the relative speed depends on what you're doing.   You want slow, try a 
transaction rollback on a large InnoDB table ;-)   PostgreSQL/Bizgres will 
also be implementing bitmapped indexes and table partitioning very soon, so 
we're liable to pull way ahead of MySQL on very large databases.

> I am not trying to start a mysql vs postgres war so please dont
> misunderstand me  I tried to look around for mysql vs postgres
> articles, but most of them said mysql is better in speed. 

Also I'll bet most of those articles were based on either website use or 
single-threaded simple-sql tests.   Not a real data warehousing situation.

It's been my personal experience that MySQL does not scale well beyond about 
75GB without extensive support from MySQL AB.   PostgreSQL more easily scales 
up to 200GB, and to as much as 1TB with tuning expertise.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] Need help to decide Mysql vs Postgres

2005-05-24 Thread Josh Berkus
Amit,

> I took a look at this. I have a few concerns with bizgres though -- I am
> using jetspeed portal engine and Hibernate as my O/R Mapping layer. I know
> for sure that they dont support bizgres. Now the question is what
> difference is there between bizgres and postgres ... I guess I will try to
> look around the website more and find out, but if there is something you
> would like to comment, that would be very helpful ...

Bizgres is PostgreSQL.   Just a different packaging of it, with some patches 
which are not yet in the main PostgreSQL.   Also, it's currently beta.

--Josh

-- 
__Aglio Database Solutions___
Josh Berkus                          Consultant
josh@agliodbs.com                    www.agliodbs.com
Ph: 415-752-2500                     Fax: 415-752-2387
2166 Hayes Suite 200                 San Francisco, CA

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [PERFORM] Select performance vs. mssql

2005-05-24 Thread Josh Berkus
Folks,

> > This gets brought up a lot. The problem is that the
> > index doesn't include
> > information about whether the current transaction
> > can see the referenced
> > row. Putting this information in the index will add
> > significant overhead
> > to every update and the opinion of the developers is
> > that this would be
> > a net loss overall.

Pretty much.  There has been discussion about allowing index-only access to 
"frozen" tables, i.e. archive partitions.  But it all sort of hinges on 
someone implementing it and testing it.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Postgresql and xeon.

2005-05-30 Thread Josh Berkus
Eric,

> What about xeon and postgresql, i have been told that
> postgresql wouldn't perform as well when running
> under xeon processors due to some cache trick that postgresql
> uses?

Search the archives of this list.   This has been discussed ad nauseam.
www.pgsql.ru

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] Index not used on join with inherited tables

2005-05-30 Thread Josh Berkus
Sebastian,

> I'm having another problem with a query that takes to long, because
> the appropriate index is not used.

PostgreSQL is not currently able to push down join criteria into UNIONed 
subselects.   It's a TODO. 

Also, if you're using inherited tables, it's unnecessary to use UNION; just 
select from the parent.
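
That is (hypothetical names), with children inheriting from a parent table:

-- scans the parent and every child table automatically
SELECT count(*) FROM measurements WHERE logdate >= '2005-01-01';
-- use ONLY when you really do want just the parent
SELECT count(*) FROM ONLY measurements WHERE logdate >= '2005-01-01';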

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Postgresql on an AMD64 machine

2005-06-07 Thread Josh Berkus
Tom,

> Obviously we'd be willing to do this work if there were convincing
> evidence it'd be worth the time.  A benchmark showing performance
> continuing to climb with increasing shared_buffers right up to the 2Gb
> limit would be reasonably convincing.  I think there is 0 chance of
> drawing such a graph with a pre-8.1 server, because of internal
> inefficiencies in the buffer manager ... but with CVS tip the story
> might be different.

Not that I've seen in testing so far.   Your improvements have, fortunately, 
eliminated the penalty for allocating too much shared buffers as far as I can 
tell (at least, allocating 70,000 when gains stopped at 15,000 doesn't seem 
to carry a penalty), but I don't see any progressive gain with increased 
buffers above the initial ideal.  In fact, with clock-sweep the shared_buffer 
curve is refreshingly flat once it reaches the required level, which will 
take a lot of the guesswork out of allocating buffers.

Regarding 2GB memory allocation, though, we *could* really use support for 
work_mem and maintenance_work_mem of > 2GB.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Resource Requirements

2005-06-13 Thread Josh Berkus
Elein,

> I've got a list of old resource requirements.
> I want to know how far off they are and if anything
> crucial is missing.  My usual recommendation is
> "as much as you can afford" so I don't usually deal
> with real numbers :)

These look very approximate.

> RAM:
> Number of connections * 2MB

That's not a bad recommendation, but not an actual requirement.   It really 
depends on how much sort_mem you need.   Could vary from 0.5mb to as much 
as 256mb per connection, depending on your application.
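
As a rough illustration (made-up numbers): 100 connections with sort_mem/work_mem 
at 16MB, each running a query that does two sorts, could transiently want over 
3GB for sorting alone, on top of shared_buffers and the OS cache.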

> Disk:
> Program and Manual 8-15MB
> Regression Tests 30MB
> Compiled Source 60-160MB

Well, my compiled source takes up 87mb, and the installed PostgreSQL seems 
to be about 41mb including WAL.  Not sure how much space the regression tests 
take.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Needed: Simplified guide to optimal memory configuration

2005-06-15 Thread Josh Berkus
Dennis,

> http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html
>
> > NOTICE:  shared_buffers is 256

For everyone's info, the current (8.0) version is at:
http://www.powerpostgresql.com/PerfList

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] How does the transaction buffer work?

2005-06-16 Thread Josh Berkus
Veikko,

> One way of doing this that I thought of was start a
> transaction, delete everything and then just dump new data in (copy
> perhaps). The old data would be usable to other transactions until I
> commit my insert. This would be the fastest way, but how much memory
> would this use?

Starting a transaction doesn't use any more memory than without one.   
Unlike Some Other Databases, PostgreSQL's transactions occur in WAL and on 
data pages, not in RAM.
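
Concretely (made-up names), the pattern you describe is just:

BEGIN;
DELETE FROM lookup_data;     -- old rows stay visible to other sessions (MVCC)
COPY lookup_data FROM '/tmp/new_lookup_data.copy';
COMMIT;                      -- readers switch to the new data here

The cost shows up as dead rows on disk rather than RAM, so plan on a VACUUM 
(and possibly a REINDEX) afterwards.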

> Will this cause performance issues on a heavily loaded 
> server with too little memory even to begin with :)

Quite possibly, but the visibility issue won't be the problem.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Needed: Simplified guide to optimal memory

2005-06-17 Thread Josh Berkus
Todd,

> I'm going only on what my engineers are telling me, but they say
> upgrading breaks a lot of source code with some SQL commands that are
> a pain to hunt down and kill. Not sure if that's true, but that's
> what I'm told.

Depends on your app, but certainly that can be true.  Oddly, 7.2 -> 8.0 is 
less trouble than 7.2 -> 7.4 because of some type casting issues which were 
resolved.

Mind you, in the past a quick "sed" script has been adequate for me to fix 
compatibility issues.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] autovacuum suggestions for 500,000,000+ row tables?

2005-06-20 Thread Josh Berkus
Alex,

> Hi, i'm trying to optimise our autovacuum configuration so that it
> vacuums / analyzes some of our larger tables better. It has been set
> to the default settings for quite some time. We never delete
> anything  (well not often, and not much) from the tables, so I am not
> so worried about the VACUUM status, but I am wary of XID wraparound
> nuking us at some point if we don't sort vacuuming out so we VACUUM
> at least once every year ;) 

I personally don't use autovacuum on very large databases.   For DW, 
vacuuming is far better tied to ETL operations or a clock schedule of 
downtime.

XID wraparound may be further away than you think.   Try checking 
pg_controldata, which will give you the current XID, and you can calculate 
how long you are away from wraparound.  I just tested a 200G data warehouse 
and figured out that we are 800 months away from wraparound, despite hourly 
ETL.
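
The arithmetic is rough but easy (numbers made up for illustration): the hard 
wraparound horizon is about two billion (2^31) transactions, so if 
pg_controldata shows NextXID around 200 million and you burn roughly 2 million 
XIDs a month, you have on the order of (2^31 - 200 million) / 2 million, i.e. 
somewhere near 970 months of headroom.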

> However not running ANALYZE for such huge 
> periods of time is probably impacting the statistics accuracy
> somewhat, and I have seen some unusually slow queries at times.
> Anyway, does anyone think we might benefit from a more aggressive
> autovacuum configuration?

Hmmm, good point, you could use autovacuum for ANALYZE only.  Just set the 
VACUUM settings preposterously high (like 10x) so it never runs.   Then it'll 
run ANALYZE only.   I generally use a threshold of 200 and a multiplier of 0.1 
for analyze; that is, re-analyze after 200 + 10% of the rows have changed.
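
For scale (just plugging into that formula): on a 500-million-row table, a 0.1 
multiplier means an automatic ANALYZE only fires after roughly 50,000,200 
changed rows.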

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] autovacuum suggestions for 500,000,000+ row tables?

2005-06-21 Thread Josh Berkus
Alex,

> Downtime is something I'd rather avoid if possible. Do you think we
> will need to run VACUUM FULL occasionally? I'd rather not lock tables
> up unless I cant avoid it. We can probably squeeze an automated
> vacuum tied to our data inserters every now and then though.

As long as your updates/deletes are less than 10% of the table for all time, 
you should never have to vacuum, barring XID wraparound.

> Is this an 8.0 thing? I don't have a pg_controldata from what I can
> see. Thats nice to hear though.

'fraid so, yes.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


[PERFORM] Configurator project launched

2005-06-21 Thread Josh Berkus
Folks,

OK, I've checked in my first code module and the configurator project is 
officially launched.  Come join us at 
www.pgfoundry.org/projects/configurator

Further communications will be on the Configurator mailing list only.

from the spec:

What is the Configurator, and Why do We Need It?
-

The Configurator is a set of Perl scripts and modules which allow users and
installation programs to write a reasonable postgresql.conf for PostgreSQL
performance based on the answers to some relatively simple questions.  Its
purpose is to provide an option between the poor-performing default
configuration, and the in-depth knowledge required for hand-tuning.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Configurator project launched

2005-06-21 Thread Josh Berkus
Greg,

> Not sure how far along you are, but I've been writing some really nifty
> extensions to DBD::Pg that allow easy querying of all the current
> run-time settings. Could be very useful to this project, seems to me. If
> you're interested in possibly using it, let me know, I can bump it up on
> my todo list.

Um, can't we just get that from pg_settings?

Anyway, I'll be deriving settings from the .conf file, since most of the 
time the Configurator will be run on a new installation.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Performance Tuning Article

2005-06-22 Thread Josh Berkus
Frank,

>   I've put together a short article and posted it online regarding
>   performance tuning PostgreSQL in general.  I believe it helps to bring
>   together the info in a easy to digest manner. I would appreciate any
>   feedback, comments, and especially any technical corrections.

Looks nice. You should mark the link to the perf tips at Varlena.com as 
"PostgreSQL 7.4" and augment it with the current version here:
www.powerpostgresql.com/PerfList
as well as the Annotated .Conf File:
www.powerpostgresql.com/Docs

For my part, I've generally seen that SATA disks still suck for read-write 
applications.   I generally rate 1 UltraSCSI = 2 SATA disks for anything but 
a 99% read application.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] parameterized LIKE does not use index

2005-06-22 Thread Josh Berkus
Kurt,

> Of course, I could modify the application and send different SQL
> depending on which case we're in or just constructing a query with a
> literal each time, but is there a way to add a hint to the SQL that
> would cause the query to be re-planned if it's a case that could use the
> index?  Or can I convince the (Perl) driver to do so?

There should be an option to tell DBD::Pg not to cache a query plan.   
Let's see ...

yes.  pg_server_prepare=0, passed to the prepare() call.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] parameterized LIKE does not use index

2005-06-23 Thread Josh Berkus
Bruno,

> I remember some discussion about delaying planning until the first
> actual query so that planning could use actual parameters to do
> the planning. If you really want to have it check the parameters
> every time, I think you will need to replan every time. I don't
> know if there is a way to save some of the prepare working while
> doing this.

That wouldn't help much in Kurt's case.   Nor in most "real" cases, which is 
why I think the idea never went anywhere.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [PERFORM] Poor index choice -- multiple indexes of the same columns

2005-06-27 Thread Josh Berkus
Karl,

> Seems to me that when there's a constant value in the query
> and an = comparision it will always be faster to use the (b-tree)
> index that's ordered first by the constant value, as then all further
> blocks are guarenteed to have a higher relevant information
> density.  At least when compared with another index that has the
> same columns in it.

That really depends on the stats.   Such a choice would *not* be 
appropriate if the < comparison was expected to return only a handful of rows 
while the = condition applied to 15% of the table.

What is your STATISTICS_TARGET for the relevant columns set to?   When's 
the last time you ran ANALYZE?  If this is all up to date, do you want to post 
the pg_stats rows for the relevant columns?

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] start time very high

2005-06-30 Thread Josh Berkus
Jean-Max,

> I have two computers, one laptop (1.5 GHz, 512 Mb RAM, 1 disk 4200)
> and one big Sun (8Gb RAM, 2 disks SCSI).

Did you run each query several times?   It looks like the index is cached 
on one server and not on the other.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Planner constants for RAM resident databases

2005-07-05 Thread Josh Berkus
Emil,

>   ->  Merge Left Join  (cost=9707.71..13993.52 rows=1276 width=161)
> (actual time=164.423..361.477 rows=49 loops=1)

That would indicate that you need to either increase your statistical 
sampling (SET STATISTICS) or your frequency of running ANALYZE, or both.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Data Warehousing Tuning

2005-07-06 Thread Josh Berkus
Paul,

> Has anyone experienced real performance gains by moving the pg_xlog
> files?

Yes.   Both for data load and on OLTP workloads, this increased write 
performance by as much as 15%.   However, you need to configure the xlog 
drive correctly, you can't just move it to a new disk.Make sure the 
other disk is dedicated exclusively to the xlog, set it forcedirectio, and 
increase your checkpoint_segments to something like 128.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Need suggestion high-level suggestion on how to solve a performance problem

2005-07-07 Thread Josh Berkus
Madison,

>    The problem comes when the user toggles a directory branch's backup
> flag (a simple check box beside the directory name). If it's a directory
> near the end of a branch it is fast enough. If they toggle a single file
> it is nearly instant. However if they toggle say the root directory, so
> every file and directory below it needs to be updated, it can take
> 500-600sec to return. Obviously this is no good.
>
>    What I need is a scheme for being able to say, essentially:
>
> UPDATE file_info_1 SET file_backup='t' WHERE file_parent_dir~'^/';

Well, from the sound of it the problem is not selecting the files to be 
updated, it's updating them.   

What I would do, personally, is *not* store an update flag for each file.  
Instead, I would store the update flag for the directory which was 
selected.  If users want to exclude certain files and subdirectories, I'd 
also include a dont_update flag.  When it's time to back up, you simply 
check the tree for the most immediate update or don't update flag above 
each file.

For the table itself, I'd consider using ltree for the directory tree 
structure.  It has some nice features which make it significantly better 
than using a delimited text field.
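
A rough sketch of that scheme with contrib/ltree (made-up table and path names):

CREATE TABLE dir_backup_flags (
    dir_path   ltree PRIMARY KEY,
    do_backup  boolean NOT NULL    -- true = back up, false = explicitly exclude
);
CREATE INDEX dir_backup_flags_gist ON dir_backup_flags USING gist (dir_path);

-- at backup time, the nearest flagged ancestor of a file's directory wins
SELECT do_backup
  FROM dir_backup_flags
 WHERE dir_path @> 'root.home.madison.projects'::ltree
 ORDER BY nlevel(dir_path) DESC
 LIMIT 1;

Toggling the root then becomes a single-row insert or update instead of a 
multi-million-row UPDATE.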

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Mount database on RAM disk?

2005-07-08 Thread Josh Berkus
Stuart,

> I'm putting together a road map on how our systems can scale as our load
> increases. As part of this, I need to look into setting up some fast
> read only mirrors of our database. We should have more than enough RAM
> to fit everything into memory. I would like to find out if I could
> expect better performance by mounting the database from a RAM disk, or
> if I would be better off keeping that RAM free and increasing the
> effective_cache_size appropriately.

If you're accessing a dedicated, read-only system with a database small 
enough to fit in RAM, it'll all be cached there anyway, at least on Linux 
and BSD.   You won't be gaining anything by creating a ramdisk.

BTW, effective_cache_size doesn't determine the amount of caching done.  It 
just informs the planner about how much of the db is likely to be cached.  The 
actual caching is up to the OS/filesystem.
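
For example, if the OS on the read-only mirror typically shows about 3GB of 
filesystem cache, setting effective_cache_size = 393216 (it's in 8K pages in 
8.0, so that's ~3GB) just tells the planner the truth; nothing is reserved or 
allocated.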

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] What is best way to stream terabytes of data into postgresql?

2005-07-21 Thread Josh Berkus
Jeff,

> Streaming being the operative word.

Not sure how much hacking you want to do, but the TelegraphCQ project is 
based on PostgreSQL:
http://telegraph.cs.berkeley.edu/telegraphcq/v0.2/

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [Bizgres-general] Re: [PERFORM] faster INSERT with possible

2005-07-27 Thread Josh Berkus
Luke,

> Well - now that I test it, it appears you are correct, temp table COPY
> bypasses WAL - thanks for pointing it out!

Right.  The problem is bypassing WAL for loading new "scratch" tables which 
aren't TEMPORARY tables.   We need to do this for multi-threaded ETL, since:
a) Temp tables can't be shared by several writers, and
b) you can't index a temp table.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[PERFORM] wal_buffer tests in

2005-07-27 Thread Josh Berkus
Folks,

I ran a wal_buffer test series.   It appears that increasing the 
wal_buffers is indeed very important for OLTP applications, potentially 
resulting in as much as a 15% average increase in transaction processing.  
What's interesting is that this is not just true for 8.1, it's true for 
8.0.3 as well.   

More importantly, 8.1 performance is somehow back up to above-8.0 levels.  
Something was broken in June that has since been fixed (this test series is based 
on July 3 CVS) but I don't know what.  Clues?

Test results are here:
http://pgfoundry.org/docman/view.php/141/79/wal_buffer_test.pdf

As always, detailed test results are available from OSDL, just use:
http://khack.osdl.org/stp/#
where # is the test number.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] Performance problems testing with Spamassassin 3.1.0 Bayes module.

2005-07-27 Thread Josh Berkus
Matt,

> After playing with various indexes and what not I simply am unable to
> make this procedure perform any better.  Perhaps someone on the list can
> spot the bottleneck and reveal why this procedure isn't performing that
> well or ways to make it better.

Well, my first thought is that this is a pretty complicated procedure for 
something you want to perform well.   Is all this logic really necessary?   
How does it get done for MySQL?

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

