Re: [PERFORM] Anything to be gained from a 'Postgres Filesystem'?

2004-10-21 Thread Aaron Werman
The intuitive thing would be to put pg into a file system. /Aaron On Thu, 21 Oct 2004 12:44:10 +0200, Leeuw van der, Tim [EMAIL PROTECTED] wrote: Hi, I guess the difference is in 'severe hacking inside PG' vs. 'some unknown amount of hacking that doesn't touch PG code'. Hacking PG

Re: [PERFORM] Simple machine-killing query!

2004-10-21 Thread Aaron Werman
Sounds like you need some way to match a subset of the data first, rather than try indices that are bigger than the data. Can you add operation indices, perhaps on the first 10 bytes of the keys in both tables or on a integer hash of all of the strings? If so you could join on the exact set

Re: [PERFORM] Free PostgreSQL Training, Philadelphia, Oct 30

2004-10-20 Thread Aaron Werman
I'm driving from Tenafly NJ and going to both sessions. If you're able to get to the George Washington Bridge (A train to 178th Street [Port Authority North] and a bus over the bridge), I can drive you down. I'm not sure right now about the return because I have confused plans to meet someone.

Re: [PERFORM] Performance vs Schemas

2004-10-19 Thread Aaron Werman
Right - if you split a table to a lot of more selective tables, it can often dramatically change the plan options (e.g. - in a single table, selectivity for a query may be 1% and require an expensive nested loop while in the more restrictive table it may match 14% of the data and do a cheaper

Re: [PERFORM] Normal case or bad query plan?

2004-10-12 Thread Aaron Werman
Makes sense. See DB2 8.2 info on their new implementation of cross column statistics. If this is common and you're willing to change code, you can fake that by adding a operation index on some hash function of both columns, and search for both columns and the hash. - Original Message -

Re: [PERFORM] Data warehousing requirements

2004-10-07 Thread Aaron Werman
Consider how the fact table is going to be used, and review hacking it up based on usage. Fact tables should be fairly narrow, so if there are extra columns beyond keys and dimension keys consider breaking it into parallel tables (vertical partitioning). Horizontal partitioning is your friend;

Re: [PERFORM] Caching of Queries

2004-10-01 Thread Aaron Werman
.) - Original Message - From: Jeff [EMAIL PROTECTED] To: Mitch Pirtle [EMAIL PROTECTED] Cc: Aaron Werman [EMAIL PROTECTED]; Scott Kirkwood [EMAIL PROTECTED]; Neil Conway [EMAIL PROTECTED]; [EMAIL PROTECTED]; Tom Lane [EMAIL PROTECTED] Sent: Monday, September 27, 2004 2:25 PM Subject: Re: [PERFORM

Re: [PERFORM] Slow update/insert process

2004-10-01 Thread Aaron Werman
Some quick notes: - Using a side effect of a function to update the database feels bad to me - how long does theSELECT into varQueryRecord md5(upc.keyp function take / what does it's explain look like? - There are a lot of non-indexed columns on that delta master table, such as

Re: [PERFORM] Caching of Queries

2004-10-01 Thread Aaron Werman
I'm not sure I understand your req fully. If the same request is repeatedly done with same parameters, you could implement a proxy web server with a croned script to purge stale pages. If there is substantially the same data being summarized, doing your own summary tables works; if accessed

Re: [PERFORM] Caching of Queries

2004-09-28 Thread Aaron Werman
- Original Message - From: Tom Lane [EMAIL PROTECTED] To: Aaron Werman [EMAIL PROTECTED] Cc: Iain [EMAIL PROTECTED]; Jim C. Nasby [EMAIL PROTECTED]; [EMAIL PROTECTED] Sent: Tuesday, September 28, 2004 9:58 AM Subject: Re: [PERFORM] Caching of Queries Aaron Werman [EMAIL PROTECTED

Re: [PERFORM] Data Warehouse Reevaluation - MySQL vs Postgres -- merge tables

2004-09-14 Thread Aaron Werman
Mark Cotner wrote: The time has come to reevaluate/rearchitect an application which I built about 3 years ago. There are no performance concerns with MySQL, but it would benefit greatly from stored procedures, views, etc. From: Mischa Sandberg [EMAIL PROTECTED] If your company is

Re: [PERFORM] Data Warehouse Reevaluation - MySQL vs Postgres -- merge tables

2004-09-14 Thread aaron werman
From: Harald Lau (Sector-X) [EMAIL PROTECTED] ... From: Mischa Sandberg [EMAIL PROTECTED] If your company is currently happy with MySQL, there probably are other (nontechnical) reasons to stick with it. I'm impressed that you'd consider reconsidering PG. I'd like to second Mischa on that

Re: [PERFORM] Looking for ideas on how to speed up warehouse loading

2004-04-23 Thread Aaron Werman
By definition, it is equivalent to: SELECT t1.id, t2.url FROM referral_temp t2 LEFT /*OUTER*/ JOIN d_referral t1 ON t2.url = t1.referral_raw_url union all SELECT null, url FROM referral_temp WHERE url is null ORDER BY 1; /Aaron - Original Message - From: Joe Conway [EMAIL PROTECTED]

Re: [PERFORM] Help with performance problems

2004-04-23 Thread Aaron Werman
Your second server has queuing (load averages are highish), only 2 processes running, and almost all cycles are idle. You need to track down your bottleneck. Have you looked at iostat/vmstat? I think it would be useful to post these, ideally both before and after full vacuum analyze. /Aaron

Re: [PERFORM] possible improvement between G4 and G5

2004-04-20 Thread Aaron Werman
There are a few things that you can do to help force yourself to be I/O bound. These include: - RAID 5 for write intensive applications, since multiple writes per synch write is good. (There is a special case for logging or other streaming sequential writes on RAID 5) - Data journaling file

Re: [PERFORM] Moving postgres to FC disks

2004-04-20 Thread Aaron Werman
I agree on not linking and adding non-SAN disk dependancy to your DB. I'm trying to understand your FS reasoning. I have never seen XFS run faster than ReiserFS in any situation (or for that matter beat any FS in performance except JFS). XFS has some nifty very large file features, but we're

Re: [PERFORM] good pc but bad performance,why?

2004-04-07 Thread Aaron Werman
fsync I'm thinking 50 inserts, if autocommiting is 50TPS = ~100 IO per second (50 WAL + checkpoint IO) = roughly the I/O rate of a single drive. Huang - Are you using a single drive for pg? If so, there is a safety problem of both the data and logs used for recovery on the same drive. If the

Re: [PERFORM] possible improvement between G4 and G5

2004-04-06 Thread Aaron Werman
- Original Message - From: Tom Lane [EMAIL PROTECTED] To: Qing Zhao [EMAIL PROTECTED] Cc: [EMAIL PROTECTED] Sent: Tuesday, April 06, 2004 1:47 AM Subject: Re: [PERFORM] possible improvement between G4 and G5 Qing Zhao [EMAIL PROTECTED] writes: We have got a G5 64-bit processor to

Re: [PERFORM] possible improvement between G4 and G5

2004-04-06 Thread Aaron Werman
- Original Message - From: Josh Berkus [EMAIL PROTECTED] To: Aaron Werman [EMAIL PROTECTED]; Qing Zhao [EMAIL PROTECTED]; Tom Lane [EMAIL PROTECTED] Cc: [EMAIL PROTECTED] Sent: Tuesday, April 06, 2004 2:52 PM Subject: Re: [PERFORM] possible improvement between G4 and G5 Aaron, I'm

Re: [PERFORM] single index on more than two coulumns a bad thing?

2004-04-05 Thread Aaron Werman
: Sunday, April 04, 2004 5:06 PM Subject: Re: [PERFORM] single index on more than two coulumns a bad thing? Hi Aaron, -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Aaron Werman Sent: vrijdag 2 april 2004 13:57 another thing that I have all over

Re: [PERFORM] PostgreSQL and Linux 2.6 kernel.

2004-04-03 Thread Aaron Werman
Almost any cross dbms migration shows a drop in performance. The engine effectively trains developers and administrators in what works and what doesn't. The initial migration thus compares a tuned to an untuned version. /Aaron - Original Message - From: Josh Berkus [EMAIL PROTECTED] To:

Re: [PERFORM] single index on more than two coulumns a bad thing?

2004-04-02 Thread Aaron Werman
another thing that I have all over the place is a hierarchy: index on grandfather_table(grandfather) index on father_table(grandfather, father) index on son_table(grandfather, father, son) almost all of my indices are composite. Are you thinking about composite indices with low cardinality

Re: [PERFORM] [ADMIN] Benchmarking postgres on Solaris/Linux

2004-03-23 Thread Aaron Werman
Are you talking about http://www.potentialtech.com/wmoran/postgresql.php#conclusion - Original Message - From: Subbiah, Stalin [EMAIL PROTECTED] To: [EMAIL PROTECTED]; Matt Clark [EMAIL PROTECTED]; Subbiah, Stalin [EMAIL PROTECTED]; 'Andrew Sullivan' [EMAIL PROTECTED]; [EMAIL PROTECTED]

Re: [PERFORM] atrocious update performance

2004-03-15 Thread Aaron Werman
Bulk updates are generally dogs (not just in pg), so I avoid doing them by doing faster selects and inserts. You can create a new table using 'create table as' to produce your target results. This is real fast - avoiding the row iteration in insert, allowing the select optimizer to run and no

Re: [PERFORM] atrocious update performance

2004-03-15 Thread Aaron Werman
The original point was about a very slow update of an entire table with a plan that looped, and over a dozen conditional indices - vs. a 'create as' in a CPU starved environment. I stand by my statement about observing the orders of magnitude difference. In theory I agree that the update should be

Re: [PERFORM] Scaling further up

2004-03-14 Thread Aaron Werman
Sorry about not chiming in before - I've been too swamped to think. I agree with most of the points, but a lot of these posts are interesting and seem to describe systems from an SA perspective to my DBA-centric view. - Original Message - From: Marty Scholes [EMAIL PROTECTED] To: [EMAIL