The intuitive thing would be to put pg into a file system.
/Aaron
On Thu, 21 Oct 2004 12:44:10 +0200, Leeuw van der, Tim
[EMAIL PROTECTED] wrote:
Hi,
I guess the difference is in 'severe hacking inside PG' vs. 'some unknown amount of
hacking that doesn't touch PG code'.
Hacking PG
Sounds like you need some way to match a subset of the data first,
rather than trying indices that are bigger than the data. Can you add
functional indices, perhaps on the first 10 bytes of the keys in both
tables, or on an integer hash of all of the strings? If so, you could
join on the exact set
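For instance, a sketch of the hash variant (I'm borrowing d_referral and
referral_temp from the example further down, so treat the names as
illustrative; hashtext() is pg's built-in string hash, and substr(key, 1, 10)
would work the same way for the prefix variant):

create index d_referral_hash on d_referral (hashtext(referral_raw_url));
create index referral_temp_hash on referral_temp (hashtext(url));

select t1.id, t2.url
from referral_temp t2
join d_referral t1
  on hashtext(t1.referral_raw_url) = hashtext(t2.url)
 and t1.referral_raw_url = t2.url;  -- recheck the full strings to discard hash collisions

The narrow integer comparison keeps the index small and the join cheap; the
equality recheck costs almost nothing on the few candidate rows.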
I'm driving from Tenafly NJ and going to both sessions. If you're able
to get to the George Washington Bridge (A train to 178th Street [Port
Authority North] and a bus over the bridge), I can drive you down. I'm
not sure right now about the return because I have confused plans to
meet someone.
Right - if you split a table into a lot of more selective tables, it can often
dramatically change the plan options (e.g., in a single table, selectivity
for a query may be 1% and require an expensive nested loop, while in the more
restrictive table it may match 14% of the data and do a cheaper
Makes sense. See the DB2 8.2 info on their new implementation of cross-column
statistics. If this is common and you're willing to change code, you can
fake that by adding a functional index on some hash function of both columns,
and searching for both columns and the hash.
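Something like this, say (t, col_a, col_b are made-up names; the '|'
separator just keeps 'ab'+'c' from colliding with 'a'+'bc'):

create index t_ab_hash on t (hashtext(col_a || '|' || col_b));

select *
from t
where hashtext(col_a || '|' || col_b) = hashtext('foo' || '|' || 'bar')
  and col_a = 'foo'
  and col_b = 'bar';  -- recheck the real columns; the hash only narrows the scan

Since pg gathers statistics on index expressions, the single hash column
gives the planner an honest selectivity estimate for the column pair, which
is roughly what cross-column statistics would buy you.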
- Original Message -
Consider how the fact table is going to be used, and review hacking it up
based on usage. Fact tables should be fairly narrow, so if there are extra
columns beyond keys and dimension keys consider breaking it into parallel
tables (vertical partitioning).
Horizontal partitioning is your friend;
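A sketch of both cuts, with every name invented:

-- vertical: keep the fact table narrow
create table fact_slim as
  select fact_id, date_key, store_key, amount from fact_wide;
create table fact_notes as
  select fact_id, long_comment from fact_wide;

-- horizontal: one table per period
create table fact_2004q3 as
  select * from fact_slim where date_key between 20040701 and 20040930;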
- Original Message -
From: Jeff [EMAIL PROTECTED]
To: Mitch Pirtle [EMAIL PROTECTED]
Cc: Aaron Werman [EMAIL PROTECTED]; Scott Kirkwood
[EMAIL PROTECTED]; Neil Conway [EMAIL PROTECTED];
[EMAIL PROTECTED]; Tom Lane [EMAIL PROTECTED]
Sent: Monday, September 27, 2004 2:25 PM
Subject: Re: [PERFORM] Caching of Queries
Some quick notes:
- Using a side effect of a function to update the
database feels bad to me
- How long does the SELECT INTO varQueryRecord
md5(upc.keyp
function take / what does its EXPLAIN look
like?
- There are a lot of non-indexed columns on that
delta master table, such as
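To answer the EXPLAIN question when the statement is buried in a function,
pull it out and run it standalone - something like this, guessing at the
shape since the snippet above is cut off:

explain analyze
select md5(keyp::text)
from upc;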
I'm not sure I understand your requirement fully. If the same request is
repeatedly made with the same parameters, you could implement a proxy web
server with a cron'd script to purge stale pages. If it is substantially the
same data being summarized, doing your own summary tables works; if accessed
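By summary tables I mean something like this (raw_hits and daily_hits are
invented names), rebuilt from cron as often as your staleness tolerance
allows:

create table daily_hits as
  select date_trunc('day', hit_time) as day, count(*) as hits
  from raw_hits
  group by 1;

-- then, from cron, e.g. nightly:
truncate table daily_hits;
insert into daily_hits
  select date_trunc('day', hit_time), count(*)
  from raw_hits
  group by 1;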
- Original Message -
From: Tom Lane [EMAIL PROTECTED]
To: Aaron Werman [EMAIL PROTECTED]
Cc: Iain [EMAIL PROTECTED]; Jim C. Nasby [EMAIL PROTECTED];
[EMAIL PROTECTED]
Sent: Tuesday, September 28, 2004 9:58 AM
Subject: Re: [PERFORM] Caching of Queries
Aaron Werman [EMAIL PROTECTED] writes:
Mark Cotner wrote:
The time has come to reevaluate/rearchitect an
application which I built about 3 years ago. There
are no performance concerns with MySQL, but it would
benefit greatly from stored procedures, views, etc.
From: Mischa Sandberg [EMAIL PROTECTED]
If your company is
From: Harald Lau (Sector-X) [EMAIL PROTECTED]
...
From: Mischa Sandberg [EMAIL PROTECTED]
If your company is currently happy with MySQL, there probably are
other (nontechnical) reasons to stick with it. I'm impressed that
you'd consider reconsidering PG.
I'd like to second Mischa on that
By definition, it is equivalent to:
SELECT t1.id, t2.url FROM referral_temp t2 LEFT /*OUTER*/ JOIN d_referral t1
ON t2.url = t1.referral_raw_url
WHERE t2.url IS NOT NULL
union all
SELECT null, url FROM referral_temp WHERE url is null
ORDER BY 1;
/Aaron
- Original Message -
From: Joe Conway [EMAIL PROTECTED]
Your second server has queuing (load averages are highish), only 2 processes
running, and almost all cycles are idle. You need to track down your
bottleneck. Have you looked at iostat/vmstat? I think it would be useful to
post these, ideally both before and after a full vacuum analyze.
/Aaron
There are a few things that you can do to help force yourself to be I/O
bound. These include:
- RAID 5 for write-intensive applications, since multiple writes per sync
write is good. (There is a special case for logging or other streaming
sequential writes on RAID 5.)
- Data journaling file
I agree on not linking and adding non-SAN disk
dependency to your DB. I'm trying to understand your FS reasoning. I have
never seen XFS run faster than ReiserFS in any situation (or, for that
matter, beat any FS in performance except JFS). XFS has some nifty
very-large-file features, but we're
fsync. I'm thinking 50 inserts, if autocommitting, is 50 TPS = ~100 IOs per
second (50 WAL writes + checkpoint IO) = roughly the I/O rate of a single
drive.
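That is also the argument for batching: one transaction means one
synchronous WAL flush for the whole batch instead of one per row. A trivial
illustration (table name invented):

create table t (id int);

begin;                      -- without this, each insert is its own commit + fsync
insert into t values (1);
insert into t values (2);
-- ... 48 more single-row inserts ...
commit;                     -- one synchronous WAL flush covers all 50 rows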
Huang - Are you using a single drive for pg? If so, there is a safety
problem in having both the data and the logs used for recovery on the same
drive. If the
- Original Message -
From: Tom Lane [EMAIL PROTECTED]
To: Qing Zhao [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Tuesday, April 06, 2004 1:47 AM
Subject: Re: [PERFORM] possible improvement between G4 and G5
Qing Zhao [EMAIL PROTECTED] writes:
We have got a G5 64-bit processor to
- Original Message -
From: Josh Berkus [EMAIL PROTECTED]
To: Aaron Werman [EMAIL PROTECTED]; Qing Zhao [EMAIL PROTECTED];
Tom Lane [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Sent: Tuesday, April 06, 2004 2:52 PM
Subject: Re: [PERFORM] possible improvement between G4 and G5
Aaron,
I'm
Sent: Sunday, April 04, 2004 5:06 PM
Subject: Re: [PERFORM] single index on more than two coulumns a bad thing?
Hi Aaron,
-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] Behalf Of
Aaron Werman
Sent: Friday, April 2, 2004 13:57
another thing that I have all over
Almost any cross-DBMS migration shows a drop in performance. The engine
effectively trains developers and administrators in what works and what
doesn't. The initial migration thus compares a tuned to an untuned version.
/Aaron
- Original Message -
From: Josh Berkus [EMAIL PROTECTED]
To:
another thing that I have all over the place is a hierarchy:
index on grandfather_table(grandfather)
index on father_table(grandfather, father)
index on son_table(grandfather, father, son)
Almost all of my indices are composite. Are you thinking about composite
indices with low cardinality
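In DDL terms, the hierarchy above is just (table and column names as in the
prose):

create index grandfather_idx on grandfather_table (grandfather);
create index father_idx on father_table (grandfather, father);
create index son_idx on son_table (grandfather, father, son);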
Are you talking about
http://www.potentialtech.com/wmoran/postgresql.php#conclusion
- Original Message -
From: Subbiah, Stalin [EMAIL PROTECTED]
To: [EMAIL PROTECTED]; Matt Clark [EMAIL PROTECTED]; Subbiah, Stalin
[EMAIL PROTECTED]; 'Andrew Sullivan' [EMAIL PROTECTED];
[EMAIL PROTECTED]
Bulk updates are generally dogs (not just in pg), so I avoid doing them by
doing faster selects and inserts. You can create a new table using 'create
table as' to produce your target results. This is real fast - avoiding the
row iteration in insert, allowing the select optimizer to run, and no
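A sketch of the pattern (names invented; the win is one sequential pass
instead of a per-row update plus index maintenance):

-- instead of: update orders set status = 'closed'
--             where placed < now() - interval '90 days';
create table orders_new as
select id,
       placed,
       case when placed < now() - interval '90 days'
            then 'closed' else status end as status
from orders;

drop table orders;
alter table orders_new rename to orders;
create index orders_status on orders (status);  -- rebuild indexes after the swap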
The original point was about a very slow update of an entire table with a
plan that looped, and over a dozen conditional indices - vs. a 'create as'
in a CPU-starved environment. I stand by my statement about observing the
orders-of-magnitude difference. In theory I agree that the update should be
Sorry about not chiming in before - I've been too swamped to think. I agree
with most of the points, but a lot of these posts are interesting and seem
to describe systems from an SA perspective, in contrast to my DBA-centric
view.
- Original Message -
From: Marty Scholes [EMAIL PROTECTED]
To: [EMAIL