Simon Riggs wrote:

Sean Shanny wrote:
Hardware: Apple G5 dual 2.0 with 8GB memory attached via dual fibre
channel to a fully loaded 3.5TB XRaid. The XRaid is configured as two
hardware-based RAID5 disk sets, software-striped to form a RAID50 set.
The DB, WALs, etc. are all on that file set, running on the OSX
journaled file system. Running postgres 7.4.1 on OSX Server 10.3.2.
Postgres is compiled locally with '--enable-recode'
'--enable-multibyte=UNICODE'
'CFLAGS=-mcpu=970 -mtune=970 -mpowerpc64 -O3'

Have you tried altering the blocksize to a higher value? Say 32K?

That is on our to-do list. We had made that change while running on BSD 5.1 on a Dell 2650 with 4GB and 5 10K SCSI drives in RAID 0, and did not see a huge improvement.

max_connections = 100

Why have you set this to 100 when you have typically 1-3 users?

Have already addressed that by lowering this to 50. Will drop it lower as time goes on.

sort_mem = 256000 # min 64, size in KB

If you have only 1-3 users, then that value seems reasonable.

The query is

SELECT t2.md5, t2.url FROM referral_temp t2 LEFT OUTER JOIN
d_referral t1 ON t2.md5 = t1.referral_md5;

\d d_referral
 id                 | integer | not null
 referral_md5       | text    | not null
 referral_raw_url   | text    | not null
 referral_host      | text    |
 referral_path      | text    |
 referral_query     | text    |
 job_control_number | integer | not null

\d referral_temp
 md5 | text |
 url | text |
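The effect of the left outer join above can be sketched in Python as an in-memory analogue (a hypothetical illustration with made-up keys, not the actual query plan):

```python
# Hypothetical stand-ins for the two tables:
# d_referral keyed by referral_md5, referral_temp as (md5, url) rows.
d_referral = {"abc123": 1}  # referral_md5 -> id
referral_temp = [
    ("abc123", "http://example.com/a"),
    ("ffee99", "http://example.com/b"),
]

# LEFT OUTER JOIN: every referral_temp row appears exactly once;
# rows with no match in d_referral carry None (SQL NULL).
joined = [(md5, url, d_referral.get(md5)) for md5, url in referral_temp]

# Rows whose id is None are referrals not yet present in d_referral.
new_referrals = [(md5, url) for md5, url, rid in joined if rid is None]
```

This is the usual pattern for finding rows in the staging table that still need to be inserted into the dimension table.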

Have you looked at using reversed indexes, as per recent postings in
[performance]? These seemed to help considerably with lookup speed when
using a large URL database, which seems to be your situation here.

We create an MD5 of the URL and store it as referral_md5; this is our key for lookups. We ran into problems using the URL itself as the index: the postgres indexing code was complaining about the URL being too long. Hence the MD5, which, though it takes longer to compute during the ETL phase, is much quicker to match on.
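A quick sketch of why the digest makes a friendlier index key (the actual ETL code is not shown in the thread; this just illustrates the fixed width of an MD5 hex digest):

```python
import hashlib

def url_key(url: str) -> str:
    # Fixed-width lookup key: 32 hex characters regardless of URL length.
    return hashlib.md5(url.encode("utf-8")).hexdigest()

short = url_key("http://example.com/")
# Even a pathologically long URL yields the same 32-character key,
# sidestepping the index-entry size limit the URL column ran into.
long_ = url_key("http://example.com/" + "x" * 5000)
```

Both keys are exactly 32 characters, so every index entry is the same small size.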


Jeff Boes writes:
We have a large (several million row) table with a field containing
URLs. Now, funny thing about URLs: they mostly start with a common
substring ("http://www."). But not all the rows start with this, so we
can't just lop off the first N characters. However, we noticed some time
ago that an index on this field wasn't as effective as an index on the
REVERSE of the field. So ...

CREATE OR REPLACE FUNCTION fn_urlrev(text) returns text as '
return reverse(lc($_[0]))
' language 'plperl' with (iscachable,isstrict);
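The transform fn_urlrev applies (lowercase, then reverse) can be mimicked outside the database; a Python sketch of the same idea:

```python
def urlrev(url: str) -> str:
    # Same transform as fn_urlrev: lowercase, then reverse the string.
    return url.lower()[::-1]

urls = ["http://www.example.com/a", "http://www.example.net/b"]
# Forward form: a long shared "http://www." prefix, so the leading
# bytes of every index key are identical and poorly selective.
# Reversed form: the distinctive tail (path, then host) leads the key.
reversed_keys = sorted(urlrev(u) for u in urls)
```

Because the distinguishing characters now come first, B-tree comparisons on the reversed value diverge after a byte or two instead of scanning past the common prefix.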

and then

CREATE UNIQUE INDEX ix_links_3 ON links (fn_urlrev(url));
-- (the indexed expression was truncated in the original message; "url" is an assumed column name)

You have 2 CPUs: have you tried splitting your input data file into two
tables, then executing the same query simultaneously, to split the
processing? If you get the correct plan, you should use roughly the same
I/O but use all of the available CPU power.
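The split-the-input idea can be sketched generically (a hypothetical Python simulation of two workers on halves of the input; in practice this would be two concurrent sessions each querying its half-table):

```python
from concurrent.futures import ThreadPoolExecutor

def process_half(rows):
    # Stand-in for running the join query against one half-table;
    # counting rows keeps the sketch self-contained and runnable.
    return len(rows)

rows = list(range(10))
halves = [rows[: len(rows) // 2], rows[len(rows) // 2:]]

# Two workers, one per CPU: roughly the same total I/O,
# but both processors doing useful work at once.
with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(process_half, halves))

total = sum(results)
```

Merging the two result sets at the end gives the same output as the single-table query, assuming the split preserves all rows.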

Have not considered that.

I'm sure we'd all be interested in your further results!

I will post things as I discover them.


Best Regards, Simon Riggs
