I have this QUERY (below) that PostgreSQL 7.3.X servers run MUCH slower than the 7.1.3 server does. It makes sense that both servers have to do a sequential scan over the ZIPCODE column. There are over 7,500 rows in the LOCATIONS table.
Does anyone know what changed in the planner or optimizer? Can I change the postgresql.conf file to improve 7.3.3 performance?
Situation --------- Here is the situation...
PG 7.1.3 returns QUERY in about 4 seconds. The EXPLAIN plan says it uses the index on country.
PG 7.3.3 simply does not return QUERY. I've waited over 3 minutes. With the extra condition WHERE STATE = 'NJ' it takes almost 5 seconds. Other states are much worse.
QUERY ----- SELECT ZIPCODE FROM LOCATIONS WHERE COUNTRY = 'USA' AND ZIP_DIST_MI('07306', ZIPCODE) <= 25;
The function is written in C, using SPI. Given two US ZIP codes, it returns the distance in miles. For example, it is 78 miles from Jersey City to Philadelphia:
db=> select ZIP_DIST_MI('07306', '19130');
zip_dist_mi -----------------
78.801595557406
(1 row)
ZIP_DIST_MI() uses a geo_zipdata table to get the latitude and longitude. Using those, it can calculate the "great circle distance" between ZIPs with C double arithmetic. It finds the ZIPs locations with a prepared (and saved) SPI query, which uses an index:
"select latitude, longitude from geo_zipdata where zip = $1"
FUNCTION -------- CREATE FUNCTION ZIP_DIST_MI(TEXT, TEXT) RETURNS DOUBLE PRECISION...
ZIP DATA TABLE -------------- CREATE TABLE GEO_ZIPDATA ( ZIP VARCHAR(5) NOT NULL, STATE VARCHAR(2) NOT NULL, CITY VARCHAR(64) NOT NULL, COUNTY VARCHAR(64) NOT NULL, LATITUDE FLOAT NOT NULL, LONGITUDE FLOAT NOT NULL, FIPS NUMERIC(10) NOT NULL ); CREATE INDEX GEO_ZIPDATA_ZIP_IDX ON GEO_ZIPDATA (ZIP);
---------------------------(end of broadcast)--------------------------- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match