Right, This is happening in a stored procedure that is doing some
additional work, but my point was, that the in the 200-400ms case you
are measuring the the disk hit performance as a page is getting located
on disk and paged into cache and at the faster 20-40ms you are getting
just the index speed, the page look up in memory and record access and
whatever additional execution I'm doing in the stored procedure.
The 20-40ms performance is on average when processing a few 100,000
requests randomized over the coverage area so the actual faster speeds
are probably closer to 5-8ms and average out to the 20-40ms when you
cost in the out of cache page hits.
Sorry for provide less than complete information originally.
-Steve
On 5/29/2011 1:14 PM, Johnathan Leppert wrote:
30 million records isn't really a lot with an index and shouldn't be taking
quite that long?
Johnathan
On May 29, 2011, at 12:28 AM, Stephen Woodbridge<[email protected]>
wrote:
I have had similar performance experiences working with tiger data in other
applications, ie not this geocoder, where queries cost about 200-400ms
initially querying a 30 million record streets table and then go to 20-40ms
afterwards. I have always attributed this to page caching. My queries are
typically spatial in nature and I cluster my data based on the spatial index.
But for the geocoder, I would expect similar performance if you cluster your
data by zipcode and then sort your input data by zipcode, you should get very
good performance depending on your queries and indexes.
-Steve
On 5/28/2011 5:24 PM, Paragon Corporation wrote:
Mikal,
Can you send me the change you made and the indexes you added. When
adding some more data, I realized I had hardcoded an index for our local
state (MA) and I know without that index that that particular query does
run pretty slow. So just wondering if its along the same lines.
I've also fixed I think all the issues with running the loader on
Unix/Linux -- well at least I was able to get it to run on my CentOS.
Thanks to all who contributed input to that. I took bits and pieces from
many people's comments but couldn't apply a full diff from anyones since
I had already changed the code too much to safely apply any of those
patches.
How many states do you have loaded BTW? I just have CA loaded on my
CentOS -- which is an 8GB/8 core cloud server. I'm getting around 38ms -
450ms per test, but I have yet to load the other states.
It also seems to cache very well so that if I geocode an address on the
same street (not necessarily same address), the first call might take
450ms and the second 38ms. I suspect this might be because I also marked
a good chunk of the functions STABLE or IMMUTABLE.
Thanks,
Regina
http://www.postgis.us
*From:* [email protected]
[mailto:[email protected]] *On Behalf Of
*Mikal Laster
*Sent:* Friday, May 27, 2011 8:18 AM
*To:* [email protected]
*Subject:* [postgis-users] Fw: re: Geocoder (from extras)
--- On *Fri, 5/27/11, Mikal Laster /<[email protected]>/* wrote:
From: Mikal Laster<[email protected]>
Subject: re:[postgis-users] Geocoder (from extras)
To: [email protected]
Date: Friday, May 27, 2011, 7:22 AM
in response to
http://postgis.refractions.net/pipermail/postgis-users/2011-May/029566.html.
After creating some indexes and rewriting geocode_address. I was
able to get geocode to run in 483-523 ms for "5775 Perimeter Dr
Dublin, Ohio". This used to take 1700-2000 ms for me. I'm removing
the main inner qui
_______________________________________________
postgis-users mailing list
[email protected]
http://postgis.refractions.net/mailman/listinfo/postgis-users
_______________________________________________
postgis-users mailing list
[email protected]
http://postgis.refractions.net/mailman/listinfo/postgis-users
_______________________________________________
postgis-users mailing list
[email protected]
http://postgis.refractions.net/mailman/listinfo/postgis-users
_______________________________________________
postgis-users mailing list
[email protected]
http://postgis.refractions.net/mailman/listinfo/postgis-users