You have 16 GB of memory, so set shared_buffers = 12GB in postgresql.conf.
Postgres will then probably fail to start, because your kernel may not have
SHMMAX set high enough, so you will need to raise it:
You can see the current settings with:
ipcs -l
You can change it by editing:
/etc/sysctl.conf
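For example, to allow a 12GB shared memory segment you might add something
like this (the values here are illustrative; kernel.shmmax is in bytes and
kernel.shmall is in pages, so divide shmmax by your page size, usually 4096):
kernel.shmmax = 13958643712
kernel.shmall = 3407872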
then reload the new parameters with:
sysctl -p
and check it again with:
ipcs -l
and finally restart postgres
/etc/init.d/postgresql-<version> restart
and then try it again. After a restart you might need to run a few queries to
load up the page cache before you start seeing any performance increase.
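If you want to warm it up by hand, a couple of throwaway lookups will do; for
example, using the sample address from the geocoder docs:
SELECT g.rating, pprint_addy(g.addy) FROM geocode('75 State Street, Boston MA 02109') As g;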
Also, you should sort your addresses by zipcode before you geocode them, so
that the data you need is already in the page cache from the last record you
geocoded.
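A minimal sketch of that, using the geo_biz_addr table from further down this
thread (the_geom is an assumed column name, and the Geocode docs page linked
below has the full pattern, including handling of addresses that fail to
geocode):

UPDATE qliq.geo_biz_addr AS a
SET rating   = (g.geo).rating,
    the_geom = (g.geo).geomout  -- "the_geom" is an assumed geometry column
FROM (SELECT id,
             geocode(address1||','||city||','||state||','||zip, 1) AS geo
      FROM qliq.geo_biz_addr
      WHERE rating IS NULL
      ORDER BY zip              -- zip order keeps the tiger pages hot
      LIMIT 100) AS g
WHERE a.id = g.id;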
-Steve
On 12/13/2011 11:02 PM, Ravi Ada wrote:
Thanks Steve.
Here are the values from the postgresql.conf
max_connections = 140
shared_buffers = 2GB
temp_buffers = 8MB
work_mem = 4GB
maintenance_work_mem = 1GB
wal_buffers = 8MB
checkpoint_segments = 128
effective_cache_size = 6GB
cpu_tuple_cost = 0.0030
cpu_index_tuple_cost = 0.0010
cpu_operator_cost = 0.0005
fsync = off
checkpoint_timeout = 1h
I just don't understand why the geocode function takes so long to return the
coordinates. I am sure some of you on this list have done batch geocoding of
millions of addresses. I may be missing just a simple configuration setting
that would make a whole lot of difference in speed, but I don't know what it
is. I am following the examples exactly from this link:
http://postgis.refractions.net/documentation/manual-svn/Geocode.html
If someone familiar with the problem is willing to help me over a GoTo Meeting
connection to my machine, I can arrange that too. I just have to move along
with my project and meet my deadlines; I am already delayed, and everybody on
my team is asking me about this every day.
Thanks,
Ravi Ada
On Tue, 13 Dec 2011 21:25:54 -0500, Stephen Woodbridge wrote:
On 12/13/2011 8:33 PM, Greg Williamson wrote:
Have you run "analyze" recently on this table? (Since the last index
build or the last major change in data.)
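For the table in this thread that would be:
ANALYZE qliq.geo_biz_addr;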
The work_mem setting is probably not the issue here -- it controls how much
RAM each sort or hash operation can use before it starts spilling to disk
(sort_mem was just its pre-8.0 name, and maintenance_work_mem is what applies
when building indexes and the like); you might try tinkering with it unless it
is already large, but remember that each sort in a query can use this much
RAM, so too aggressive a setting is bad.
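(To make that concrete with the settings quoted above: work_mem = 4GB with
max_connections = 140 means that, in the worst case, one sort in every backend
could ask for 140 x 4GB = 560GB on a 16GB machine.)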
What is:
shared_buffers = ...
set to? This is the one that needs to be set. You should google:
"postgresql tuning" and read how to set the postgresql.conf file.
You may also need to change your kernel parameters as you increase
the shared_buffers.
Setting work_mem to a large value will not help much. You need to read what
each of the parameters does and then set them appropriately.
-Steve
HTH,
Greg W.
________________________________
From: Ravi Ada <ravi...@dobeyond.com>
To: Andy Colson <a...@squeakycode.net>; PostGIS Users Discussion <postgis-users@postgis.refractions.net>
Sent: Tuesday, December 13, 2011 3:28 PM
Subject: Re: [postgis-users] Tigerdata for AZ, AS and VI
Andy, here is the explain analyze output:

Limit  (cost=0.00..14.10 rows=100 width=73) (actual time=4824.392..98929.180 rows=100 loops=1)
  ->  Index Scan using geo_biz_addr_zip_idx on geo_biz_addr ag  (cost=0.00..219048.99 rows=1553779 width=73) (actual time=4824.381..98925.304 rows=100 loops=1)
        Filter: (rating IS NULL)
Total runtime: 98930.371 ms

Here is the output for the query without ORDER BY zip:

Limit  (cost=0.00..7.06 rows=100 width=73) (actual time=63022.583..279475.286 rows=100 loops=1)
  ->  Seq Scan on geo_biz_addr ag  (cost=0.00..109741.62 rows=1553779 width=73) (actual time=63022.571..279474.529 rows=100 loops=1)
        Filter: (rating IS NULL)
Total runtime: 279475.678 ms
Surprisingly, it took longer without the ORDER BY zip; that may be because the
addresses are scattered across all the states and cities. But in any case, 100
to 300 seconds to geocode 100 addresses is too long. I have work_mem set to
4GB in postgresql.conf.
Thanks,
Ravi Ada

On Tue, 13 Dec 2011 14:31:34 -0600, Andy Colson wrote:
And instead of running the update, try running:

EXPLAIN ANALYZE
SELECT ag.id,
       geocode(ag.address1||','||ag.city||','||ag.state||','||ag.zip) As geo
FROM qliq.geo_biz_addr As ag
WHERE ag.rating IS NULL
ORDER BY zip
LIMIT 100;
Also, the order by zip, combined with the limit, means it has to
pull every record, then sort by zip, then pull the first 100.
If you can drop one or the other, it would run faster.
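The plan quoted earlier shows an Index Scan using geo_biz_addr_zip_idx, so the
sort is already being satisfied by an index; without such an index, something
like
CREATE INDEX geo_biz_addr_zip_idx ON qliq.geo_biz_addr (zip);
would let the planner walk zip order instead of sorting.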
-Andy
Thanks,
Ravi Ada
918-630-7381
_______________________________________________
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users