You have 16 GB of memory, so set shared_buffers = 12GB in postgresql.conf.
Postgres will then probably fail to start, because your kernel may not have
SHMMAX set high enough, so you will need to raise it:
You can see the current settings with:
ipcs -l
You can change it by editing:
/etc/sysctl.conf
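For example, to allow a 12GB shared memory segment you might add something
like this (the values here are illustrative; kernel.shmmax is in bytes and
kernel.shmall is in pages, so divide shmmax by your page size, usually 4096):
kernel.shmmax = 13958643712
kernel.shmall = 3407872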
then reload the new parameters with:
sysctl -p
and check it again with:
ipcs -l
and finally restart postgres
/etc/init.d/postgresql-<version> restart
and then try it again. After a restart you might need to run a few queries to
load up the page cache before you start seeing any performance increase.
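If you want to warm it up by hand, a couple of throwaway lookups will do; for
example, using the sample address from the geocoder docs:
SELECT g.rating, pprint_addy(g.addy) FROM geocode('75 State Street, Boston MA 02109') As g;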
Also, you should sort your addresses by zipcode before you geocode them, so
that the data you need is already in the page cache from the last record you
geocoded.
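A minimal sketch of that, using the geo_biz_addr table from further down this
thread (the_geom is an assumed column name, and the Geocode docs page linked
below has the full pattern, including handling of addresses that fail to
geocode):

UPDATE qliq.geo_biz_addr AS a
SET rating   = (g.geo).rating,
    the_geom = (g.geo).geomout  -- "the_geom" is an assumed geometry column
FROM (SELECT id,
             geocode(address1||','||city||','||state||','||zip, 1) AS geo
      FROM qliq.geo_biz_addr
      WHERE rating IS NULL
      ORDER BY zip              -- zip order keeps the tiger pages hot
      LIMIT 100) AS g
WHERE a.id = g.id;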
-Steve
On 12/13/2011 11:02 PM, Ravi Ada wrote:
Thanks Steve.
Here are the values from the postgresql.conf
max_connections = 140
shared_buffers = 2GB
temp_buffers = 8MB
work_mem = 4GB
maintenance_work_mem = 1GB
wal_buffers = 8MB
checkpoint_segments = 128
effective_cache_size = 6GB
cpu_tuple_cost = 0.0030
cpu_index_tuple_cost = 0.0010
cpu_operator_cost = 0.0005
fsync = off
checkpoint_timeout = 1h
I just don't understand why the geocode function takes so long to return the
coordinates. I am sure some of you on this list have done batch geocoding of
millions of addresses. I may be missing just a simple configuration setting
that would make a whole lot of difference in speed, but I don't know what it
is. I am following the examples exactly from this link:
http://postgis.refractions.net/documentation/manual-svn/Geocode.html
If someone familiar with the problem is willing to help me over a GoTo Meeting
connection to my machine, I can arrange that too. I just have to move along
with my project and meet my deadlines; I am already delayed, and everybody on
my team is asking me about this every day.
Thanks,
Ravi Ada
On Tue, 13 Dec 2011 21:25:54 -0500, Stephen Woodbridge wrote:
On 12/13/2011 8:33 PM, Greg Williamson wrote:
Have you run "analyze" recently on this table? (Since the last index
build or the last major change in data.)
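For the table in this thread that would be:
ANALYZE qliq.geo_biz_addr;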
The work_mem setting is probably not the issue here -- it controls how much
RAM each sort or hash operation can use before it starts spilling to disk
(sort_mem was just its pre-8.0 name, and maintenance_work_mem is what applies
when building indexes and the like); you might try tinkering with it unless it
is already large, but remember that each sort in a query can use this much
RAM, so too aggressive a setting is bad.
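(To make that concrete with the settings quoted above: work_mem = 4GB with
max_connections = 140 means that, in the worst case, one sort in every backend
could ask for 140 x 4GB = 560GB on a 16GB machine.)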
What is:
shared_buffers = ...
set to? This is the one that needs to be set. You should google:
"postgresql tuning" and read how to set the postgresql.conf file.
You may also need to change your kernel parameters as you increase
the shared_buffers.
Setting work_mem to a large value will not help much. You need to read what
each of the parameters does and then set them appropriately.
-Steve
HTH,
Greg W.
________________________________
From: Ravi Ada <ravi...@dobeyond.com>
To: Andy Colson <a...@squeakycode.net>; PostGIS Users Discussion <postgis-users@postgis.refractions.net>
Sent: Tuesday, December 13, 2011 3:28 PM
Subject: Re: [postgis-users] Tigerdata for AZ, AS and VI
Andy, here is the explain analyze output:

Limit  (cost=0.00..14.10 rows=100 width=73) (actual time=4824.392..98929.180 rows=100 loops=1)
  ->  Index Scan using geo_biz_addr_zip_idx on geo_biz_addr ag  (cost=0.00..219048.99 rows=1553779 width=73) (actual time=4824.381..98925.304 rows=100 loops=1)
        Filter: (rating IS NULL)
Total runtime: 98930.371 ms

Here is the output for the query without ORDER BY zip:

Limit  (cost=0.00..7.06 rows=100 width=73) (actual time=63022.583..279475.286 rows=100 loops=1)
  ->  Seq Scan on geo_biz_addr ag  (cost=0.00..109741.62 rows=1553779 width=73) (actual time=63022.571..279474.529 rows=100 loops=1)
        Filter: (rating IS NULL)
Total runtime: 279475.678 ms
Surprisingly, it took longer without the ORDER BY zip; that may be because the
addresses are scattered across all the states and cities. But in any case, 100
to 300 seconds to geocode 100 addresses is too long. I have work_mem set to
4GB in postgresql.conf.
Thanks,
Ravi Ada

On Tue, 13 Dec 2011 14:31:34 -0600, Andy Colson wrote:
And instead of running the update, try running:

EXPLAIN ANALYZE
SELECT ag.id,
       geocode(ag.address1||','||ag.city||','||ag.state||','||ag.zip) As geo
FROM qliq.geo_biz_addr As ag
WHERE ag.rating IS NULL
ORDER BY zip
LIMIT 100;
Also, the order by zip, combined with the limit, means it has to
pull every record, then sort by zip, then pull the first 100.
If you can drop one or the other, it would run faster.
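The plan quoted earlier shows an Index Scan using geo_biz_addr_zip_idx, so the
sort is already being satisfied by an index; without such an index, something
like
CREATE INDEX geo_biz_addr_zip_idx ON qliq.geo_biz_addr (zip);
would let the planner walk zip order instead of sorting.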
-Andy
Thanks,
Ravi Ada
918-630-7381
_______________________________________________
postgis-users mailing list
postgis-users@postgis.refractions.net
http://postgis.refractions.net/mailman/listinfo/postgis-users