Ravi, I thought I optimized those, but I may have missed something. The best way to see what is going on is to use
normalize_address to see how its reading the values. Anyrate -- please put in a bug ticket for this with some examples you are trying and how long its taking. http://trac.osgeo.org/postgis/newticket Make sure to set the component to "tiger geocoder" in the ticket so it gets assigned to me. Thanks, Regina http://www.postgis.us > -----Original Message----- > From: postgis-users-boun...@postgis.refractions.net > [mailto:postgis-users-boun...@postgis.refractions.net] On > Behalf Of Ravi ada > Sent: Thursday, December 15, 2011 12:35 PM > To: 'PostGIS Users Discussion' > Subject: Re: [postgis-users] Tigerdata for AZ, AS and VI > > Thank you. I am able to speed up a bit by launching the query > for each state and doing 6 states at a time. I am able to see > all processors are being used with a 80-85% memory > utilization. However I noticed that geocode function is > taking forever to return for some addresses that are like > '100 29TH EAST ST.' or 'I-35 HIGHWAY'. Some states have the > convention of using numbers for the street names, it takes > forever to geocode these addresses. Numbered streets and > Highway service roads are the one taking the most time. I > wonder why? Any ideas what we need to speed these up? > > Thanks, > Ravi Ada > > -----Original Message----- > From: postgis-users-boun...@postgis.refractions.net > [mailto:postgis-users-boun...@postgis.refractions.net] On > Behalf Of Stephen Woodbridge > Sent: Wednesday, December 14, 2011 1:30 PM > To: postgis-users@postgis.refractions.net > Subject: Re: [postgis-users] Tigerdata for AZ, AS and VI > > On 12/14/2011 10:12 AM, Ravi ada wrote: > > Thanks Steve. > > I did not get when you said '2. normalize the names as you load the > > data', is this the step we need to do manually or load scripts > > automatically do this step? > > the load scripts do this automatically. > > > I am normalizing my addresses before querying the reference data > > (tiger data). Please clarify. > > You should use geocode() not geocode_address(), because > geocode() will normalize the address in the same way the the > reference addresses are normalized. If you split the address > into fields and call > geocode_address() this is NOT the same as normalizing the > address. If you want to be able to get good matches, you have > to use the same normalize function for both the reference and > the input addresses. > > These functions are not about performance they are about > using the tool the correct way. > > -Steve > > > Thanks > > Ravi Ada > > > > -----Original Message----- > > From: postgis-users-boun...@postgis.refractions.net > > [mailto:postgis-users-boun...@postgis.refractions.net] On Behalf Of > > Stephen Woodbridge > > Sent: Wednesday, December 14, 2011 8:14 AM > > To: postgis-users@postgis.refractions.net > > Subject: Re: [postgis-users] Tigerdata for AZ, AS and VI > > > > Ravi, > > > > The process for geocoding follows this: > > > > Load the data: > > 1. get a reference set of streets (ie: the Tiger data) 2. normalize > > the names as you load the data 3. build the indexes you > need for the > > queries > > > > Query for an address: > > 1. normalize the address on input > > 2. query the normalized reference > > > > Ok, so you know most of this because you have already done > it, but the > > important part here is the you normalize BOTH the reference > data and > > the input to a query. This resolves things like: > > > > main street != main st > > > > because the normalize parses the addresses and converts them into a > > normalized standard form so that you can match. Yes it > takes time to > > normalize the request, but if you don't normalize it, then > there is a > > good change that you will not match an appropriate street in the > > reference > set. > > > > -Steve > > > > On 12/14/2011 9:06 AM, Ravi Ada wrote: > >> Regina, > >> > >> Thanks so much for the reply. I ran the > >> missing_indexes_generate_script(), > >> actually it did not return anything, I am assuming all the indexes > >> are in place. That may be because I ran install_missing_indexes() > earlier. > >> I changed the debug flag in geocode_address and it produced a very > >> long query that it runs to geocode the address. I tried to cut and > >> paste the query to run the plan, I am getting errors, I > will figure > >> that > > out. > >> > >> My question is, do we use gecode or geocode_address for faster > >> querying? I noticed that geocode_address takes the > normalized address > >> where as geocode takes address as string parameter. By adding > >> additional normalize_address function when doing the > geocode_address > >> akes > > it run any faster? > >> > >> > >> Thanks > >> Ravi Ada > >> > >> > >> > >> > >> On Wed, 14 Dec 2011 04:18:27 -0500, Paragon Corporation wrote > >>>> I just don't understand why the geocode function takes > so long to > >>>> return the coordinates. I am sure some of you on this list might > >>>> have done the batch geocoding millions of addresses. I may be > >>>> missing just a simple configuration which might make a > whole lot of > >>>> difference in the speed. I don't know what it is. I am following > >>>> the examples exactly from this link > >>>> > (http://postgis.refractions.net/documentation/manual-svn/Geocode.ht > >>>> m > >>>> l) > >>>> > >>>> If someone is familiar with the problem willing to help me using > >>>> GoTo Meeting connection to my machine, I can arrange that too. I > >>>> just have to move along with my project and meet the > deadlines. I > >>>> am already delayed, everybody in my team asking me for > this everyday. > >>>> > >>>> > >>>> Thanks, > >>>> Ravi Ada > >>>> > >>>> > >>> Ravi, > >>> > >>> Sorry been busy with raster stuff so haven't been tuned into this > >>> discussion. > >>> > >>> 1) The indexes the loader generates are not the only ones needed. > >>> Initially I was constantly changing the loader script, > but since we > >>> were changing decisions as we changed code and optimal indexes > >>> needed with aeach change required changing indexes, which indexes > >>> would be best, I created a function that would put them in rather > >>> than bothering with the loader (since a lot of people > would already > >>> have their data loaded) > >>> > >>> Have you tried running that. I suspect you are just > missing indexes > >>> as the timings you are getting are what I used to get earlier on. > >>> > >>> If you haven't run the update script (which runs this > routine anyway) > >>> or run this to get generated script for indexes you > are missing > >>> you should. > >>> > >>> > http://www.postgis.org/documentation/manual-svn/Missing_Indexes_Gene > >>> r > >>> ate_Scr > >>> ipt.html > >>> > >>> 2) There are a couple of other things to note: First > address you do > >>> around an area can take a lot more time because of the > data caching > >>> effects in postgresql. So for the example in the docs > you describe. > >>> > >>> I can do a geocode of 75 State Street,Boston, MA -- and if I > >>> haven't done any geocoding in a while that takes like 1-3 seconds > >>> > >>> Then if I do 80 State Street, Boston, MA -- that subsequent takes > >>> anywhere from 60 ms - 150 ms. > >>> I also don't have all the states loaded since I only > needed it for > >>> about 6 states. thought that should just increase the > planner time > >>> rather than later times. > >>> > >>> 3) For debugging performance there is a variable in the > >>> geocode_address function called var_debug. Its false by default, > >>> change > > it to true. > >>> That spits out the sql being run and is a better sql to > pass to the > >>> planner to check. > >>> > >>> We were hoping to make these debugging features more publically > >>> exposed e.g via a config table, but haven't had the time > to do that. > >>> > >>> Hope this all helps, > >>> Regina > >>> http://www.postgis.us > >>> > >>> _______________________________________________ > >>> postgis-users mailing list > >>> postgis-users@postgis.refractions.net > >>> http://postgis.refractions.net/mailman/listinfo/postgis-users > >> > >> > >> Thanks, > >> Ravi Ada > >> 918-630-7381 > >> > >> _______________________________________________ > >> postgis-users mailing list > >> postgis-users@postgis.refractions.net > >> http://postgis.refractions.net/mailman/listinfo/postgis-users > > > > _______________________________________________ > > postgis-users mailing list > > postgis-users@postgis.refractions.net > > http://postgis.refractions.net/mailman/listinfo/postgis-users > > > > _______________________________________________ > > postgis-users mailing list > > postgis-users@postgis.refractions.net > > http://postgis.refractions.net/mailman/listinfo/postgis-users > > _______________________________________________ > postgis-users mailing list > postgis-users@postgis.refractions.net > http://postgis.refractions.net/mailman/listinfo/postgis-users > > _______________________________________________ > postgis-users mailing list > postgis-users@postgis.refractions.net > http://postgis.refractions.net/mailman/listinfo/postgis-users > _______________________________________________ postgis-users mailing list postgis-users@postgis.refractions.net http://postgis.refractions.net/mailman/listinfo/postgis-users