I also think it is possible to break down the problem as Frank describes and arrive at a widely usable geocoder. But I also think it is not so trivial, because the way addresses are described and used varies considerably, even within Europe. There are quite a few possible strategies, from a very rigid scheme search to a quasi full-text search.
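To make that breakdown a bit more concrete, here is a minimal sketch in Python of a shared API on top of pluggable, per-locale parsers and matchers. It is only an illustration under my own assumptions, not code from any existing project: every name in it (ParsedAddress, AddressParser, Matcher, StreetFirstParser, ExactTableMatcher, Geocoder) is hypothetical, the "xx" locale is a placeholder, and the address and coordinates are made up. The toy parser sits at the rigid-scheme end of the spectrum; a full-text matcher could be plugged in behind the same interface.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Protocol, Tuple

Coordinates = Tuple[float, float]  # (latitude, longitude)


@dataclass(frozen=True)
class ParsedAddress:
    """Locale-neutral tabular form of an address (hypothetical field names)."""
    street: str
    house_number: str
    city: str


class AddressParser(Protocol):
    """Thin parser layer: free text in, tabular address out."""
    def parse(self, text: str) -> ParsedAddress: ...


class Matcher(Protocol):
    """Geocoding logic layer: tabular address in, coordinates (or None) out."""
    def match(self, address: ParsedAddress) -> Optional[Coordinates]: ...


class StreetFirstParser:
    """Toy rigid-scheme parser for 'Street Number, City' strings.

    Raises ValueError if the text contains no comma.
    """
    def parse(self, text: str) -> ParsedAddress:
        street_part, city = [part.strip() for part in text.split(",", 1)]
        street, _, number = street_part.rpartition(" ")
        return ParsedAddress(street=street, house_number=number, city=city)


class ExactTableMatcher:
    """Toy matcher: exact, case-insensitive lookup in an in-memory table."""
    def __init__(self, table: Dict[Tuple[str, str, str], Coordinates]) -> None:
        self._table = table

    def match(self, address: ParsedAddress) -> Optional[Coordinates]:
        key = (address.street.lower(), address.house_number, address.city.lower())
        return self._table.get(key)


class Geocoder:
    """API layer: the only part that is the same for every locale."""
    def __init__(self) -> None:
        self._plugins: Dict[str, Tuple[AddressParser, Matcher]] = {}

    def register(self, locale: str, parser: AddressParser, matcher: Matcher) -> None:
        self._plugins[locale] = (parser, matcher)

    def geocode(self, text: str, locale: str) -> Optional[Coordinates]:
        if locale not in self._plugins:
            raise KeyError(f"no parser/matcher registered for locale {locale!r}")
        parser, matcher = self._plugins[locale]
        return matcher.match(parser.parse(text))


# Made-up dataset: one fictional address with placeholder coordinates.
dataset = {("example street", "12", "sampletown"): (46.0, 6.0)}
geocoder = Geocoder()
geocoder.register("xx", StreetFirstParser(), ExactTableMatcher(dataset))
result = geocoder.geocode("Example Street 12, Sampletown", "xx")
```

Only the Geocoder class is shared here; a locale whose addresses have no separators would register its own parser and matcher without touching the API.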
We worked on a geocoder for a client a couple of years ago and implemented some of these ideas: we tried to use a more or less universal way of geocoding, together with a way of describing how the data and addresses are organized. I will try to look back at it to see whether there are ideas or code that could be of interest.

Claude

Frank Warmerdam wrote:
> (Orkney)Toru Mori wrote:
>> Apart from technical design, a geocoder is "useless without data" :)
>>
>> Address systems are varied and messy, at least here in Japan and in other
>> Asian regions. For example, Japan has more than 3 systems. Additionally, it
>> is very tough to get "good enough" data. There is no separation in address
>> text.
>>
>> We already developed geocoder.ja specifically for our region, but
>> unfortunately it won't work even in other countries in Asia.
>> http://www.postlbs.org/ja/geocoder
>>
>>
>> A single universal, global geocoder may sound perfect. However, there is
>> very limited room for standardization, as follows.
>>
>> -----------------------------------------
>>   API (can be standardized)
>> -----------------------------------------
>>   thin parser (might be standardized)
>> -----------------------------------------
>>   geocoding logic (cannot be standardized)
>> -----------------------------------------
>>   local dataset (varied and messy)
>> -----------------------------------------
>>
>> So what OSGeo should lead would be just the APIs. If OSGeo wants to
>> standardize the lower levels, then the project probably won't finish.
>
> Toru Mori,
>
> I do think any final solution needs to support plugging in
> distinct address parsers and geocoder matching logic for
> different locales and underlying datasets. My understanding is
> that some of the commercial data providers have fairly
> standard schemes for how to break down address data into a
> standard tabular layout - at least for quite a bit of the world.
> We might want to learn something from the breakdown approach
> they used.
>
> Generally I agree that we won't be able to write one
> universal geocoder, but it seems to me that a good architecture
> with the ability for people to contribute local parsers
> and geocoding matchers might be an ideal sort of open source
> project.
>
> Of course, I'm speaking hypothetically (without experience)
> and you are speaking from experience!
>
> Best regards,

--
Claude Philipona
Camptocamp SA
PSE A
CH-1015 Lausanne
Switzerland
+41 21 619 10 11 (direct)
+41 21 619 10 10 (centrale)
+41 21 619 10 00 (fax)
+41 78 648 32 84 (mobile)
http://www.camptocamp.com
http://www.cartoweb.org
_______________________________________________
Discuss mailing list
Discuss@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/discuss