I also think it is possible to break the problem down as Frank describes
to get a widely usable geocoder. But I also think it is not so trivial,
as the way addresses are described and used varies considerably, even
within Europe. There are quite a few possible strategies, from a very
rigid scheme-based search to quasi full-text search.

We worked on a geocoder for a client a couple of years ago, implementing
some of these ideas: we tried to use a more or less universal way of
geocoding, together with a way of describing how the data and addresses
are organized. I will try to take a look back at it to see if there are
ideas or code that could be interesting.
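
To make the layering discussed in this thread a bit more concrete, here is
a minimal sketch in Python of a common API that dispatches to per-locale
parsers and matchers. It is only an illustration, not the code from the
client project mentioned above. Every name in it (ParsedAddress,
register_locale, geocode, the "xx" locale, the toy dataset and its
coordinates) is made up for the example, and the "street number, city"
format is just an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

# All names here are hypothetical; none come from an existing library.


@dataclass
class ParsedAddress:
    """Structured form of an address, as produced by a locale parser."""
    street: str
    number: str
    city: str


Coordinates = Tuple[float, float]
Parser = Callable[[str], Optional[ParsedAddress]]
Matcher = Callable[[ParsedAddress], Optional[Coordinates]]

# Registry of locale-specific plug-ins: locale code -> (parser, matcher).
_REGISTRY: Dict[str, Tuple[Parser, Matcher]] = {}


def register_locale(locale: str, parser: Parser, matcher: Matcher) -> None:
    """Plug in the parsing and matching logic for one locale."""
    _REGISTRY[locale] = (parser, matcher)


def geocode(text: str, locale: str) -> Optional[Coordinates]:
    """Common API layer: delegate to the plug-ins registered for the locale."""
    if locale not in _REGISTRY:
        raise KeyError(f"no geocoder plug-in registered for locale {locale!r}")
    parser, matcher = _REGISTRY[locale]
    parsed = parser(text)
    if parsed is None:
        return None  # the address text did not fit this locale's scheme
    return matcher(parsed)


def parse_simple(text: str) -> Optional[ParsedAddress]:
    """Toy parser for the assumed format 'Street Name 12, City'."""
    try:
        street_part, city = [p.strip() for p in text.split(",", 1)]
        street, number = street_part.rsplit(" ", 1)
    except ValueError:
        return None
    return ParsedAddress(street=street, number=number, city=city)


# Toy local dataset with made-up coordinates.
_TOY_DATA: Dict[Tuple[str, str, str], Coordinates] = {
    ("example street", "1", "exampletown"): (6.0, 46.0),
}


def match_simple(addr: ParsedAddress) -> Optional[Coordinates]:
    """Toy matcher: exact, case-insensitive lookup in the local dataset."""
    key = (addr.street.lower(), addr.number, addr.city.lower())
    return _TOY_DATA.get(key)


register_locale("xx", parse_simple, match_simple)
```

With this in place, geocode("Example Street 1, Exampletown", "xx") looks the
address up in the toy dataset, while a locale with no registered plug-in
raises an error. A real plug-in would replace the exact lookup with fuzzier
matching, which is where the rigid versus full-text strategies would differ.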

Claude

Frank Warmerdam wrote:
> (Orkney)Toru Mori wrote:
>> Apart from technical design, geocoder is "useless without data" :)
>>
>> Address systems are varied and messy, at least here in Japan and in other 
>> Asia region. For example, Japan has more than 3 systems. Additionally it 
>> is very tough to get good "enough" data. There is no separation in address 
>> text. 
>>
>> We developed geocoder.ja already for our region specifically, but 
>> unfortunately it won't work in even other countries in Asia. 
>> http://www.postlbs.org/ja/geocoder
>>
>>
>> A single universal, global geocoder may sound perfect. However, there is 
>> very limited space in terms of standardization as follows.
>>
>> -----------------------------------------
>>      API (can be standardized)
>> -----------------------------------------
>>  thin parser (might be standardized)
>> -----------------------------------------
>>  geocoding logic (cannot be standardized)
>> -----------------------------------------
>>      local dataset (varied and messy)
>> -----------------------------------------
>>
>> So what OSGeo should lead would be just APIs. If OSGeo wants to 
>> standardize lower levels, then the project won't finish probably.
> 
> Toru Mori,
> 
> I do think any final solution needs to support plugging in
> distinct address parsers and geocoder matching logic for
> different locales and underlying datasets.  My understanding is
> that some of the commercial data providers have fairly
> standard schemes for how to break down address data into a
> standard tabular layout - at least for quite a bit of the world.
> We might want to learn something from the breakdown approach
> they used.
> 
> Generally I agree that we won't be able to write one
> universal geocoder but it seems to me a good architecture
> with the ability for people to contribute local parsers
> and geocoding matchers might be an ideal sort of open source
> project.
> 
> Of course, I'm speaking hypothetically (without experience)
> and you are speaking from experience!
> 
> Best regards,


-- 
Claude Philipona
Camptocamp SA
PSE A
CH-1015 Lausanne
Switzerland

+41 21 619 10 11 (direct)
+41 21 619 10 10 (centrale)
+41 21 619 10 00 (fax)
+41 78 648 32 84 (mobile)
http://www.camptocamp.com http://www.cartoweb.org

_______________________________________________
Discuss mailing list
Discuss@lists.osgeo.org
http://lists.osgeo.org/mailman/listinfo/discuss
