Hi,

On Sun, May 06, 2012 at 10:25:09AM -0400, Brian DeRocher wrote:
> Hey there,
>
> I'm trying to understand how the nominatim works and i'm perplexed
> by this issue, where the state abbreviations get mangled.
>
> osm=# select make_standard_name( 'VA' );
>  make_standard_name
> --------------------
>  v
>
> This happens for 5 US states.
>
> osm=# select st, make_standard_name(st) from brian.states where st
> <> make_standard_name(st);
>  st | make_standard_name
> ----+--------------------
>  ga | g
>  il |
>  in | 1
>  la |
>  va | v
> (5 rows)
>
> This appears to be coming from the tokenstringreplacements.inc.
Yes, that is where it comes from. The idea behind this is to catch common abbreviations, so that e.g. 'swan road' will be found even when you type 'swan rd'. The problem is that this needs to work for all major languages, and that causes quite a few clashes. Above, 'il' and 'la' are common articles in French/Italian/Spanish and are therefore deleted (module/nominatim.c, line 258 ff.). The others are in tokenstringreplacements.inc and are legal words in some language, though I don't know which at the moment.

I was wondering if it would help not to do these string replacements when they constitute a full name (as opposed to words that are part of a full name). It would need some careful consideration of the current uses of make_standard_name, though, because it does not yet know the difference between words and names.

Sarah

_______________________________________________
Geocoding mailing list
[email protected]
http://lists.openstreetmap.org/listinfo/geocoding
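
To make the proposal concrete, here is a minimal Python sketch of skipping replacements when a token is the whole name. It is not Nominatim's code: the function `standardize`, the flag `protect_full_names`, and both tables are hypothetical, and the table entries are only chosen to mirror the results shown in the query output above. The real entries live in tokenstringreplacements.inc and module/nominatim.c. The sketch also assumes that "full name" can be approximated as "the name consists of a single word".

```python
# Illustrative tables only (assumptions, not the real Nominatim data).
REPLACEMENTS = {"road": "rd", "va": "v", "ga": "g", "in": "1"}
ARTICLES = {"il", "la"}


def standardize(name, protect_full_names=False):
    """Lower-case a name and apply per-word replacements.

    With protect_full_names=True, a name made of exactly one word is
    returned without replacements, so 'VA' stays 'va'.
    """
    words = name.lower().split()
    if protect_full_names and len(words) == 1:
        return words[0]
    kept = []
    for word in words:
        if word in ARTICLES:
            continue  # articles are dropped entirely
        kept.append(REPLACEMENTS.get(word, word))
    return " ".join(kept)
```

With the flag off, `standardize("VA")` reproduces the mangled 'v'. With it on, 'VA' is kept as 'va', while a multi-word name such as 'Swan Road' is still shortened to 'swan rd'. A real change would have to decide at each call site of make_standard_name whether the input is a name or a single word, which is the part that needs the careful consideration mentioned above.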

