On Sun, May 06, 2012 at 11:10:12PM -0400, Brian DeRocher wrote:
> I disagree that make_standard_string() needs to work for all major
> languages. I mean it does, but you should know the language ahead
> of time. Since the problem of standardizing words is language
> related, can you first use the HTTP header Accept-Language to pick
> the language (or use geoip), and then standardize according to rules
> of that language?

The Accept-Language header is not of much help here because we need to
know the language of the search term, not the preferred language of the
user. Just because my browser is set to English doesn't mean that I
never search for German or French addresses and use abbreviations when
doing so.

We could make pretty good guesses about the language of the OSM names
because many of them are already tagged with the language, and for the
others the country the object is in provides a hint. There are also
algorithms to guess the language of the search term a user entered.
Using all this information would certainly help to provide better
matching. Nonetheless, you have to be careful to remain fuzzy, to
account for the possibility that you guessed wrongly.

Sarah

_______________________________________________
Geocoding mailing list
[email protected]
http://lists.openstreetmap.org/listinfo/geocoding
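
A minimal sketch of the "guess the language but stay fuzzy" idea from the
message above. It is not taken from Nominatim or from
make_standard_string(); the abbreviation tables, the function names and
the `floor` parameter are all hypothetical and only illustrate the
approach. Weighted hints (name tags, country, a language guesser) are
merged into scores, every language keeps a small minimum weight, and the
search term is standardized under each language so that a wrong guess
lowers the rank of the right form instead of discarding it.

```python
from collections import defaultdict

# Hypothetical per-language abbreviation tables (tiny, illustrative only).
ABBREVIATIONS = {
    "de": {"str": "strasse", "pl": "platz"},
    "fr": {"av": "avenue", "bd": "boulevard"},
    "en": {"st": "street", "ave": "avenue"},
}


def language_scores(hints, floor=0.05):
    """Merge weighted (language, weight) hints into normalized scores.

    Every known language keeps at least `floor` weight before
    normalization, so a wrong guess never rules out a language entirely.
    Hints for languages without a table are ignored.
    """
    scores = defaultdict(float)
    for lang, weight in hints:
        if lang in ABBREVIATIONS:
            scores[lang] += weight
    for lang in ABBREVIATIONS:
        scores[lang] = max(scores[lang], floor)
    total = sum(scores.values())
    return {lang: score / total for lang, score in scores.items()}


def candidate_standardizations(term, hints):
    """Return (standardized term, score) pairs, most likely first."""
    tokens = term.lower().replace(".", "").split()
    candidates = defaultdict(float)
    for lang, score in language_scores(hints).items():
        table = ABBREVIATIONS[lang]
        expanded = " ".join(table.get(tok, tok) for tok in tokens)
        # Languages that produce the same form pool their scores.
        candidates[expanded] += score
    return sorted(candidates.items(), key=lambda item: -item[1])
```

Called as `candidate_standardizations("Haupt Str.", [("en", 1.0)])`, the
sketch ranks the unexpanded form first but still returns the German
expansion as a low-scored candidate, which is the fallback the message
argues for when the language guess is wrong.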

