Hi all,

In French, off the top of my head, I can think of:

Rue, Ruelle, Avenue, Boulevard, Quai, Chaussée, Route, Cour, Cours, Cité,
Chemin, Place, Esplanade, Passage, Allée, Carrefour, Sentier, Square, Villa.

This list is certainly incomplete, but it should cover more than 95% of named
addresses in France. These words should be left out of the index only when
they are the first word of the name and are followed by at least one other
word.

Cheers,
Paco

On 14 Feb 2015, at 08:50, Marko Mäkelä <[email protected]> wrote:

> On Thu, Feb 12, 2015 at 01:24:29PM +0000, Steve Ratcliffe wrote:
>> So finally I will merge the mixed index branch.
>
> I believe that the database terminology for this is 'inverted index' or
> 'fulltext index'.
>
>> I think it would be best to selectively enable it per country along with
>> lists of names to avoid. This would be best done by people from or familiar
>> with the countries in question.
>
> In fulltext search, these are called 'stopwords'.
>
> It might not be necessary to do anything for countries where street names
> are commonly written as a single word. Example: "Main Street" would be
> "Hauptstrasse" in German, "Huvudgatan" in Swedish and "Päätie" in Finnish.
> Only if the first part of the street name is a proper name such as a person's
> name, the second part could be written as a separate word, separated by a
> space or dash.
>
> That said, I guess it would still make sense to introduce some stopwords.
> Words that I can think of:
>
> Swedish: gata, gatan, gränd, gränden, stig, stigen, (stråk, stråket)
> Finnish: tie, katu, polku, kuja, (raitti, taival)
> German: Straße, Strasse, Weg, Allee, Chaussee
> Estonian: mnt, maantee, tn, tänav, pst, puiestee
>
> In Estonia, it seems to be common to write the tn, mnt or pst as a separate
> word.
>
> I could be missing some stopwords in Estonian and for German-speaking
> countries. Also, it could be that the French loan words Allee and Chaussee
> are sometimes accented.
>
> The Finnish and Swedish words that I have put in parenthesis should be very
> rare, typically used for ways for non-motorized traffic. I don't think that
> including them would pollute the index much. You might in fact want to search
> for such a name when you are looking for a nice walking or cycling route
> (i.e., you expect there to exist some random-famous-person-name-stråket, but
> you do not know the random name).
>
> Marko
> _______________________________________________
> mkgmap-dev mailing list
> [email protected]
> http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
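
To make the French rule concrete (drop the street-type word only when it comes first and something follows it), here is a minimal Python sketch. It is only an illustration and not mkgmap code: the names FRENCH_STREET_TYPES and index_words are made up for this example, and it assumes names are split on whitespace and compared case-insensitively, with accents encoded the same way in the list and in the data.

```python
# Hypothetical helper, not part of mkgmap: shows the "leading word only" rule.
# The word list is the French one given in this thread (known to be incomplete).
FRENCH_STREET_TYPES = {
    "rue", "ruelle", "avenue", "boulevard", "quai", "chaussée", "route",
    "cour", "cours", "cité", "chemin", "place", "esplanade", "passage",
    "allée", "carrefour", "sentier", "square", "villa",
}


def index_words(name, stopwords=FRENCH_STREET_TYPES):
    """Return the words of a street name that would go into the index."""
    words = name.split()
    # Skip the street type only if it is the first word AND something follows.
    # Assumption: a case-insensitive match is wanted, hence casefold().
    if len(words) > 1 and words[0].casefold() in stopwords:
        return words[1:]
    return words


if __name__ == "__main__":
    for street in ("Rue de la Paix", "Grande Rue", "Place"):
        print(street, "->", index_words(street))
```

With this rule "Rue de la Paix" is indexed as "de", "la", "Paix", while "Grande Rue" and a bare "Place" keep all their words, because the street type is either not in first position or not followed by anything.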
