Re: [mkgmap-dev] Fun with splitter tile descriptions

Johann Gail Sat, 12 Sep 2009 05:37:45 -0700

Thanks Johann, this sounds like quite a reasonable approach too. It shouldn't be too hard to add in to the splitter (eg by tacking the info on to the density map, or by holding on to the is_in data until the areas are known then calculating the names as a separate step). Looking at a few osm files the is_in tags and their values seem inconsistent at best though so I don't know how easy it would be to get sensible/consistent data from them. Another possible issue is that osm files without the tags wouldn't work at all.

Yes, that's true. As far as I can remember, the is_in tags were very inconsistent. But I had hoped my algorithm would be flexible enough to tolerate this inconsistency. The idea was to extract each name at each level of the is_in tag. So if I take, for example, is_in=Germany,Bavaria,Munich,suburb,street name,.... then I will count the frequency of all five words. Statistically, Germany will be the most used word in this tile.

Afterwards I try to find unique names for the tiles. The name Germany will occur in nearly all tiles, so it is not unique and will not be used. The region Bavaria will also be in more than one tile and will not be used. If the city Munich is contained fully in one tile, that name is taken; otherwise I go down to the next level. So I get the most used name that is unique to this tile.
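The counting and uniqueness steps described above can be sketched in isolation as follows. This is only an illustration under stated assumptions, not the attached patch: the class TileNamer, its method names, and the maxNames/maxOtherTiles parameters are hypothetical, and it uses Java 8 features that the splitter of that time did not. It sorts the map entries directly instead of keying a TreeMap by count, so names that happen to share the same frequency are all kept.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of the splitter.
class TileNamer {

    // Count how often each component of the is_in values occurs in one tile.
    static Map<String, Integer> countNames(List<String> isInValues) {
        Map<String, Integer> counts = new HashMap<>();
        for (String value : isInValues) {
            // is_in values are split on ',' or ';', as in the patch.
            for (String part : value.split("[,;]")) {
                String name = part.trim();
                if (!name.isEmpty()) {
                    counts.merge(name, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    // Return up to maxNames names for the tile at 'index', most frequent
    // first, skipping names found in more than maxOtherTiles other tiles.
    static String nameFor(int index, List<Map<String, Integer>> tiles,
                          int maxNames, int maxOtherTiles) {
        List<Map.Entry<String, Integer>> sorted =
                new ArrayList<>(tiles.get(index).entrySet());
        // Descending by count; ties are broken by name so the result is stable.
        sorted.sort((a, b) -> {
            int byCount = b.getValue().compareTo(a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });

        List<String> picked = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : sorted) {
            int others = 0;
            for (int i = 0; i < tiles.size(); i++) {
                if (i != index && tiles.get(i).containsKey(entry.getKey())) {
                    others++;
                }
            }
            if (others <= maxOtherTiles) {
                picked.add(entry.getKey());
                if (picked.size() == maxNames) {
                    break;
                }
            }
        }
        return String.join(",", picked);
    }
}
```

With maxOtherTiles set to 0 a name must occur in exactly one tile to be used; a larger value tolerates names shared by a few neighbouring tiles, which is the direction the patch takes.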

If you do have some code that deals with filtering/sanitising the is_in data I'd be interested to see it however as it sounds like it would be worth investigating further.

Find attached a patch, which works against the relatively outdated R37. I've tried to update it to the recent splitter, but it won't work. There were some structural changes from SubArea to Area.


Regards,
Johann

Index: src/uk/me/parabola/splitter/SubArea.java
===================================================================
--- src/uk/me/parabola/splitter/SubArea.java	(Revision 37)
+++ src/uk/me/parabola/splitter/SubArea.java	(working copy)
@@ -22,7 +22,10 @@
 import java.io.OutputStream;
 import java.io.OutputStreamWriter;
 import java.io.Writer;
+import java.util.Comparator;
 import java.util.Formatter;
+import java.util.HashMap;
+import java.util.TreeMap;
 import java.util.Iterator;
 import java.util.List;
 import java.util.Locale;
@@ -186,6 +189,9 @@
 		while (it.hasNext()) {
 			Map.Entry<String,String> entry = it.next();
 			writer.append("<tag k='");
+//Test:
+if (entry.getKey().equals("is_in"))
+	handleRegionName(entry.getValue());
 			writeAttribute(entry.getKey());
 			writer.append("' v='");
 			writeAttribute(entry.getValue());
@@ -209,4 +215,69 @@
 	void setMapid(int mapid) {
 		this.mapid = mapid;
 	}
+
+//Test:
+	private HashMap<String, Integer> nameMap = new HashMap<String, Integer>();
+	
+	// Remember all region, city, country names in a sorted list and count frequency.
+	private void handleRegionName(String name) {
+		String[] names = name.split("[,;]");
+		for (String n : names) {
+			n = n.trim();
+			if (nameMap.containsKey(n)) {
+				Integer count = nameMap.get(n);
+				nameMap.put(n,count+1);
+			}
+			else
+				nameMap.put(n,1);
+		}
+	}
+
+	public boolean containsName(String name) {
+		return nameMap.containsKey(name);
+	}
+
+
+	public String getRegionName(AreaList areas) {
+		// First sort the name list by frequency.
+		TreeMap<Integer, String> sortedMap = new TreeMap<Integer,String>(new ReverseComparator());		
+		for (Map.Entry<String,Integer> entry : nameMap.entrySet()) 
+			sortedMap.put(entry.getValue(), entry.getKey());
+
+		// Find the mostly used unique names and return them.
+		// This should scale in a good manner over all tile sizes.
+		StringBuilder region = new StringBuilder();
+		int nameCount = 3;
+		for (Map.Entry<Integer,String> entry : sortedMap.entrySet()) {
+			String name = entry.getValue();
+			// An 'unique' element can appear in max 2 subareas.
+			int isUnique = 2;
+			for (SubArea a : areas) {
+				if (a!=this && a.containsName(name)) {
+					if (isUnique-- < 0)
+						break;
+				}					
+			}
+			if (isUnique >= 0) {
+				//region.append(entry.getKey());
+				//region.append("=");
+				region.append(name);
+				if (--nameCount <= 0)
+					break;
+				region.append(",");
+			}
+		}
+		return region.toString();
+	}
+
+	// Sorts the Integers big numbers first.
+	private class ReverseComparator implements Comparator<Integer> {
+		public int compare (Integer a, Integer b) {
+			return -a.compareTo(b);
+		}
+
+		public boolean equals(Integer a, Integer b) {
+			return a.equals(b);
+		}
+	}
 }
Index: src/uk/me/parabola/splitter/Main.java
===================================================================
--- src/uk/me/parabola/splitter/Main.java	(Revision 37)
+++ src/uk/me/parabola/splitter/Main.java	(working copy)
@@ -18,6 +18,7 @@
 
 import org.apache.tools.bzip2.CBZip2InputStream;
 import org.xml.sax.SAXException;
+import org.apache.tools.bzip2.CBZip2InputStream;
 
 import javax.xml.parsers.ParserConfigurationException;
 import javax.xml.parsers.SAXParser;
@@ -244,6 +245,7 @@
 		for (SubArea a : areaList) {
 			w.println();
 			w.format("mapname: %d\n", a.getMapid());
+			w.format("area-name: %s\n", a.getRegionName(areaList));
 			w.println("description: OSM Map");
 			w.format("input-file: %d.osm.gz\n", a.getMapid());
 		}

Property changes on: src/org/apache/tools/bzip2/BZip2Constants.java
___________________________________________________________________
Added: svn:executable
   + *


Property changes on: src/org/apache/tools/bzip2/CBZip2InputStream.java
___________________________________________________________________
Added: svn:executable
   + *

_______________________________________________
mkgmap-dev mailing list
[email protected]
http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
