I noticed that mkgmap does not intern any strings. As an example, the
tile below, generated by the splitter, fails to build with -Xmx3000m on
a 64-bit JDK under Linux. With my patch, mkgmap generates the tile with
-Xmx1000m.
<bounds minlat='55.1953125' minlon='9.4921875' maxlat='56.6015625'
maxlon='11.513671875'/>
This tile has 1 million nodes. Among the nodes and ways on this tile,
there are 12 million tags, yet only 100k distinct tag key/value pairs;
on average each pair occurs 120 times.
I explicitly do not use normal string interning because
String.intern() strings are kept forever, and I want these strings to
be GC'able after the tile is done. I trade GCability for having the
occasional string duplicated in memory by flushing the interning table
every 10k unique strings.
This code is not presently thread-safe; ideally there should be
one string interning table for each parser/thread.
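
For readers without the patch at hand, the idea above can be sketched
roughly as follows. This is not the code from the patch: the class name
LossyInternSketch, the maxEntries parameter and the forCurrentThread()
helper are all made up for illustration. The table hands back one
canonical instance per distinct string and is thrown away once it holds
the configured number of unique strings, so old strings can be collected
at the cost of an occasional duplicate. A ThreadLocal is one possible way
to get a separate table per parser thread.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a lossy interning table; not the patch's code.
final class LossyInternSketch {
    private final int maxEntries;
    private Map<String, String> table = new HashMap<String, String>();

    LossyInternSketch(int maxEntries) {
        this.maxEntries = maxEntries;
    }

    // Returns a canonical instance equal to s, or null if s is null.
    String intern(String s) {
        if (s == null) {
            return null;
        }
        String canonical = table.get(s);
        if (canonical != null) {
            return canonical;
        }
        if (table.size() >= maxEntries) {
            // Flush: drop the whole table so strings that are no longer
            // referenced elsewhere can be garbage collected. Strings seen
            // again after a flush get a second copy in memory.
            table = new HashMap<String, String>();
        }
        table.put(s, s);
        return s;
    }

    int size() {
        return table.size();
    }

    // Assumed approach for thread safety: one table per thread, flushed
    // every 10k unique strings as described in the mail above.
    private static final ThreadLocal<LossyInternSketch> PER_THREAD =
            ThreadLocal.withInitial(() -> new LossyInternSketch(10000));

    static LossyInternSketch forCurrentThread() {
        return PER_THREAD.get();
    }
}
```

A parser would then call something like
LossyInternSketch.forCurrentThread().intern(value) before storing a tag
key or value.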
Scott
Hi Scott!
I think interning the strings is a good idea.
As far as I know the LossyIntern class is not needed. The .intern()
function of a string does exactly the same.
Some time ago I sent a very similar patch to the mailing list, which has
not yet been committed. Could you please test with your use case whether
it achieves a similar memory reduction?
The patch is thread-safe and does not intern all strings. In my opinion
the value of a name tag should not be interned, because there is a high
probability that such a value is used only once.
WanMil
Index: src/uk/me/parabola/mkgmap/reader/osm/Tags.java
===================================================================
--- src/uk/me/parabola/mkgmap/reader/osm/Tags.java (revision 1566)
+++ src/uk/me/parabola/mkgmap/reader/osm/Tags.java (working copy)
@@ -19,6 +19,7 @@
import java.util.AbstractMap;
import java.util.Arrays;
import java.util.HashMap;
+import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
@@ -45,6 +46,18 @@
private String[] keys;
private String[] values;
+
+ /**
+ * Stores all tags which values should be stored as String intern. The values of
+ * these tags should have a limited number of different values to get a
+ * reasonable memory footprint effect.
+ */
+ private final static HashSet<String> interableValueTags = new HashSet<String>(
+ Arrays.asList("highway", "building", "addr:housenumber", "access",
+ "natural", "waterway", "amenity", "oneway", "surface",
+ "landuse", "lanes", "place", "layer", "tracktype", "maxspeed",
+ "foot", "bridge", "height", "area", "railway", "admin_level",
+ "power", "type", "leisure", "barrier"));
public Tags() {
keys = new String[INIT_SIZE];
@@ -65,11 +78,19 @@
Integer ind = keyPos(key);
if (ind == null)
assert false : "keyPos(" + key + ") returns null - size = " + size + ", capacity = " + capacity;
- keys[ind] = key;
+ // use .intern() to reduce memory footprint
+ keys[ind] = key.intern();
String old = values[ind];
if (old == null)
size++;
+
+ if (interableValueTags.contains(key)) {
+ // use .intern() to reduce memory footprint for the most
+ // common tags with a limited range of values
+ value = value.intern();
+ }
+
values[ind] = value;
return old;
_______________________________________________
mkgmap-dev mailing list
http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev