Hello Scott,
thanks for the detailed analysis.
(I started looking at this because I "played" with the program on my netbook
and wasn't able to split even small files like the one for Niedersachsen,
which is why I thought the program was in error. In my job I have coded many
programs that handle mass data (on IBM mainframes), but I am a complete
beginner in Java.)
I think I now understand the idea behind your implementation, and I agree that
it is probably the best solution for a whole planet, where the ratio
(highest node id / number of ids) is rather small.
On the other hand, this ratio is going to get worse in the future for
"normal" splits (e.g. Germany), esp. when such a high node id is also saved
in the 1st overflow map.
Interesting for me:
I tried to split europe.osm.pbf with default parameters and -Xmx2000m: r181
crashed with a GC message, while my version finished.
Regarding a parameter:
I assume the program can decide which algorithm is better after the 1st
pass, but a parameter could also be used.
Regarding my Storer class:
I think it reduces space. The normal approach would be to save each id in
an Int2Short HashMap, but I liked your trick with the chunks, as they save
space and reduce both the problem of hash collisions and the future problem
that node ids will exceed 2^31. My first change was to store each (chunkmask
and chunk) in their own HashMaps, and that caused a lot of overhead compared
to the Storer.
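
To make the chunk idea concrete, here is a minimal sketch of how I understand
it. This is not the splitter code and not my Storer class: the class name
ChunkedIdMap, the chunk size of 64 and the sentinel value are all assumptions
made for illustration, and the sketch leaves out the chunk mask compression
entirely. It only shows that one map entry per chunk (instead of one per id)
keeps the hash table small, and that long keys are not limited to 2^31.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: maps node ids to short values, one map entry per chunk. */
final class ChunkedIdMap {
    static final int CHUNK_BITS = 6;                  // assumed: 64 ids per chunk
    static final int CHUNK_SIZE = 1 << CHUNK_BITS;
    static final long OFFSET_MASK = CHUNK_SIZE - 1;   // low bits select the slot
    static final short UNASSIGNED = Short.MIN_VALUE;  // assumed sentinel value

    // Key is (id >>> CHUNK_BITS), so neighbouring ids share one entry.
    private final Map<Long, short[]> chunks = new HashMap<>();

    void put(long id, short value) {
        short[] chunk = chunks.computeIfAbsent(id >>> CHUNK_BITS, k -> newChunk());
        chunk[(int) (id & OFFSET_MASK)] = value;
    }

    short get(long id) {
        short[] chunk = chunks.get(id >>> CHUNK_BITS);
        return chunk == null ? UNASSIGNED : chunk[(int) (id & OFFSET_MASK)];
    }

    int chunkCount() {
        return chunks.size();
    }

    // Every slot of a fresh chunk starts out as "no value stored".
    private static short[] newChunk() {
        short[] out = new short[CHUNK_SIZE];
        Arrays.fill(out, UNASSIGNED);
        return out;
    }
}
```

With this layout, 64 consecutive ids cost one HashMap entry plus one short[64],
while widely scattered ids cost a full chunk each, which is where the ratio
mentioned above starts to matter.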
Besides that:
Why is a new chunk initialized with 4 times 4 in chunkMake:
Arrays.fill(out,(short)4);
I used this:
Arrays.fill(out,(short)unassigned);
and I think it works fine. In the original code, the first 4 shorts in a
chunk are never used.
Is that correct?
Ciao,
Gerd