Hello Klaus,

here is a comparison to the trunk version r202 (I hope it is complete without
being too complex).
It would be great if someone else could put this into a more readable format
once the changes are in the trunk version.

Corrections:
- Prevent overflow in node counters, reported here:
http://gis.19327.n5.nabble.com/Bug-in-splitter-tp5610856.html
- Missing data because of rounded/trimmed bounding boxes, reported here:
http://gis.19327.n5.nabble.com/mkgmap-splitter-or-mkgmap-leave-out-information-on-luxembourg-osm-pbf-from-geofabrik-tp5731208.html

Added debugging features:
+ parameter stop-after allows you to stop execution after a given program phase
has completed. This saves time when debugging / testing the new split algorithm
(see below).
+ parameter output=simulate simulates the whole split process without writing
data to tiles. I use this to avoid writing masses of data to my SSD.
+ if no split-file is given, splitter writes a file densities-out.txt
containing the density data that was used to calculate the tile areas. When
debugging, you can rename this file to densities.txt and place it in the same
directory as splitter.jar. If splitter finds such a file, it reads the content
of that file instead of parsing the input file. This saves a lot of time (the
densities.txt for the whole planet is only ~47 MB).

Other new features (less important first):
+ in addition to the areas.list file, splitter writes an area.poly in the
osmosis polygon file format.
+ o5m format supported for reading and writing:
For input, the file name has to end with .o5m; for output you have to specify
the parameter --output=o5m.
The o5m format requires more disk space but is faster to read. This is
especially true on slower CPUs.
+ polygon file handling:
With parameter --polygon-file you can pass a bounding polygon to splitter. This
is probably only useful when you want to use an input file that contains much
more data than the map that you want to create. For example, you may create a
polygon file covering Scandinavia and use Europe or planet as input.
The polygon file is only used when splitter has to calculate the areas (no
--split-file parameter given), and only for that calculation. With a given
polygon file, a special split algorithm is used which tries to create tiles
that cover the bounding polygon completely without extending too far outside
of it. The parameter no-trim is ignored if --polygon-file is used.
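
For anyone who has not written a polygon file before, here is a minimal Python
sketch that produces one in the osmosis polygon file format (name line, then
one or more named rings of "lon lat" pairs, each ring closed by END, and a
final END). The helper name format_poly and the coordinates are made up for
illustration; the box is only a very rough stand-in for Scandinavia and this is
not code taken from splitter.

```python
def format_poly(name, rings):
    """Return text in the osmosis polygon file format.

    rings is a list of (ring_name, points) pairs; points is a list of
    (lon, lat) tuples. A ring_name starting with "!" marks a hole.
    """
    lines = [name]
    for ring_name, points in rings:
        lines.append(ring_name)
        for lon, lat in points:
            # Longitude comes first, then latitude.
            lines.append(f"   {lon:.6f}   {lat:.6f}")
        lines.append("END")  # closes this ring
    lines.append("END")      # closes the file
    return "\n".join(lines) + "\n"


# Hypothetical, very rough box (illustrative values only).
scandinavia = [
    (4.0, 54.5),
    (32.0, 54.5),
    (32.0, 71.5),
    (4.0, 71.5),
    (4.0, 54.5),  # repeat the first point to close the ring
]

poly_text = format_poly("scandinavia", [("1", scandinavia)])
print(poly_text)
```

The printed text could be saved to a file and passed with --polygon-file.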

+ a new split algorithm was implemented to address two problems:
 ++ r202 may create tiles with only a few nodes. This leads to serious problems
described here:
http://gis.19327.n5.nabble.com/Serious-Bug-Mkgmap-creating-map-that-puts-news-GPS-confirmed-etrex-30-Oregon-550-into-bootloop-tp5508055p5512646.html
 ++ r202 with no-trim=true may create huge, almost empty tiles. This leads to
problems in mkgmap. Details were described here:
http://www.mkgmap.org.uk/pipermail/mkgmap-dev/2012q1/013611.html
The new algorithm tries to optimize the created tiles so that
- the number of tiles is small
- the aspect ratio is near 1 (values between 0.25 and 4 are considered to be
nice)
- no tile contains fewer than max-nodes/3 nodes
- no tile is larger than 90° in longitude and 85° in latitude
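
To make the per-tile goals concrete, here is a small Python sketch that checks
one candidate tile against them. It is not the code used in splitter: the
function name violated_goals is invented, the aspect ratio is assumed to be
width divided by height in degrees (the real algorithm may measure it
differently), and the max-nodes value in the example is just an illustrative
number.

```python
def violated_goals(width_deg, height_deg, node_count, max_nodes):
    """Return the names of the goals that a candidate tile fails to meet.

    Assumes width_deg and height_deg are both greater than zero.
    """
    problems = []
    # Assumption: aspect ratio = extent in longitude / extent in latitude.
    aspect = width_deg / height_deg
    if not 0.25 <= aspect <= 4:
        problems.append("aspect ratio")
    if node_count < max_nodes / 3:
        problems.append("too few nodes")
    if width_deg > 90 or height_deg > 85:
        problems.append("too large")
    return problems


# Illustrative values only.
good = violated_goals(10, 8, 900_000, 1_600_000)
bad = violated_goals(100, 10, 1_000, 1_600_000)
print(good)
print(bad)
```

The goal "the number of tiles is small" applies to the split as a whole, so it
cannot be checked for a single tile in this way.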

It is not always possible to find a split that meets all these goals,
especially not if you provide a bounding polygon.
A few users reported problems in mkgmap with the results of the new algorithm
(higher memory needs, smaller max-jobs parameter needed).
It is not yet clear whether these problems are to be solved in splitter or in
mkgmap.

+ problem-list handling:
Two new approaches have been implemented to solve the frequently reported 
problem of flooding:
http://gis.19327.n5.nabble.com/Still-problems-with-lakes-tp5725668.html

These problems are caused by the split process. Splitter r202 simply divides 
multipolygon relations 
into parts that lie within one tile. Later, mkgmap has to guess how the 
original polygon was closed.
This guessing fails from time to time. The solution in r202 is to specify a 
large enough overlap value. 

Approach 1)
The new parameter --problem-file lets you specify a list of known problem
relations and ways.
A list containing many problem cases can be found here:
http://wiki.openstreetmap.org/wiki/Mkgmap/help/problematic_polygons
To use such a file you have to specify --problem-file=<path to file>
A way or relation listed in this file is treated specially by splitter:
- ways:
 ++ if the way is closed (first and last node references are equal), splitter
calculates the bounding box of the way and writes the complete way to each tile
that intersects the bounding box (complete means with all referenced nodes
that were found in the input file)
 ++ if the way is not closed, splitter determines the tiles that are crossed by
the way and writes the complete data to those tiles

- relations:
A relation is written completely to all tiles that
- contain one or more nodes listed as members of the relation
- contain one or more nodes listed as members of the ways of the relation
- are crossed by one or more ways of the relation
- are enclosed by one or more ways of a type=multipolygon relation
Note that a relation with type=multipolygon is treated similarly to a single
closed way: splitter calculates the bounding box of each area enclosed by one
or more ways that form a closed polygon.
The complete relation is written to each tile that intersects any of the
calculated bounding boxes.
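
The bounding-box rule for closed ways (and for the areas of a multipolygon) can
be sketched in Python roughly like this. This is only an illustration of the
idea, not splitter's code: the function names and tile ids are invented, tiles
are assumed to be simple (min_lon, min_lat, max_lon, max_lat) boxes, and boxes
that merely touch are counted as intersecting.

```python
def bounding_box(points):
    """Return (min_lon, min_lat, max_lon, max_lat) for (lon, lat) points."""
    lons = [lon for lon, _ in points]
    lats = [lat for _, lat in points]
    return (min(lons), min(lats), max(lons), max(lats))


def intersects(a, b):
    """True if two boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def tiles_for_closed_way(points, tiles):
    """Return the ids of all tiles that intersect the way's bounding box."""
    box = bounding_box(points)
    return [tile_id for tile_id, tile_box in tiles.items()
            if intersects(box, tile_box)]


# Hypothetical tiles and a lake that straddles the border of the first two.
tiles = {
    "63240001": (0, 0, 10, 10),
    "63240002": (10, 0, 20, 10),
    "63240003": (20, 0, 30, 10),
}
lake = [(8, 2), (12, 2), (12, 6), (8, 6), (8, 2)]

targets = tiles_for_closed_way(lake, tiles)
print(targets)
```

With these values the complete lake would go to the first two tiles, so mkgmap
does not have to guess how the polygon was closed.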

The problem file still has some disadvantages:
- it is only as current as its last maintenance
- the user has to verify the resulting map. If something is wrong, the user has
to find the id of the way or relation that causes the problem, add it to the
problem file and restart the whole process of map creation. This can be very
time consuming and it is still likely that the user will not find all broken
polygons. The solution is

Approach 2)
the parameter --keep-complete, which should be used instead of --problem-file.
With --keep-complete, splitter reads the input file multiple times to detect
those polygons that are divided during the split process. Splitter thus creates
the list of problem cases itself and handles them exactly as described above.
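
The detection idea can be illustrated with a strongly simplified Python sketch.
It keeps everything in memory and only looks at which tiles contain the nodes of
a way, so it would miss a way that crosses a tile without having a node in it.
Splitter itself reads the input file in several passes and its real logic is
more involved. All names and values here are invented for the example.

```python
def find_split_ways(ways, node_coords, tiles):
    """Return the ids of ways whose nodes fall into more than one tile.

    ways:        {way_id: [node_id, ...]}
    node_coords: {node_id: (lon, lat)}
    tiles:       {tile_id: (min_lon, min_lat, max_lon, max_lat)}
    """
    def tile_of(lon, lat):
        for tile_id, (min_lon, min_lat, max_lon, max_lat) in tiles.items():
            if min_lon <= lon < max_lon and min_lat <= lat < max_lat:
                return tile_id
        return None  # node lies outside all tiles

    problem_ids = []
    for way_id, node_ids in ways.items():
        touched = set()
        for node_id in node_ids:
            if node_id in node_coords:  # nodes may be missing from the input
                tile_id = tile_of(*node_coords[node_id])
                if tile_id is not None:
                    touched.add(tile_id)
        if len(touched) > 1:
            problem_ids.append(way_id)
    return problem_ids


# Hypothetical data: way 100 spans both tiles, way 200 stays inside tile A.
tiles = {"A": (0, 0, 10, 10), "B": (10, 0, 20, 10)}
node_coords = {1: (8, 2), 2: (12, 2), 3: (12, 6), 4: (8, 6),
               5: (1, 1), 6: (2, 2)}
ways = {100: [1, 2, 3, 4, 1], 200: [5, 6]}

split_ways = find_split_ways(ways, node_coords, tiles)
print(split_ways)
```

The resulting ids play the same role as the entries of a hand-written problem
file.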

Advantage of --keep-complete compared to --problem-file :
- no need to maintain a list of problem cases
Advantages of --keep-complete and --problem-file compared to a large --overlap
value:
- makes sure that problem polygons are complete
 (of course only if the input file is complete)
- doesn't write a lot of "noise" like houses or roads which are in the overlap
area but not at all related to the bounding box of the tile

Drawback of --keep-complete compared to --problem-file:
- Splitter is slower because it has to read the input file more often, and the
processing of all problem ways and relations requires additional heap memory.
On a 32-bit system it is not possible to split the whole planet with
--keep-complete, because you need around 4 GB of heap to process all the
problem cases.
On the other hand, on a 64-bit system with at least 8 GB you can split the
planet using e.g.
java -Xmx7000m -jar splitter.jar --max-areas=2048 --keep-complete --output=xml planet.o5m
(note that I use output=xml because both the o5m and the pbf writer require too
much heap to write ~1500 areas in one pass. Each open *.o5m or *.pbf file
requires more than 1 MB for string tables and other data, while the xml writer
needs almost no fixed storage)

- For some tiles, unneeded data is written if they lie within the bounding box
of a huge multipolygon relation, but not within any of the polygons described
by the relation.

Gerd


> Date: Sun, 9 Dec 2012 12:49:31 -0800
> From: [email protected]
> To: [email protected]
> Subject: Re: [mkgmap-dev] splitter r254
> 
> Hi Gerd,
> 
> release candidate ... sounds good after all the hard work.
> 
> Is it possible for you to write a (short) summery concerning all changes and
> enhancements ?
> (I have to admit that I got lost in "all" the splitter threads.)
> 
> Regards Klaus
> 
> 
> 
> --
> View this message in context: 
> http://gis.19327.n5.nabble.com/splitter-r254-tp5739717p5739736.html
> Sent from the Mkgmap Development mailing list archive at Nabble.com.
> _______________________________________________
> mkgmap-dev mailing list
> [email protected]
> http://lists.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
                                          