Re: [mkgmap-dev] New splitter version, big memory savings
Hi Chris,

The new splitter works great. Attached is a small patch to make the kml tiles transparent in Google Earth and add a simple balloon with the tile name when you click it. GE behaves differently from GMaps, where this is not needed, but the patch will work in GMaps too. If this is useful for others, please apply it. I am not a kml specialist and used the simplest example available.

I have one usability request. If the specified directory for the cache doesn't exist, the function is disabled. It should be safe to create the directory in splitter.

thanks
apo

kml.diff
Description: Binary data

On 3 Sep 2009, at 3:42, Chris Miller wrote:

> I've just checked in a new version of the splitter (r84) that requires far less memory and also performs slightly better during the first stage of the split. As an example, it used to take about a 5GB heap to generate areas.list for the whole planet, but now only takes around 300MB(!). An additional advantage is that as the planet grows in size and complexity going forwards, the memory required during the first stage will not increase.
>
> This change should finally mean that anyone is able to split the planet even on a relatively low end machine (though be prepared for a long wait!). If you try but run out of memory during the second stage of the split (i.e. after areas.list has been generated), reduce the value of --max-areas. This will reduce the memory required during that second stage, at the cost of additional passes over the data (if you do require multiple passes then I highly recommend you also use the --cache option, it can make a huge difference to performance).
>
> One possible downside to this new version is that the algorithm that decides how to split up the map has been changed somewhat. This results in slightly different tile layouts compared to the old algorithm. This shouldn't be too noticeable unless perhaps it puts a tile boundary right through somewhere you care about when the old algorithm didn't. Of course, it may also work in your favour for the same reason! My experiments have shown the number of tiles generated increases by about 2% with the new approach, which I think is a small price to pay for the huge memory saving.
>
> I've put some example kml files online here that show the before and after effects of the change:
>
> UK --max-nodes=160
> old splitter: http://maps.google.co.uk/maps?q=http:%2F%2Fredyeti.net%2Fosm%2Fuk-original.kmlz=3
> new splitter: http://maps.google.co.uk/maps?q=http:%2F%2Fredyeti.net%2Fosm%2Fuk-density.kmlz=3
>
> Europe --max-nodes=160
> old: http://maps.google.co.uk/maps?q=http:%2F%2Fredyeti.net%2Fosm%2Feurope-original.kmlz=3
> new: http://maps.google.co.uk/maps?q=http:%2F%2Fredyeti.net%2Fosm%2Feurope-density.kmlz=3
>
> Europe --max-nodes=30
> old: http://maps.google.co.uk/maps?q=http:%2F%2Fredyeti.net%2Fosm%2Feurope-300k-original.kmlz=3
> new: http://maps.google.co.uk/maps?q=http:%2F%2Fredyeti.net%2Fosm%2Feurope-300k-density.kmlz=3
>
> Planet --max-nodes=160
> old: http://maps.google.co.uk/maps?q=http:%2F%2Fredyeti.net%2Fosm%2Fplanet-original.kmlz=3
> new: http://maps.google.co.uk/maps?q=http:%2F%2Fredyeti.net%2Fosm%2Fplanet-density.kmlz=3
>
> If you aren't happy with the new tiles, or simply want to compare the new with the old, there's a parameter --legacy-mode=true that will generate areas.list using the old approach. Assuming no serious problems come to light, I'll be removing that parameter in a future build.
>
> Where to from here?
>
> As some of you may have guessed, this new splitter is based on a 'density map' as discussed in earlier mails. Currently it maps the density of nodes only, and the generated map is only held in memory long enough to calculate the tile boundaries. I intend to write this map out to disk so it can be used by external tools, or reused on successive runs of the splitter to allow extremely quick generation of areas.list with different --max-nodes settings.
>
> After that I hope to tackle the quite difficult problem (in terms of performance and memory overhead) of generating density maps for ways and relations too. Once we have density maps for all three element types it will hopefully be possible to generate tiles that are as big as possible but still avoid giving 'Map to big' messages.
>
> Another related area I'm starting to look at is new/improved algorithms for arranging the tiles so they e.g. avoid putting boundaries through the middle of cities, or reduce the overall number of tiles. Any ideas here would be appreciated.
>
> Enjoy!
> Chris

___ mkgmap-dev mailing list mkgmap-dev@lists.mkgmap.org.uk http://www.mkgmap.org.uk/mailman/listinfo/mkgmap-dev
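
The 'density map' idea in the announcement above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the splitter's actual code: the one-degree grid, the function names, and the rule of halving a region along its longer axis are all made up for the example. What it does show is why memory stays flat as the planet grows: only per-cell counts are kept, never the nodes themselves.

```python
from collections import Counter

def build_density_map(nodes, cell_deg=1.0):
    """Count nodes per grid cell (hypothetical grid; cell size is an assumption)."""
    density = Counter()
    for lat, lon in nodes:
        density[(int(lat // cell_deg), int(lon // cell_deg))] += 1
    return density

def count_in(density, box):
    """Sum the node counts of all cells in the half-open cell range of box."""
    r0, c0, r1, c1 = box
    return sum(n for (r, c), n in density.items()
               if r0 <= r < r1 and c0 <= c < c1)

def split(density, box, max_nodes):
    """Recursively halve box until every tile holds at most max_nodes nodes.

    A single grid cell is never split further, even if it is over the limit.
    """
    r0, c0, r1, c1 = box
    single_cell = (r1 - r0 == 1 and c1 - c0 == 1)
    if count_in(density, box) <= max_nodes or single_cell:
        return [box]
    if (r1 - r0) >= (c1 - c0):
        mid = (r0 + r1) // 2  # split along rows (latitude)
        halves = [(r0, c0, mid, c1), (mid, c0, r1, c1)]
    else:
        mid = (c0 + c1) // 2  # split along columns (longitude)
        halves = [(r0, c0, r1, mid), (r0, mid, r1, c1)]
    return [tile for half in halves for tile in split(density, half, max_nodes)]
```

Calling `split(build_density_map(nodes), (0, 0, 2, 2), 3)` returns a list of cell-range boxes, each of which could then be written out as one line of an areas list.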
Re: [mkgmap-dev] New splitter version, big memory savings
Hi Apo,

AS new splitter works great. attached a small patch to make the kml
AS tiles transparent in Google Earth and add a simple balloon with the tile
AS name when you click it. GE behaves different than GMaps where this

Thanks, this is something I'd noticed and was meaning to take a look at, so thanks for saving me the trouble. I've just committed this change.

AS Have one usability request. If the specified directory for the cache
AS doesn't exist the function is disabled. It should be safe to create
AS the directory in splitter.

Makes sense, I'll add this as soon as I get a chance (likely tomorrow).

Cheers,
Chris
Re: [mkgmap-dev] New splitter version, big memory savings
Chris,

thanks for all the good work. Just a small and unrelated remark. The script that builds my map first unpacks the osm.bz2 file, then runs splitter. Still, splitter complains about no --cache being used, while as far as I understand there's no real advantage to using --cache if you're using uncompressed files. Or is there?

Best regards,
Valentijn

Chris Miller wrote:
> I've just checked in a new version of the splitter (r84) that requires far less memory and also performs slightly better during the first stage of the split.
> [...]
Re: [mkgmap-dev] New splitter version, big memory savings
Oh, this is so awesome Chris! Just awesome!!

Chris Miller wrote:
> [...]
Re: [mkgmap-dev] New splitter version, big memory savings
2009/9/3 Chris Miller chris.mil...@kbcfp.com:
> I've just checked in a new version of the splitter (r84) that requires far less memory and also performs slightly better during the first stage of the split. As an example, it used to take about a 5GB heap to generate areas.list for the whole planet, but now only takes around 300MB(!). An additional advantage is that as the planet grows in size and complexity going forwards, the memory required during the first stage will not increase.

Impressive! Many, many thanks for the effort. I'll test it at the weekend.

Paul

--
Don't take life too seriously; you will never get out of it alive.
-- Elbert Hubbard
Re: [mkgmap-dev] New splitter version, big memory savings
Really great. What we need now is a possibility to split by country, e.g. taking europe.osm.bz2 and splitting it into all the major states. This would avoid having to use the tiles from geofabrik, which cannot be merged without breaking routing at the frontiers. Does anyone have an idea how we could do that?

This is important not only for osm data but also for srtm (.hgt) files, or .hgt converted to osm with srtm2osm. Osmosis is not really the right tool for this (it breaks routing at the tile border), but it can use bounding polygons for cutting out pieces.

Chris Miller wrote:
> [...]
Re: [mkgmap-dev] New splitter version, big memory savings
Hi Valentijn,

It depends... how many areas does your osm file get split into, and what value (if any) are you using for --max-areas? Are you providing your own areas.list file via the --split-file parameter?

It's still noticeably slower to parse an osm file (even an uncompressed one) than it is to read the data from the cache. Offsetting this is the overhead required to create the cache in the first place. So it all comes down to the number of passes over the data that are required to perform the split:

* If it's only a single pass, the cache is a hindrance (even with a compressed osm file).
* If two passes are required, the cache is a definite win if your osm file is compressed, and about break-even if it is uncompressed.
* If three or more passes are required, the cache is a definite win regardless of whether your osm files are compressed or not.

The first stage of the split always requires one (and only one) pass. The only time the first stage is skipped is when you specify the --split-file parameter. The second stage of the split always requires at least one additional pass, possibly more. The number of passes required in stage two equals:

  T / M (rounded up to the nearest whole number)

where:
  T = the total number of areas generated by the split
  M = the --max-areas value (default = 255)

Note also that if you intend to run the splitter more than once against the same osm file (e.g. because you are experimenting with different values for --max-nodes), then using the cache is also going to be a big win, since it can be reused across runs and the osm file never needs reparsing.

Hope that helps clarify things for you.

Chris

VS Chris, thanks for all the good work. Just a small and unrelated
VS remark. The script that builds my map first unpacks the osm.bz2
VS file, then runs splitter. Still, Splitter complains about no --cache
VS being used, while as far as I understand, there's no real advantage
VS using --cache if you're using uncompressed files, or is there?
VS
VS Best regards,
VS
VS Valentijn
VS
VS Chris Miller wrote:
VS I've just checked in a new version of the splitter (r84) that requires far less memory and also performs slightly better during the first stage of the split.
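
The pass arithmetic and the cache rule of thumb from the reply above can be written down as a small sketch. The function names and the returned strings are invented for illustration; only the T / M formula, the default of 255, and the three cases come from the message itself.

```python
def stage_two_passes(total_areas, max_areas=255):
    """T / M rounded up, using integer arithmetic."""
    return (total_areas + max_areas - 1) // max_areas

def total_passes(total_areas, max_areas=255, have_split_file=False):
    """Stage one costs exactly one pass unless --split-file skips it."""
    stage_one = 0 if have_split_file else 1
    return stage_one + stage_two_passes(total_areas, max_areas)

def cache_advice(passes, compressed):
    """Rule of thumb from the message; the labels are hypothetical."""
    if passes <= 1:
        return "hindrance"
    if passes == 2:
        return "win" if compressed else "break-even"
    return "win"
```

For example, a split that produces 600 areas with the default --max-areas needs three stage-two passes plus the stage-one pass, so `cache_advice(total_passes(600), compressed=False)` lands in the "win" case even for an uncompressed file.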
Re: [mkgmap-dev] New splitter version, big memory savings
Hi Chris

> Where to from here? As some of you may have guessed, this new splitter is

Here are a few ideas based on things that we discovered with earlier versions of the splitter.

* Avoid pathological behaviour at the poles by limiting latitude to +- 85.

* Split tiles that are larger than a given absolute size, as even an empty file that is big enough will fail. Not sure what that size is, but 63240001 is probably over it.

* There may also be a problem at 180 degrees longitude, caused by the overlap. It might work as long as nothing in the chain normalises, for example, 181 to -179.

* Trim off areas that are completely empty, this might help with tiles that are mostly ocean with little bits of land around the edges.

..Steve
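
Two of the points above, the latitude limit and the 180-degree question, can be illustrated with a few lines. This is a sketch only: the helper names are hypothetical, coordinates are plain degrees, and nothing here is claimed to be how the splitter or mkgmap actually treats these cases.

```python
MAX_LAT = 85.0  # limit suggested in the message above

def clamp_latitude(lat, limit=MAX_LAT):
    """Keep a tile edge within +/- limit degrees."""
    return max(-limit, min(limit, lat))

def normalise_longitude(lon):
    """Wrap a longitude into [-180, 180), e.g. 181 becomes -179."""
    return (lon + 180.0) % 360.0 - 180.0

def in_tile_lon(lon, west, east):
    """Test a node longitude against tile bounds that are NOT normalised.

    The bounds may run past 180 (e.g. 170..182), so a node stored as -179
    is shifted by 360 before the comparison.
    """
    if lon < west:
        lon += 360.0
    return west <= lon <= east
```

A tile spanning 170 to 182 matches a node at -179 only while the bounds stay un-normalised; once any step rewrites 182 as -178, the simple `west <= lon <= east` test no longer describes the tile, which is the failure mode the third bullet warns about.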
Re: [mkgmap-dev] New splitter version, big memory savings
Hi Steve,

SR Here are a few ideas based on things that we discovered with earlier
SR versions of the splitter.

Thanks for this, these points are all new to me.

SR * Avoid pathological behaviour at the poles by limiting latitude to
SR +- 85.

What should I do with nodes/ways/rels outside this limit? I assume just discard them and don't let any tile extend beyond +/-85 (but still include the discarded nodes in the overlap)?

SR * Split tiles that are larger than a given absolute size, as even
SR an empty file that is big enough will fail. Not sure what that
SR size is, but 63240001 is probably over it.

Larger than a specified width or height, or larger than a certain area? What exactly do you mean by "will fail" (so I know how to test my changes)?

SR * There may also be a problem at 180 degrees longitude, caused by the
SR overlap. It might work as long as nothing in the chain
SR normalises, for example, 181 to -179.

Yes, I was wondering about this case earlier. Can you explain a bit more what you mean by "It might work as long as nothing in the chain normalises" though? I'm not sure I follow. Suppose I had a tile that ranges from 170 to 180. If I extend the overlap so that it picks up nodes from, say, -180 to -178, how does mkgmap deal with that? Or are you suggesting I include those nodes but adjust their longitudes to a range of 180 to 182 instead, and mkgmap will then do the right thing?

SR * Trim off areas that are completely empty, this might help with
SR tiles that are mostly ocean with little bits of land around
SR the edges.

In the above sentence I take it that by "areas" you mean portions of a single tile? That shouldn't be too hard to do now the density map's in place. By the same logic I assume I can throw away completely any tile that contains no nodes at all? Are there likely to be any problems as a result of the gaps that are introduced between tiles?

Chris
Re: [mkgmap-dev] New splitter version, big memory savings
L And are (large) polygons that span many tiles fully supported?

The splitter has been able to handle relations that span an unlimited number of tiles since r49, and ways that span many tiles since r60 or so. As far as I'm aware the only remaining problem is that a node is still limited to 4 tiles max. This limit can be exceeded with very small --max-nodes values, since the tiles can end up so small that their overlaps start to interfere with each other. A warning is displayed in this situation. I know how to fix this if need be (basically the same way I fixed it for ways and relations), though it introduces a minor performance and memory cost and I didn't think this was a problem in the wild.

Chris
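
To see how small tiles can push a node past the 4-tile limit mentioned above, here is a rough sketch. It assumes rectangular tiles given as (south, west, north, east) in degrees and an overlap in the same units; the splitter's real data structures and overlap units may well differ, and the names are invented for the example.

```python
MAX_TILES_PER_NODE = 4  # limit described in the message above

def tiles_containing(lat, lon, tiles, overlap):
    """Return the indexes of all tiles whose overlap-expanded bounds hold the node."""
    hits = []
    for index, (south, west, north, east) in enumerate(tiles):
        if (south - overlap <= lat <= north + overlap
                and west - overlap <= lon <= east + overlap):
            hits.append(index)
    return hits

def exceeds_limit(lat, lon, tiles, overlap):
    """True when the node would need to be written to more than 4 tiles."""
    return len(tiles_containing(lat, lon, tiles, overlap)) > MAX_TILES_PER_NODE
```

With a 2x2 grid of one-degree tiles and a small overlap, a node on the shared corner falls into exactly four tiles. Shrink the tiles to 0.1 degrees while keeping a 0.2 degree overlap and a node in the middle of a 3x3 block lands in all nine, which is the situation that triggers the warning.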
Re: [mkgmap-dev] New splitter version, big memory savings
Ok, thanks for clearing this up for me. I don't actually know if there is a real world problem with this, it's just that I was still thinking that this was a limitation...

Chris Miller wrote:
> [...]
Re: [mkgmap-dev] New splitter version, big memory savings
On 03/09/09 14:49, Chris Miller wrote:
> L And are (large) polygons that span many tiles fully supported?
>
> The splitter has been able to handle relations that span an unlimited number of tiles since r49, and ways that span many tiles since r60 or so. As far

I'm not sure if this is what Lambertus was thinking of, but what about polygons that completely encircle a tile, or are so big that they cut across a tile in such a way that joining up the nodes in the overlap area cuts across the tile? I don't think those things can work.

Probably doesn't happen for real though. Certainly no one has reported a case where that appeared to happen.

..Steve
Re: [mkgmap-dev] New splitter version, big memory savings
Very good point. I guess if someone came across such a situation a nasty workaround would be to increase the --overlap value by enough to make the problem go away, though that would likely introduce other problems (memory usage, performance, nodes in 4 areas, ...). A proper fix would require some extra analysis of the 'big ways' or 'big relations' (the splitter is already vaguely aware of these) to decide how best to deal with them. But that's not something I feel like implementing if it can be helped :)

Chris

SR I'm not sure if this what Lambertus was thinking of but what about
SR polygons that completely encircle a tile, or are so big that they
SR cut across a tile in such a way that joining up the nodes in the
SR overlap area cuts across the tile. I don't think those things can
SR work.
SR
SR Probably doesn't happen for real though. Certainly no one has
SR reported a case where that appeared to happen.
SR
SR ..Steve