Re: [OSM-dev] OSM StreetDensityMap
> Cool. I've been wanting to give hadoop / map/reduce a try with OSM data but the wiki does not offer much. It would be nice if someone with some experience would create a wiki page. I'm sure it would be interesting for the community as well as GIScience folks to have a place to start.

If I have some time I'll create a wiki page for OSM on hadoop (and post it here).

>> It gives also a sort-of osm activity map.
>
> Well, it does and it doesn't. You'd have to compare it to a reference road network density map to appreciate the activity of the OSM community in representing reality in OSM.

That's right.

> I see a lot of potential for this beyond 'simple' visualisation. Systems like TagInfo and OWL could benefit, maybe? Does your framework lend itself to (near) real time processing of OSM data, or does it only work with snapshot data?

MapReduce itself is a programming model. It allows you to process data by defining map and reduce functions (and is thus quite easy to learn). Hadoop implements it as a distributed batch processing framework that lets you process TBs of data on a cluster of up to hundreds of nodes. The real benefit of using such a system is that it scales roughly linearly (well, you could say between O(n) and O(n log n)), and single systems (like relational DBs) can't scale that high.

Our cluster was around 10 nodes, and it took us about 3-4 hours to create the map and store it in HBase (although the cluster was not busy the whole time). For reference, the uncompressed planet file is about 200GB.

That said, you can run small jobs on hadoop/mapreduce in a few minutes (= near realtime), so it would be possible to

* process the planet file once and store the results in a DB
* process the planet file diffs (e.g. hourly) and update the DB

TagInfo-like systems (aggregating big data and creating statistics) could definitely be built using hadoop/mapreduce.

- npl

___
dev mailing list
dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/dev
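As a rough illustration of the map/reduce model and the "process once, then apply diffs" idea described in the message above, here is a minimal single-machine sketch in plain Python. It is not the code from the dda project and does not use Hadoop. The grid size `CELL_DEG`, the dictionary layout of a way, and the function names are all hypothetical, and counting street nodes per grid cell is only an assumed stand-in for however the real project measures density. Diff handling is simplified to additions only; real OSM diffs also contain modifications and deletions.

```python
import math
from collections import Counter

CELL_DEG = 0.5  # hypothetical grid resolution in degrees


def map_way(way):
    """Map step: emit a (grid cell, 1) pair for every node of a street."""
    if "highway" not in way["tags"]:
        return []  # not a street, emit nothing
    return [
        ((math.floor(lat / CELL_DEG), math.floor(lon / CELL_DEG)), 1)
        for lat, lon in way["nodes"]
    ]


def reduce_counts(pairs):
    """Reduce step: sum all values that share the same key (grid cell)."""
    totals = Counter()
    for cell, count in pairs:
        totals[cell] += count
    return totals


def run_job(ways):
    """Run map over every way, then reduce the emitted pairs."""
    pairs = [pair for way in ways for pair in map_way(way)]
    return reduce_counts(pairs)


def apply_diff(stored, diff_ways):
    """Merge the counts of a diff batch into previously stored totals.

    Simplification: treats every way in the diff as newly added.
    """
    updated = Counter(stored)
    updated.update(run_job(diff_ways))  # Counter.update adds counts
    return updated


if __name__ == "__main__":
    planet = [
        {"tags": {"highway": "residential"},
         "nodes": [(48.10, 11.50), (48.20, 11.60)]},
        {"tags": {"building": "yes"}, "nodes": [(48.10, 11.50)]},
    ]
    density = run_job(planet)  # full run, done once
    hourly_diff = [
        {"tags": {"highway": "service"}, "nodes": [(48.70, 11.90)]},
    ]
    density = apply_diff(density, hourly_diff)  # cheap incremental update
    print(dict(density))
```

Because the reduce step is a plain sum, partial results from separate batches can be merged in any order, which is what makes both the distribution across nodes and the incremental diff update straightforward in this sketch.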
Re: [OSM-dev] OSM StreetDensityMap
On Tue, Mar 06, 2012 at 01:31:31PM +0100, npl wrote:
> TagInfo-like systems (aggregating big data and creating statistics) could definitely be built using hadoop/mapreduce.

Or you can do what Taginfo does and just write it cleverly, so it just uses one host for an hour instead of 10 hosts for several hours. :-)

I do agree that there are many use-cases for Hadoop & Co. But they also create a lot of overhead...

Reminds me a bit of the Osmarender/Tiles@Home story: First we write a renderer that's horribly slow and inefficient. So we have to distribute the work load, which makes it even more inefficient. Then to keep it going we invent more and more technology around it.

Oh well, I liked Osmarender, spent quite a lot of time improving it and rendering maps with it. Sometimes it's not about being efficient. :-)

Jochen
--
Jochen Topf joc...@remote.org http://www.remote.org/jochen/ +49-721-388298
Re: [OSM-dev] OSM StreetDensityMap
Hi,

On Tue, Mar 6, 2012 at 7:42 AM, Jochen Topf joc...@remote.org wrote:
> On Tue, Mar 06, 2012 at 01:31:31PM +0100, npl wrote:
>> TagInfo-like systems (aggregating big data and creating statistics) could definitely be built using hadoop/mapreduce.
>
> Or you can do what Taginfo does and just write it cleverly, so it just uses one host for an hour instead of 10 hosts for several hours. :-)
>
> I do agree that there are many use-cases for Hadoop & Co. But they also create a lot of overhead...

Just one host for an hour a day? Wow. That's not very much processing time at all for what it provides. Awesome.

--
martijn van exel
geospatial omnivore
1109 1st ave #2
salt lake city, ut 84103
801-550-5815
http://oegeo.wordpress.com
Re: [OSM-dev] OSM StreetDensityMap
On Tue, Mar 06, 2012 at 10:44:45AM -0600, Martijn van Exel wrote:
> On Tue, Mar 6, 2012 at 7:42 AM, Jochen Topf joc...@remote.org wrote:
>> On Tue, Mar 06, 2012 at 01:31:31PM +0100, npl wrote:
>>> TagInfo-like systems (aggregating big data and creating statistics) could definitely be built using hadoop/mapreduce.
>>
>> Or you can do what Taginfo does and just write it cleverly, so it just uses one host for an hour instead of 10 hosts for several hours. :-)
>>
>> I do agree that there are many use-cases for Hadoop & Co. But they also create a lot of overhead...
>
> Just one host for an hour a day? Wow. That's not very much processing time at all for what it provides. Awesome.

I just had a look and currently it's at about 2h for the main statistics generation. So that's gone up from the 1h it used to take. That's because people keep asking for more features. :-)

It takes about another hour for crawling the wiki etc. Most days it takes about another 1.5h for updating the planet files, but on some days that's considerably slower. I should probably try to figure out why that's the case. Maybe something else runs in parallel on the machine.

All of that on an 800MHz machine using about 6GB RAM. The OSM processing is mostly CPU bound, so on a modern machine it would be faster.

One relatively easy optimization would be to run the planet update and statistics gathering in one step. But for now I am lazy and let Osmosis do the planet update first.

Jochen
--
Jochen Topf joc...@remote.org http://www.remote.org/jochen/ +49-721-388298
Re: [OSM-dev] OSM StreetDensityMap
Hi,

On Mon, Mar 5, 2012 at 12:14 PM, npl n...@gmx.de wrote:
> Hi
>
> Last semester, a few other guys and I were involved in a project where we wanted to get familiar with hadoop. Since OSM has big data, we decided to do some hadoop processing on the OSM planet file. We ended up creating a StreetDensityMap of the world, and extended JMapViewer for the graphical output.
> (screenshot of europe: https://raw.github.com/npl/dda/master/screenshots/osm_density_europe.jpg )
>
> The project is hosted at github: https://github.com/npl/dda

Cool. I've been wanting to give hadoop / map/reduce a try with OSM data but the wiki does not offer much. It would be nice if someone with some experience would create a wiki page. I'm sure it would be interesting for the community as well as GIScience folks to have a place to start.

> It gives also a sort-of osm activity map.

Well, it does and it doesn't. You'd have to compare it to a reference road network density map to appreciate the activity of the OSM community in representing reality in OSM.

I see a lot of potential for this beyond 'simple' visualisation. Systems like TagInfo and OWL could benefit, maybe? Does your framework lend itself to (near) real time processing of OSM data, or does it only work with snapshot data?

> If you want to try it out, you will need your own hadoop cluster (well, a few nodes for a few hours is enough) -- there is no public server available.
>
> If you've any questions, don't hesitate to ask me!
>
> - npl

--
martijn van exel
geospatial omnivore
1109 1st ave #2
salt lake city, ut 84103
801-550-5815
http://oegeo.wordpress.com