On Tue, Mar 06, 2012 at 10:44:45AM -0600, Martijn van Exel wrote:
> On Tue, Mar 6, 2012 at 7:42 AM, Jochen Topf <joc...@remote.org> wrote:
>
> > On Tue, Mar 06, 2012 at 01:31:31PM +0100, npl wrote:
> > > TagInfo-like systems (aggregating big data and creating statistics)
> > > could definitely be built using hadoop/mapreduce.
> >
> > Or you can do what Taginfo does and just write it cleverly, so it just
> > uses one host for an hour instead of 10 hosts for several hours. :-)
> > I do agree that there are many use-cases for Hadoop & Co. But they
> > also create a lot of overhead...
>
> Just one host for an hour a day? Wow. That's not very much processing time
> at all for what it provides. Awesome.
I just had a look, and currently it's at about 2h for the main statistics
generation. So that's gone up from the 1h it used to take. That's because
people keep asking for more features. :-) It takes about another hour for
crawling the wiki etc. Most days it takes about another 1.5h for updating
the planet files, but on some days that's considerably slower. I should
probably try to figure out why that's the case. Maybe something else runs
in parallel on the machine.

All of that on an 800MHz machine using about 6GB RAM. The OSM processing is
mostly CPU bound, so on a modern machine it would be faster.

One relatively easy optimization would be to run the planet update and
statistics gathering in one step. But for now I am lazy and let Osmosis do
the planet update first.

Jochen
-- 
Jochen Topf  joc...@remote.org  http://www.remote.org/jochen/  +49-721-388298