On Tue, Mar 06, 2012 at 10:44:45AM -0600, Martijn van Exel wrote:
> On Tue, Mar 6, 2012 at 7:42 AM, Jochen Topf <joc...@remote.org> wrote:
> 
> > On Tue, Mar 06, 2012 at 01:31:31PM +0100, npl wrote:
> > > TagInfo-like systems (aggregating big data and creating statistics)
> > > could definitely be built using hadoop/mapreduce.
> >
> > Or you can do what Taginfo does and just write it cleverly, so it just
> > uses one host for an hour instead of 10 hosts for several hours. :-) I do
> > agree that there are many use-cases for Hadoop & Co. But they also create
> > a lot of overhead...
> >
> >
> Just one host for an hour a day? Wow. That's not very much processing time
> at all for what it provides. Awesome.

I just had a look and currently it's at about 2h for the main statistics
generation. So that's gone up from the 1h it used to take. That's because
people keep asking for more features. :-)

It takes about another hour for crawling the wiki etc.

Most days it takes about another 1.5h for updating the planet files, but on
some days that's considerably slower. I should probably try to figure out why
that's the case. Maybe something else runs in parallel on the machine.

All of that on an 800MHz machine using about 6GB RAM. The OSM processing is
mostly CPU-bound, so on a modern machine it would be faster. One relatively
easy optimization would be to run the planet update and statistics gathering
in one step. But for now I am lazy and let Osmosis do the planet update
first.
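
To illustrate what "in one step" could mean, here is a rough Python sketch.
It is not how Taginfo or Osmosis actually work; the data layout (objects as
(id, tags) pairs, changes as a dict where None means "deleted") and the
function names are made up for the example. The point is only that the
updated objects are streamed once, and each one is both written out and
counted in the same pass, instead of writing the planet and reading it again.

```python
from collections import Counter

def apply_changes(planet, changes):
    """Yield the updated objects.

    planet:  iterable of (object_id, tags_dict) pairs (hypothetical layout)
    changes: dict of object_id -> new tags_dict, or None for a deletion
    """
    seen = set()
    for obj_id, tags in planet:
        if obj_id in changes:
            seen.add(obj_id)
            new_tags = changes[obj_id]
            if new_tags is None:
                continue  # object was deleted
            yield obj_id, new_tags
        else:
            yield obj_id, tags
    # Objects that appear only in the changes are newly created.
    for obj_id, new_tags in changes.items():
        if obj_id not in seen and new_tags is not None:
            yield obj_id, new_tags

def update_and_count(planet, changes, out):
    """Write the updated planet to `out` and count tag keys in one pass."""
    key_counts = Counter()
    for obj_id, tags in apply_changes(planet, changes):
        out.append((obj_id, tags))      # the "planet update" part
        key_counts.update(tags.keys())  # the "statistics" part
    return key_counts

planet = [
    (1, {"highway": "residential"}),
    (2, {"name": "x"}),
    (3, {"highway": "primary", "name": "y"}),
]
changes = {2: None, 3: {"highway": "secondary"}, 4: {"building": "yes"}}
updated = []
counts = update_and_count(planet, changes, updated)
```

Whether this would save much depends on how much of the time goes into
reading and writing the planet file compared to the actual counting.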

Jochen
-- 
Jochen Topf  joc...@remote.org  http://www.remote.org/jochen/  +49-721-388298


_______________________________________________
dev mailing list
dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/dev
