The amusing recent FakeSteveC post (I guess I will call it a LOLSCALE) got me 
thinking about what people actually think of the board's comment on scaling:

http://fakestevec.blogspot.com/2011/01/know-your-osm-memes-2.html

As much as I want a dialogue with my fake self, I think a discourse on the 
thrust of the argument is merited.

I think scaling is the number one issue OSM should tackle technically.

The days of just 'buy a bigger database server' are, I think, over. It's not 
very elegant and it's just too damn expensive. Perhaps we could do another 
iteration, but if OSM bandwidth continues to outpace Moore's law and donations 
then it just doesn't work.

So that means scaling horizontally to more than one machine. And if you're 
doing that, you may as well do more than 2 machines, or more than 20, or 
whatever figure you have in your head.

I think this is number one because I think the amount of data OSM is going to 
have to deal with is going to explode on a fairly short timescale. I don't 
mean just another big import. Sadly I can't be public, but I had a conversation 
over a year ago with a large company (no, it's not MS or CM) who speculated 
about putting OSM on the front page of their maps product, which would 
turn roughly all of our yearly statistics into daily or weekly numbers. We 
went through a decision tree about how that could happen. Every leaf node on 
that tree came back as, basically, "we couldn't do it".

Could we accept the edit traffic? No, far too much. Could we provide a good 
user experience? Clearly not. Could they help us scale? No, they would be 
viewed as taking over on any kind of timescale they needed. Could they host us? 
Again no, it would be too slow a process, it would be a takeover, and the 
community would probably reject it.

I could continue, but you can imagine the basic direction. Imagine you had 
millions of daily users and you wanted to use OSM in a respectable, 
community-driven way. And let's say you get over the 4chan rhetoric 
over on t...@. If you think it through, within any reasonable time frame (like 
6-12 months) it's very hard to make that happen, and so you may as well go 
build your own thing. Which I think sucks and is a loss for OSM.

Now this conversation has come up a few more times recently with other large 
mapping companies, and I feel like I'm rehashing the conversations above. I'd 
love to be public about it, but those companies aren't ready to talk yet.

Even if people weren't privately proposing notching up our traffic a few orders 
of magnitude, it would still make a lot of sense to figure out how to scale.

Back to FakeSteveC and the negative eye-rolling comment on thinking about this 
for a few seconds. Well, it turns out we have. The board deliberately didn't 
list any technical measures; that's not its job. But the direction 
of supporting and encouraging basic things like scaling is, I think, well 
within its bounds.

I haven't a clue what we should use to scale horizontally. There are a few 
major architectural choices and then within those there are lots of 
implementations. Some are too new and buggy, some are in the wrong language ... 
it's clearly a bit of a mess out there right now. There are also a bunch of 
religious beliefs around how you do this stuff too.

So, how do we get from here to there? Speaking strictly personally, I think one 
of the best uses of funds in or out of OSM has been bug bounties. I think 
putting up some bounties for demoing either architectures or 
implementations is a good idea, because we all know it comes down to working 
code. Something like "$1,000 to the first person who demonstrates OSM's DB 
running on more than one machine", then another $1,000 for proving it can 
handle a certain throughput, and so on, is one way to get there. That's the way 
I'd personally like to encourage it to happen, but it has neither been agreed 
by the board nor is it something MS is immediately going to do. It's just an 
idea, and one that I like.

There is clearly a lot of work to do just fleshing out options and trying 
things.

There is an alternative, which is to just give up on scaling. That works, but 
it means OSM fractures into multiple datasets, and I envisage OSM becoming the 
Debian of maps and someone else (there are several candidates) becoming the 
Canonical or Ubuntu. I don't much like that scenario, but it's there as a 
possibility.

So, what do you think? And if you agree it's worth doing, how do we achieve it, 
whether as individuals, as the board, or as companies supporting it?

PS: if it looks weird that I respond to certain emails and not others, 
that's because messages to, from or cc'ing some of the trolls are automatically 
deleted and I don't see them. So even if you just cc them, I won't see your 
email. I highly recommend doing this.

Steve

stevecoast.com
_______________________________________________
dev mailing list
[email protected]
http://lists.openstreetmap.org/listinfo/dev
