On Jan 9, 2011, at 3:31 PM, Nic Roets wrote:

> Hello Steve,
>
> I have ideas for a super fast API mirror with the capability of
> answering historic queries i.e. contents of bbox l,t,r,b at time h.
> But it can't be the main server because it ignores changesets and
> other info.
>
> If we moved all the load that is not directly related to editing to
> such a server, would it make a massive difference ? Get rid of all the
> scraping and spidering ?
What you're effectively saying is move certain things to something that looks like xapi?

>
> http://www.google.co.za/search?q=edna+site%3Awww.openstreetmap.org&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a
>
> On Mon, Jan 10, 2011 at 12:27 AM, Igor Brejc <[email protected]> wrote:
>> I'll play a heretic here, but my feeling is that "openness" in OSM will more
>> and more come under question, and the reason is scaling. Yes, OSM can
>> proclaim the access to its data is open, but in reality only someone (or
>> better some organization/company) with enough HW resources to be able to
>> process planetary OSM data can actually make use of it.
>>
>> In reality most potential users of OSM data don't really need global data,
>> they want easy access to OSM data that's local to them. And that's what OSM
>> infrastructure does not provide. XAPI and CloudMade and Geofabrik extracts
>> are just poor workarounds (that's not to say that they aren't valuable).
>>
>> A simple example: if I need OSM data for European highways, why do I need to
>> process the whole planet file? Europe is physically separated from America
>> and I see very few reasons for having to share OSM data across continents in
>> a single planetary file.
>>
>> Separating data in vertical layers could help too: country borders certainly
>> belong to a completely different level than, say, park benches. And they
>> change a bit less often. Why keep them in the same store?
>>
>> Igor
>>
>> On 9.1.2011 22:38, SteveC wrote:
>>>
>>> The amusing recent FakeSteveC ... I guess I will call it a LOLSCALE got me
>>> thinking about what people actually think of the board's comment on scaling;
>>>
>>> http://fakestevec.blogspot.com/2011/01/know-your-osm-memes-2.html
>>>
>>> As much as I want a dialogue with my fake self, a discourse on the thrust
>>> of the argument is I think merited.
>>>
>>> I think scaling is the number one issue OSM should tackle technically.
>>>
>>> The days of just 'buy a bigger database server' are I think over. It's not
>>> very elegant and it's just too damn expensive. Perhaps we could do another
>>> iteration, but if OSM bandwidth continues to outpace Moore's law and
>>> donations then it just doesn't work.
>>>
>>> So that means scaling horizontally to more than one machine. And if you're
>>> doing that, you may as well do more than 2 machines, or more than 20, or
>>> whatever figure you have in your head.
>>>
>>> I think this is number one because I think the amount of data OSM is going
>>> to have to deal with is going to explode in a fairly short time scale. I
>>> don't mean just another big import. Sadly I can't be public but I had a
>>> conversation with a large company over a year ago (no, it's not MS or CM)
>>> who speculated about putting OSM on the front page of their maps product,
>>> which would approximately turn all of our yearly statistics to daily or
>>> weekly numbers. We went through a decision tree about how that could happen.
>>> Every leaf node on that tree came back as basically we couldn't do it.
>>>
>>> Could we accept the edit traffic? No, far too much. Could we provide a
>>> good user experience? Clearly no. Could they help us scale? No, they would be
>>> viewed as taking over on any kind of timescale they needed. Could they host
>>> us? Again no, it would be too slow of a process and it'd be a takeover and
>>> the community would probably reject it.
>>>
>>> I could continue, but the basic direction you can imagine. Imagine you had
>>> millions of daily users and you wanted to use OSM in a respectable,
>>> community-driven way. And let's say you get over the 4chan
>>> rhetoric over on t...@. If you think through it, within any reasonable time
>>> frame (like 6-12 months) it's very hard to make that happen, and so you may
>>> as well go build your own things. Which I think sucks and is a loss for OSM.
>>>
>>> Now this conversation has come up a few more times recently with other
>>> large mapping companies. And I feel like I'm rehashing those conversations
>>> above. I'd love to be public about it, but those companies aren't ready to
>>> talk yet.
>>>
>>> Even if people weren't privately proposing notching up our traffic a few
>>> orders of magnitude, it would still make a lot of sense to figure out how to
>>> scale.
>>>
>>> Back to FakeSteveC and the negative eye-rolling comment on thinking about
>>> this for a few seconds. Well it turns out we have. The board specifically
>>> didn't list any technical measure on purpose, that's not its job. But the
>>> direction of supporting and encouraging basic things like scaling is I think
>>> well within the bounds.
>>>
>>> I haven't a clue what we should use to scale horizontally. There are a few
>>> major architectural choices and then within those there are lots of
>>> implementations. Some are too new and buggy, some are in the wrong language
>>> ... it's clearly a bit of a mess out there right now. There are also a bunch
>>> of religious beliefs around how you do this stuff too.
>>>
>>> So, how do we get from here to there? Speaking strictly personally, I
>>> think one of the best uses of funds in or out of OSM has been bug bounties.
>>> Personally, I think putting up some bounties on demoing either architectures
>>> or implementations is a good idea, because we all know it comes down to
>>> working code. Something like "$1,000 to the first person who demonstrates
>>> OSM's DB running on more than one machine" then another $1,000 for proving it
>>> can handle a certain throughput and so on is one way to get there. That's
>>> the way personally I'd like to encourage it to happen, but that's neither
>>> been agreed by the board nor something MS is immediately going to do. It's
>>> just an idea and one that I like.
>>>
>>> There is clearly a lot of work to do just fleshing out options and trying
>>> things.
>>>
>>> There is an alternative, which is to just give up on scaling. That works,
>>> but it means OSM fractures into multiple datasets and I envisage OSM
>>> becoming the Debian of maps and someone else (there are several candidates)
>>> becoming the Canonical or Ubuntu. I don't much like that scenario, but it's
>>> there as a possibility.
>>>
>>> So, what do you think? And if you agree it's worth doing, how do we
>>> achieve it either as individuals or the board or companies supporting it?
>>>
>>> PS if it looks weird that I respond to certain emails and not others then
>>> that's because messages to, from or cc some of the trolls are automatically
>>> deleted and I don't see them. So even if you just cc them, I won't see your
>>> email. I highly recommend doing this.
>>>
>>> Steve
>>>
>>> stevecoast.com
>>> _______________________________________________
>>> dev mailing list
>>> [email protected]
>>> http://lists.openstreetmap.org/listinfo/dev
>>
>>
>> --
>> http://igorbrejc.net
>>
>

Steve

stevecoast.com

