Hi Scott,

Scott Shawcroft wrote:
> Stefan de Konink wrote:
>> - Admins don't want to maintain multiple systems
>> - The fear of anything new not developed by the devs (especially if it
>> is not built in Ruby)
> Who are the admins for the systems?

Tom Hughes is a factor to take into account. All your base...

> We're open to particular solutions
> and if there is a bias towards Ruby we'd look closer at it. However, it
> may be that there is a better solution. Who are the designated devs?

That is basically a 'free for all'. Read the history of SVN and/or this
list to find out which people are working on OSM. Personally I am working
on a C implementation of the API. Other people tend to work on the
official RubyOnRails one.

> Also, Amazon WebServices could be used to have virtual machines instead
> of real ones which need maintenance.

If Amazon wants to sponsor OSM, that is a great thing ;)

>> Technical problems might be more interesting:
>>
>> - Synchronization issues, even for a proxy solution; single or
>> multiple write databases should distribute their data. Out of sync
>> scenarios etc.
>> - Especially geo related issues, how to distribute a real geoquery.
> Totally, synchronization is important. Simple partitioning wouldn't
> have this problem but if multiple copies will be shared then we could
> get into trouble.
>
> I think the geo element is what makes this more interesting than the
> standard data storage issue.

The main point is that OSM by design is not a GIS database. We can make
it one, but the current features approach the dataset in a 'traditional'
way. This is not bad per se, though some problems would benefit from GIS
solutions.

>>> We're interested in trying our hand at creating a better system for
>>> storing OSM data. We're interested in what kind of computing
>>> resources to design for (how many machines) and whether we can get
>>> access logs in order to test our implementation against.
>>
>> Related to accesslogs I found a long brick wall, it might be a better
>> thing to use a requester that just makes random requests. Sources are
>> available for that.
> Well, randomness is probably not the best model. I imagine that the
> server's traffic patterns are also geo related. For example, people are
> more likely to work on areas they are near and areas on the earth in
> daylight or evening are more likely to have those people accessing the
> site. Or perhaps a mapping party has a number of people working on the
> same area all at once. A simple geo partitioning would drive all of
> this traffic to one particular server. This simple access does work
> better when retrieving data because it will utilize all the different
> machines.

Like Erik pointed out, diffs will give you writes. I think reads are
more interesting.

>>> Also, we'd love to have OSM community members involved since we're
>>> new to the organization.
>>>
>>> Lastly, I think we plan to donate our code to the community with the
>>> hope that it is useful.
>>>
>>> What do you think?
>>
>> I love to brainstorm with you :) The next month I want to spend on my
>> MSc thesis about improving native geospatial support in MonetDB. And
>> the OSM data in it. It would of course be great if the ideas coming
>> out of such session can make it to State of the Map 2009.
>>
>> It would be good to point you at DBslayer (the standard implementation
>> or the Cherokee one), it will balance requests but with a better
>> balancer could do geobalancing too :)
> I'll have to take a look at it. Existing solutions are good but we are
> really looking at laying down some code too I think.

Creating, for example, a specific SQL-based scheduler that can handle
partitions is something I was thinking about during the night:

http://code.google.com/p/cherokee/issues/detail?id=328


Stefan

_______________________________________________
dev mailing list
dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/dev