Re: [OSM-dev] New database server
Andy,

And what I can tell you is that people in OSM (I mean admins here) are very supportive and open to changes.

I'm glad to hear that! Many people say they have a different experience, but we try to be helpful.

Well, I don't know about other people, but I got everything I needed (and much more than I imagined!) in terms of hardware, day-to-day support etc. In fact, I am a bit ashamed that, given all the admin support, I still have not managed to finish what I set out to accomplish! :/ So yeah, I can say that again - for people who want to do stuff, the sky is the limit in OSM ;-)

Sadly, I simply don't have enough time to finish it to the extent that it is acceptable (to me, not to mention others/admins) to consider it for production use.

I think many of our software projects have similar issues, and one of the key aspects is to try to get more than one volunteer to develop each one. I'm currently working on improving the documentation for the rails port (with a view to making it easier for new developers to get started).

That's a great initiative. Honestly, I have no idea where best to invest time to raise the probability of attracting new contributors. At the time when I was briefly attending the EWG weekly sessions I thought about this a lot. What makes people decide they want to spend their free time on a project? I reached the crazy conclusion that documentation may not even be a blocking issue - if someone really wants to contribute, they will find their way. Where to get more people like that? No idea. Perhaps the project itself needs to be more sexy? I know that for some developers (myself included) end users are one of the most important aspects - i.e. they would rather develop for a project that is actually used in the wild (of course I'm not saying that OSM isn't used, but certainly there are some things to be improved?).

When I finish with that, do you think that your OWL project could benefit from something similar? Or are there other things that you think are holding back other developers from getting started?

Ehh... OWL is a tricky topic. In the past few months there were a couple of people who seemed interested in contributing, but I think they got stuck at the setup stage and/or at understanding the code. I really need to try to simplify some of the implementation, document it better etc. Perhaps I should consider narrowing the scope of OWL. Right now I think I may be trying to tackle too many issues at once - and that's why the amount of code and its complexity is growing, and also why I am not able to deliver anything. Currently I am thinking about making OWL a bit leaner and perhaps separating some of the stuff (like vector tiles and integration with client-side rendering) into other (future) projects.

I think in OSM we will (slowly but surely!) get to the tipping point and more contributors will come.

Paweł
___
dev mailing list
dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/dev
Re: [OSM-dev] New database server
On Mon, May 20, 2013 at 09:32:30PM -0400, Jason Remillard wrote:

Hi, schema). This would allow the site to scale more incrementally, and potentially scale to larger loads than putting all of our eggs into two monster servers. For the money we are planning on spending on the big server, we could get several of these smaller edge servers with flash disks and a less expensive redundant write/history database server. As we need to scale, we can do it in 3,000 dollar increments rather than 40,000 dollar increments. Having a server with 29 disks does not seem like a good situation. Just a thought.

Without being involved in the server issues, my guess is that the problem here is not the number of API calls but the working set. We are talking about a multi-terabyte working set. Grabbing data out of this working set is a very tedious task. You might want to grab 10 bytes at the front, 20 in the middle and 60 at the end. For this you need to walk through multiple gigabytes of indexes and move your disk heads something like 1750 times. Leaving SSDs aside, the number of spindles is the concurrency you can get from this. The more heads, the more concurrent accesses. SSDs will help accelerate indexes, but today they are not a solution for the full database when comparing cost in € or $.

Flo
--
Florian Lohoff f...@zz.de
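The spindle argument in this message can be put into rough numbers. The sketch below is only an illustration: the 1750 head movements come from the message and the 29 disks from the quoted proposal, while the 8 ms average seek time and the simplification that one request keeps one spindle busy at a time are assumptions, not measurements of the OSM servers.

```python
def request_latency_s(seeks: int, seek_ms: float) -> float:
    """Time one request spends on disk if its seeks happen one after another."""
    return seeks * seek_ms / 1000.0


def max_requests_per_s(spindles: int, seeks: int, seek_ms: float) -> float:
    """Upper bound on throughput when each spindle serves one request at a time."""
    if spindles < 1 or seeks < 1:
        raise ValueError("spindles and seeks must both be at least 1")
    return spindles / request_latency_s(seeks, seek_ms)


if __name__ == "__main__":
    SEEK_MS = 8.0  # assumed average seek time for a spinning disk
    print("latency per request (s):", request_latency_s(1750, SEEK_MS))
    for disks in (1, 29):
        rate = max_requests_per_s(disks, 1750, SEEK_MS)
        print(disks, "spindle(s):", round(rate, 2), "requests/s at most")
```

Under these assumptions the per-request latency stays the same however many disks there are; only the number of requests that can be in flight at once grows with the spindle count, which is the point being made about concurrency.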
Re: [OSM-dev] New database server
On 21 May 2013 02:32, Jason Remillard remillard.ja...@gmail.com wrote:

The server that we are planning on purchasing is a monster. Very complicated and expensive. I am concerned that this might not be the best way to go.

Indeed, it might not be the best way to go, and any thoughts and brainpower applied to the problem are always very welcome. Of course, at OWG we *do* think it's the best approach, given all the trade-offs involved, and it's a decision that hasn't been undertaken lightly.

We have a google summer of code proposal to write an edge proxy server for the OSM API. I don't know if the project will be accepted, but it

It's worth bearing in mind that we don't have any places on this year's GSoC, and even if OSGeo decided to go for this project on our behalf, it could be months before we even have any idea whether or not the implementation is feasible. I don't mean to be negative, but weighing up the here-and-now requirements against a hypothetical alternative at some point in the future is one of the trade-offs that we routinely have to make at OWG.

For the money we are planning on spending on the big server, we could get several of these smaller edge servers with flash disks and a less expensive redundant write/history database server.

Well, we could certainly spend the money on small edge servers, but it's not clear to me why you think that would make the central server less expensive. I think this proposal may be worthwhile but it's somewhat orthogonal to the goals of the new server.

At the moment we have two osm-database-class machines, the older of which (smaug) is no longer capable of handling the full load on its own, but is still useful as a database-level read slave. The newer machine (ramoth) can handle the load entirely on its own, but is approaching the limits of dealing with the full read+write load.

When it comes to the master database, we need certain characteristics:

A) To be able to handle the write load (and the associated reads involved in checking constraints etc)
B) To be able to store the entire database
C) To be more than one of these machines, for failover

Smaug most likely doesn't fulfil A, and so currently we don't really fulfil C. So we need a new machine that can do A+B, and these are unfortunately expensive. In order to last more than 6 months, the new machine also needs plenty of space (B) on fast disks (A), which is where most of the money goes.

Having map-call-proxies, as you discuss, doesn't solve any of A, B or C for the master database. Sharing out the read-only load is a good idea, but it's not clear to me whether it is better done with postgres-level database replication (as we have been doing), proxy-level replication (as per this GSoC idea), or even just examining logs and ban-hammering people scraping the map call (my personal favourite!).

As we need to scale

It's best in these conversations to be precise about what we mean by "to scale". Scaling read-requests is only one aspect, and we have a variety of feasible (and active) options. Long-term, we may[1] need to work around the need for all these machines to store the entire database (B), and that's Hard. We may[2] also need to figure out how to solve A, and that's Hard too.

Like I said at the start, thoughts and brainpower are always welcome!

Cheers,
Andy

[1] If we grow osm faster than Moore's law, otherwise: happy days
[2] If db-write activity outpaces disk-io and/or network bandwidth increases, otherwise: happy days
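The reasoning about requirements A, B and C in this message can be written down as a small check. This is only a sketch of the argument, not anything the OWG runs: the `Machine` type and the function names are made up for illustration, the flags for smaug and ramoth encode the message's own (hedged) assessment, and "new-server" is a placeholder name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Machine:
    name: str
    handles_write_load: bool    # requirement A
    stores_full_database: bool  # requirement B


def can_be_master(machine: Machine) -> bool:
    """A machine can act as master only if it satisfies both A and B."""
    return machine.handles_write_load and machine.stores_full_database


def has_failover(machines: list[Machine]) -> bool:
    """Requirement C: at least two machines that could each act as master."""
    return sum(can_be_master(m) for m in machines) >= 2


if __name__ == "__main__":
    # Assumed flags: smaug "most likely" fails A; ramoth satisfies A and B.
    smaug = Machine("smaug", handles_write_load=False, stores_full_database=True)
    ramoth = Machine("ramoth", handles_write_load=True, stores_full_database=True)
    new = Machine("new-server", handles_write_load=True, stores_full_database=True)

    print("today:", has_failover([smaug, ramoth]))
    print("with a new A+B machine:", has_failover([smaug, ramoth, new]))
```

Adding read-only proxies does not change either result, because they never count towards `can_be_master`; that is the sense in which the proxy proposal is orthogonal to the new server.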
Re: [OSM-dev] New database server
Jason,

You're not at all wrong about the issues with the server design. This is something that's been well known and understood for several years: as the project grows, the cost of scaling on a single system will not scale accordingly. What I mean by that is that the price of a single machine does not grow linearly with the capacity you get from it. So if you are growing, it makes more economic (and technical) sense to scale out rather than to build up.

What would this mean in the context of OSM? It might mean something like moving the GPX data off the main database. Or maybe having historical data on a slower database than the current data. It also includes things like aggressive caching and using tiled map calls (something that Ian and I worked on, and Ian has a new implementation of). And there's room for more optimizations even then, but just these would make an impact.

So why doesn't this happen? Frankly, because I think the project doesn't have anyone who can act in the kind of technical leadership role this would require. Making these kinds of changes would require modifying (and testing) the rails port, as well as possibly modifying cgimap (depending on which calls were affected), and the database, and setting up the new hardware, and coordinating with whatever hosting situation that would be in, etc.

It's not something just anyone can do, with the possible exception of the sys-admins (who are both extremely overworked and volunteers). This is why the org needs a structural change, to give someone the authority and resources to oversee projects like this. Without this, the OWG is stuck ordering more hardware.

- Serge
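To make the "tiled map calls" idea a little more concrete, here is a minimal sketch of one way it could work: snap an arbitrary bounding box to a fixed grid so that every request is made up of a small set of repeatable, cacheable tile requests. This is not the implementation mentioned in the message; the grid size, the function names and the use of plain degrees are all assumptions made for illustration.

```python
import math


def covering_tiles(min_lon, min_lat, max_lon, max_lat, tiles_per_degree=10):
    """Return the (x, y) indices of the fixed-grid tiles that cover a bounding box."""
    if min_lon > max_lon or min_lat > max_lat:
        raise ValueError("bounding box corners are in the wrong order")
    x0 = math.floor(min_lon * tiles_per_degree)
    x1 = math.floor(max_lon * tiles_per_degree)
    y0 = math.floor(min_lat * tiles_per_degree)
    y1 = math.floor(max_lat * tiles_per_degree)
    return [(x, y) for x in range(x0, x1 + 1) for y in range(y0, y1 + 1)]


def tile_bbox(x, y, tiles_per_degree=10):
    """Bounding box (min_lon, min_lat, max_lon, max_lat) of a single grid tile."""
    return (
        x / tiles_per_degree,
        y / tiles_per_degree,
        (x + 1) / tiles_per_degree,
        (y + 1) / tiles_per_degree,
    )


if __name__ == "__main__":
    # Two slightly different client requests end up asking for the same tiles,
    # so a cache keyed on (x, y) can answer the second one without the database.
    first = covering_tiles(-0.15, 51.45, 0.05, 51.55)
    second = covering_tiles(-0.14, 51.46, 0.04, 51.54)
    print(first == second, len(first))
    print(tile_bbox(*first[0]))
```

The design choice is that the cache key depends only on the tile index, never on the exact box the client asked for, which is what makes aggressive caching possible; the cost is that each response covers somewhat more area than was requested.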
Re: [OSM-dev] New database server
Serge Wroclawski wrote:

So why doesn't this happen? Frankly, because I think the project doesn't have anyone who can act in the kind of technical leadership role this would require.

Define "can". The project has plenty of people capable of doing this. But IME the main barrier between "capable of doing" and "actually doing" is the amount of shit you have to suffer from armchair experts whenever you try to actually do anything in OSM.

Richard
Re: [OSM-dev] New database server
Hi,

On 05/21/13 16:08, Serge Wroclawski wrote:

So why doesn't this happen? Frankly, because I think the project doesn't have anyone who can act in the kind of technical leadership role this would require.

Knee-jerk call for authority? Never worked well.

It's not something anyone can do, with the possible exception of the sys-admins (who are both extremely overworked and volunteer).

It is not something that someone can do *on their own*, and I think that is not so bad. To make a change in OSM you need to work together with others, make an argument for your vision over a longer time, get buy-in from a larger group of people, and eventually things will move. I wouldn't want to sacrifice that careful, evolutionary process in exchange for a visionary leader where we all just do what he says.

This is why the org needs a structural change, to give someone the authority and resources to oversee projects like this.

I don't think we want or need authority figures who are somehow exempt from having to *convince* us that their idea is good.

Bye
Frederik
--
Frederik Ramm ## eMail frede...@remote.org ## N49°00'09 E008°23'33
Re: [OSM-dev] New database server
I normally stay out of the tech bike-shedding discussions, however I do want to point out:

- we are aeons away from requiring and running cutting/bleeding edge hardware (and having to pay for such)
- in the grand scheme of things we are not spending a lot of money on hardware (on the one hand our sys admins and the OWG are very frugal, and on the other see the first point)
- the amount of money we spend is a lot of money for the foundation, at least relative to our other spending, however it is extremely unlikely that we could get away with spending less regardless of implementation (distributed, 3rd party cloud etc etc etc)
- our current setup is fairly straightforward; fancier schemes are very likely to be more error prone, with the associated costs (manpower)

All this said, I would recommend that anybody who actually wants to help should participate in the OWG and help with the other tech tasks that we have in abundance.

Simon
Re: [OSM-dev] New database server
Can someone add more information about when and how the OWG meets? The wiki page is pretty bare on how one might help: http://www.osmfoundation.org/wiki/Operations_Working_Group

On Tue, May 21, 2013 at 10:07 AM, Simon Poole si...@poole.ch wrote:

[...] All this said, I would recommend that anybody who actually wants to help should participate in the OWG and help with the other tech tasks that we have in abundance.
Re: [OSM-dev] New database server
On 05/21/2013 04:08 PM, Serge Wroclawski wrote:

This is why the org needs a structural change, to give someone the authority and resources to oversee projects like this. Without this, the OWG is stuck ordering more hardware.

Reluctant +1 from me, though I am sure this is not a popular view (we don't want authority and all that)...

I can only speak from my own experience of what I was/am trying to do with OWL/history tab/whatever you call it - in itself not a trivial task, but on the other hand the challenges with the overall architecture (both logical and physical) are at least at the same level of complexity (if not above!).

And what I can tell you is that people in OSM (I mean admins here) are very supportive and open to changes. What I love about this project is that I built something and had/have the opportunity to get it into production. Sadly, I simply don't have enough time to finish it to the extent that it is acceptable (to me, not to mention others/admins) to consider it for production use.

So I think your statement is 100% true, but I think the main problem is that there are no people and/or no money to do this. Imagine you were the person who needed/wanted to do this work. It most definitely *is* a full-time job for at least a few people - a service like OSM doesn't run on rainbows and good wishes. And *changing* how it runs, i.e. changing the fundamental architecture or introducing some kind of load balancing or whatever, is an *enormous* task.

In my opinion it is clear that this kind of work will not be done by volunteers - there is just too much interconnected stuff that needs to be handled properly, not to mention the operations around all of it. How to handle that - I have no idea... other than paying people - that would be the obvious (and IMHO surefire) solution. Of course the other question is whether there even are people (among the current admins ideally?) who are willing to sacrifice (part of) their professional careers to work for some time on OSM - not sure.

But I think at the stage and level of complexity that OSM has reached right now there is only one solution - as you said - structural change. I would put it more bluntly - the right people + money for their time is the only way forward.

Paweł
Re: [OSM-dev] New database server
On 21 May 2013 18:50, Paweł Paprota ppa...@fastmail.fm wrote:

And what I can tell you is that people in OSM (I mean admins here) are very supportive and open to changes.

I'm glad to hear that! Many people say they have a different experience, but we try to be helpful.

Sadly, I simply don't have enough time to finish it to the extent that it is acceptable (to me, not to mention others/admins) to consider it for production use.

I think many of our software projects have similar issues, and one of the key aspects is to try to get more than one volunteer to develop each one.

I'm currently working on improving the documentation for the rails port (with a view to making it easier for new developers to get started). When I finish with that, do you think that your OWL project could benefit from something similar? Or are there other things that you think are holding back other developers from getting started?

Cheers,
Andy
Re: [OSM-dev] New database server
Hi Andy,

Thank you for the detailed reply.

I have no issues with how the money is being handled, nor with spending this kind of money on hardware.

OSM growth is pretty amazing right now. Over the lifetime of this server we should be planning for at least 15x/20x more traffic. If this server can handle 15x/20x more traffic than the current site generates, then we are good, case closed. If we can't get to 15x/20x traffic levels without large architectural changes, we should start steering ourselves towards a new architecture now. Standard proxy servers will not work well with our API, and it will be painful if we are not proactive about it. It seems like some kind of geographically distributed caching edge server will eventually be needed. The OWG obviously knows all of this.

Really, the bottom line of my email is as follows: the Google Summer of Code project (if it moves forward) will not be that hard to integrate into the current server infrastructure, and it has a reasonable chance of being stood up by this fall. I just wanted to ensure that the OWG was aware of the project and considered it in the resource planning for our new servers. Concretely, this means we have the option of configuring the new database server for only history and write requests rather than all of the API.

Thanks
Jason.

On Tue, May 21, 2013 at 5:49 AM, Andy Allan gravityst...@gmail.com wrote:

[...]
[OSM-dev] New database server
Hi,

I could not find a discussion on the new database server.

http://wiki.openstreetmap.org/wiki/New_server_and_fund_raising_drive_2013

The server that we are planning on purchasing is a monster. Very complicated and expensive. I am concerned that this might not be the best way to go.

We have a google summer of code proposal to write an edge proxy server for the OSM API. I don't know if the project will be accepted, but it has got me thinking about the approach and our funding drive.

The idea is that each front-facing server has a local snapshot copy of the OSM database to service all of the read-only calls. These edge servers could be geographically distributed. It would just leave the central database server to deal with write requests, history requests, and diffs (anything that can't be handled with a snapshot database schema). This would allow the site to scale more incrementally, and potentially scale to larger loads than putting all of our eggs into two monster servers.

For the money we are planning on spending on the big server, we could get several of these smaller edge servers with flash disks and a less expensive redundant write/history database server. As we need to scale, we can do it in 3,000 dollar increments rather than 40,000 dollar increments. Having a server with 29 disks does not seem like a good situation.

Just a thought.

Thanks
Jason.
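The split described in this message (read-only calls served from an edge snapshot, everything else sent to the central database) can be sketched as a simple routing rule. This is a hypothetical illustration, not the GSoC proposal's design: the `route` function and the "edge"/"central" labels are invented here, the example paths merely follow the style of OSM API 0.6 URLs, and treating only a trailing `/history` segment as a history request is a deliberate simplification.

```python
from urllib.parse import urlsplit


def route(method: str, url: str) -> str:
    """Decide whether a request can be served by an edge snapshot or needs the central server."""
    # Anything that modifies data has to reach the central database.
    if method.upper() != "GET":
        return "central"
    segments = urlsplit(url).path.rstrip("/").split("/")
    # A snapshot holds only current data, so history requests go to the centre.
    if segments[-1] == "history":
        return "central"
    return "edge"


if __name__ == "__main__":
    examples = [
        ("GET", "/api/0.6/map?bbox=-0.2,51.4,-0.1,51.5"),
        ("GET", "/api/0.6/node/123"),
        ("GET", "/api/0.6/way/123/history"),
        ("PUT", "/api/0.6/node/create"),
    ]
    for method, url in examples:
        print(method, url, "->", route(method, url))
```

A real proxy would also have to decide how stale a snapshot may be and how diffs reach the edge servers; the sketch leaves both questions open.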