Great, this is a good discussion. I've put up a wiki page with some of the
things we've covered, with pros/cons. I hope we can continue to talk about
our approaches and, as we optimize for different problems, post some of
what we learn back up here: http://code.google.com/p/cartagen/wiki/FeatureTradeoff
I put in what I could gather about Temap, but feel free to update and add
more pros and cons... this is just my thought process so far. We might
also add a "status" column so we can annotate what we learn from each
approach.

Best,
Jeff

On Fri, May 8, 2009 at 3:00 PM, Tels <[email protected]> wrote:
> Moin,
>
> On Friday 08 May 2009 20:04:48 you wrote:
> > > * The proxy receives XML from the api or xapi server. Currently it
> > >   requests the full dataset.
> > > * Then it removes unnecessary tags (like note, fixme, attribution
> > >   and a whole bunch of others that are not needed for rendering).
> > >   Some of them are very minor, but 10000 nodes with
> > >   "attribution=veryvery long string here" can make up something
> > >   like 40% of all the data, and just clog the line and browser :)
> >
> > Yes, I'm thinking of trying to cache locally but still request
> > changesets if the ?live=true tag is set... caching locally is great
> > for more static data, but for the live viewer I'm trying not to use
> > caching, but to increase the efficiency of the requests.
>
> I fear loading data live from the API server is just not feasible,
> unless you:
>
> * only load diffs (minute-diffs?) and use them to update the data
>   already cached at the proxy. OTOH I read that importing a one-hour
>   diff into a postgres database can take 40..70 minutes, so depending
>   on load you might not even manage to update your DB with the diffs
>   fast enough...
> * invent an API server that is about 1000 times faster :)
> * never zoom out from level 18; anything below that will request so
>   much data that you can't get it live :)
>
> Currently I consider "live view" not an achievable goal; I am happy
> if I can render data that is about one day old.
>
> > > * The data is then pruned into (currently 3) levels and stored in
> > >   a cache:
> > >   * level 0 - full
> > >   * level 1 - no POIs, no paths, streams, tracks etc.; used for
> > >     zoom 11
> > >   * level 2 - no tertiary roads etc.; used for zoom 10 and below
> > > * The client is served the level it currently requested, as
> > >   JSON.gz.
> >
> > Great, this is what I'm working on too. I'm thinking a ruleset about
> > which features are relevant at which zoom levels could be something
> > to work together on? I was also thinking of correlating tags with a
> > certain zoom level. But maybe each tag should be associated with a
> > range of zoom levels, like "way: { zoom_outer: 3, zoom_inner: 1 }".
> > Thoughts?
>
> My rules do have a minimum zoom level; smaller than that and they are
> not rendered. The levels are inspired by the osmarender and mapnik
> outputs, but I moved a few of them down so you can render really
> high-resolution maps.
>
> However, the pruning at the proxy is something else and not connected
> to that. For instance, somebody might not want to see tertiary roads
> at level 13, but others might. So I make sure that I only prune out
> data that can never be seen at that level; that is, conservative
> pruning.
>
> Also, about 90% of the data-pruning is about removing unwanted data
> (like "note=blah" :) and not about the smaller zoom levels, because
> currently it is simply not feasible to render below 10, and even for
> zoom 10 you need a really, really beefy machine and a long wait
> time...
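> (The junk-tag filter itself is trivial; a minimal sketch in JS, with
> an illustrative blacklist, since the real list is a lot longer:)
>
>   var JUNK_TAGS = { note: 1, fixme: 1, attribution: 1, source: 1,
>                     created_by: 1 };
>
>   // Drop tags that are never needed for rendering, before the data
>   // is cached and served to the client.
>   function pruneTags(tags) {
>     var kept = {};
>     for (var key in tags) {
>       if (!JUNK_TAGS.hasOwnProperty(key)) kept[key] = tags[key];
>     }
>     return kept;
>   }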
> > > * There are three servers in the list (api.openstreetmap,
> > >   xapi.informationfreeway and tagwatch) and a lot of them do not
> > >   complete the request (internal error, not implemented etc.
> > >   etc.). It can take a lot of retries to finally get the data.
> > > * Even when you get the data, it takes seconds (10..40 seconds is
> > >   "normal") to minutes, up to 360 seconds, just to serve one
> > >   request.
> > >
> > > So currently all received data is stored in the cache for 7 days
> > > to avoid the very, very long loading times.
> > >
> > > Ideas of fetching the full dataset and pre-computing the cache
> > > simply don't work, because I don't have a big enough machine and
> > > no big enough online account to store the resulting JSON :(
> > >
> > > Also, somehow processing 150 GByte of XML into JSON will prove to
> > > be a challenge :)
> >
> > So I'm having the same problems with the APIs. The standard 0.6 API
> > has been pretty good, but of course it serves XML, not JSON. The
> > XAPI is not very responsive for me, it seems.
>
> Neither for me, but the API server is very slow, too. It seems it
> can't manage to send me more than 17 KByte/s (but maybe it is
> bandwidth limited?).
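> (Schematically, fetching one bbox ends up looking like the sketch
> below; the server list and retry policy here are illustrative, not
> what either of our tools actually does:)
>
>   var SERVERS = [
>     "http://api.openstreetmap.org/api/0.6/map?bbox=",
>     "http://xapi.informationfreeway.org/api/0.6/map?bbox="
>   ];
>
>   // Try each server in turn, for up to maxRetries passes, until one
>   // of them answers with a 200.
>   function fetchBbox(bbox, maxRetries) {
>     for (var attempt = 0; attempt < maxRetries; attempt++) {
>       for (var i = 0; i < SERVERS.length; i++) {
>         try {
>           var xhr = new XMLHttpRequest();
>           xhr.open("GET", SERVERS[i] + bbox, false); // sync, for brevity
>           xhr.send(null);
>           if (xhr.status === 200) return xhr.responseText;
>         } catch (e) {
>           // network error: fall through to the next server and retry
>         }
>       }
>     }
>     return null; // every server failed on every pass
>   }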
> > I thought parsing XML in JS would be molasses,
>
> When I tried it, it used ungodly amounts of memory (because the data
> structure is not useful for rendering and it contains so much cruft),
> and I also never managed to extract the actual node data for
> rendering from it...
>
> > so if you're interested, we should put up our own XAPI or custom
> > API off the planet.osm file, and send JSON?
>
> Yeah, that was my plan for the near future :) For now I am happy with
> my proxy, as there is quite enough to do on the client side whether
> the data is current/real-time or one day old :)
>
> > I have a quad-core Intel Mac Pro with 1.5 TB and a bunch of RAM we
> > can dedicate to this effort, with plenty of bandwidth. And perhaps
> > when Stefan's work is published, we could run it as well, since it
> > seems to be a great solution for requesting fewer nodes for large
> > ways... but for now, do you think you could use an XAPI? I think
> > all my requests fit into that API.
>
> Currently I am just requesting the full data and then pruning it
> myself, because I am not actually sure it would help if we do either:
>
> * request partial data (all streets, all landuse), simply because at
>   a high enough zoom level you need the full data anyway
> * request ways with fewer nodes, because that is only good for low
>   zooms and I am currently sort of ignoring them :) It is basically a
>   side-problem.
>
> > Alternatively, Stefan points out that the dbslayer patch for the
> > Cherokee server allows direct JSON requests to a database. So some
> > very thin db wrapper might serve us for now? This isn't my area of
> > expertise, so if you have better ideas on how to generate JSON
> > directly from the db, like GeoServer or something, and still have
> > tag-based requests, I'm all ears.
>
> Well, I am not sure that this would be faster or better. If the
> db-json layer served the full API data, we would also get all the
> "junk" data like "note" and so on, and this would overwhelm the
> browser. So it might need a filter, too.
>
> Also, my renderer expects the format currently spewed out by my
> proxy. If we used Stefan's format, it wouldn't work (multipolygons
> are one reason) and it would be a lot of work to switch the code.
>
> OTOH, I would not complain if somebody invents a server that spits
> out JSON in the right format and in real-time :)
>
> > > Yes, but reducing the polygons is also a lot of work :) I haven't
> > > started on this yet, because at zoom 12 or higher you need to
> > > render almost everything anyway. Plus, you would then need to
> > > cache the partial data somehow (computing it is expensive in
> > > JS...)
> >
> > Seems like Stefan's work may address this, no? Or if we did cache
> > it, seems like we'd calculate it on the server side.
>
> I was kind of hoping to build a client-side application, not
> something that runs on the server :) If the server has to reduce the
> polygons, it might never be able to process the whole planet.
>
> But I see the point. :)
>
> (I was, for instance, pondering whether the JSON from the server
> should already contain BBOX data for each way. I decided against it,
> as it uses bandwidth and server CPU, and is very fast to compute on
> the client anyway. But definitely a few things can be precomputed at
> the server and stored in the cache. One example is the multipolygon
> relationships. Their representation in XML isn't actually very
> usable, so I just rewrite it so that the client can access it
> super-fast.)
>
> > > > d) oh, and localStorage. I've partially implemented that but
> > > >    haven't had much testing... other work... ugh. So caching on
> > > >    a few levels, basically.
> > >
> > > I fail to see what localStorage actually gains, as the delivered
> > > JSON is put into the browser cache anyway, and the rest is cached
> > > in memory. Could you maybe explain what your idea was?
> >
> > Yes, localStorage persists across sessions, so you could build up a
> > permanent local cache and have more control (in JS) over requesting
> > it and timestamping when you cached it, not to mention applying
> > only changesets instead of doing complete cache flushes. This has
> > some advantages over the browser cache, although that does of
> > course persist across sessions too.
>
> But it won't help if you move to a different machine. Also, it goes
> against the "live" idea; we would need to query the server for new
> data anyway. Currently, if you reload a Temap session, most of the
> time is spent in the re-render, and almost none in loading the data
> over the net.
>
> I guess if I write a 100x faster renderer that might change, but I'd
> like to work on one problem at a time :)
>
> So for now I'd like to keep localStorage out, as it creates more
> problems than it solves :)
>
> > > * There is a talk I proposed for State of the Map and I don't
> > >   want to spoil everything beforehand :)
> >
> > Yes, me too! So if you want to discuss off-list, that's fine.
>
> Heh, you have a talk scheduled, too? :) That sounds like fun :)
>
> > > Of course, semi-dynamic rules like "color them according to
> > > feature X by formula Y" are still useful and fun, and avoid the
> > > problems above. (Like: "use maxspeed as the color index ranging
> > > from red over green to yellow" :)
> >
> > Yes, this is an exciting area to me, for example the
> > color-by-authorship stylesheet I posted before:
> >
> > http://map.cartagen.org/find?id=paris&gss=http://unterbahn.com/cartagen/authors.gss
> >
> > or this one I threw together yesterday, based on tags of measured
> > width instead of a width rule:
> >
> > http://map.cartagen.org?gss=http://unterbahn.com/cartagen/width.gss
> >
> > A more fully-rendered screenshot is here:
> >
> > http://www.flickr.com/photos/jeffreywarren/3510685883/
>
> Yeah, that is what I have in mind, too. But so many things to do, so
> little time :)
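> (That maxspeed rule is nearly a one-liner; a minimal sketch, assuming
> the maxspeed tag is a plain number in km/h, and using a simple
> red-yellow-green ramp for illustration:)
>
>   // Map a way's maxspeed tag onto a color; grey when it's absent.
>   function maxspeedColor(tags) {
>     var v = parseInt(tags.maxspeed, 10);
>     if (isNaN(v)) return "#888";
>     var t = Math.min(v, 130) / 130;  // clamp to 0..1 over 0..130 km/h
>     var r = Math.round(255 * Math.min(1, 2 * (1 - t)));
>     var g = Math.round(255 * Math.min(1, 2 * t));
>     return "rgb(" + r + "," + g + ",0)";
>   }
>
> Hooked into a stylesheet rule, something like this could drive the
> stroke color per way.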
> > Anyways, thanks for sharing; one thought I had was that besides
> > sharing ideas and solutions online, we should try *different*
> > approaches, so that we cover all the possibilities. I think
> > multiple projects working on the same problem can sometimes be
> > redundant, but more often it's beneficial for all parties, since
> > there's a diversity of approaches to a problem. Let's take
> > advantage of that by specifically attempting different solutions to
> > the problems we face, and discussing the results... if you're
> > willing. If one of us tries a technique and it doesn't work, we can
> > all learn from the attempt.
>
> Sure, I am working on my ideas anyway :) A few things you might find
> interesting:
>
> * there are no dashed lines on canvas; you need to roll your own (see
>   the sketch below)
> * rendering 60000 lines/areas takes a long time (>1 minute), which
>   means you need a sort of "slippy tiles" setup like I have
>   currently. That allows the user to pan the map in real time while
>   the renderer renders tiles off-screen.
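> (Rolling your own dashes boils down to walking each segment and
> alternating pen-down/pen-up; a minimal sketch for a single segment,
> assuming dashLen and gapLen are positive:)
>
>   // Draw a dashed line from (x1,y1) to (x2,y2) on a 2D canvas
>   // context, since canvas has no native dash support.
>   function dashedLine(ctx, x1, y1, x2, y2, dashLen, gapLen) {
>     var dx = x2 - x1, dy = y2 - y1;
>     var len = Math.sqrt(dx * dx + dy * dy);
>     if (len === 0) return;
>     var ux = dx / len, uy = dy / len; // unit vector along the segment
>     var pos = 0, penDown = true;
>     ctx.beginPath();
>     while (pos < len) {
>       var step = Math.min(penDown ? dashLen : gapLen, len - pos);
>       if (penDown) {
>         ctx.moveTo(x1 + ux * pos, y1 + uy * pos);
>         ctx.lineTo(x1 + ux * (pos + step), y1 + uy * (pos + step));
>       }
>       pos += step;
>       penDown = !penDown;
>     }
>     ctx.stroke();
>   }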
> All the best,
>
> Tels
>
> --
> Signed on Fri May 8 20:47:12 2009 with key 0x93B84C15.
> Get one of my photo posters: http://bloodgate.com/posters
> PGP key on http://bloodgate.com/tels.asc or per email.
>
> "If Duke Nukem Forever is not out in 2001, something's very wrong."
>
>   -- George Broussard, 2001 (http://tinyurl.com/6m8nh)

_______________________________________________
dev mailing list
[email protected]
http://lists.openstreetmap.org/listinfo/dev

