Frederik Ramm wrote:
> Hi,
>
> (f'up set to osmosis-dev)
>
> Karl Newman wrote:
>
>> Anyway, the tee can choke things up with all the temporary files. It would
>> be nice to be able to share the stored node and ways files between tee
>> tasks, but I haven't created that infrastructure yet.
>
> It would be even better to have an extended --bp task that somehow takes
> a list of disjoint polygons and uses some kind of point location
> algorithm to determine which node belongs to which polygon. The
> rationale being of course that with the classic --bp/--tee approach,
> each node is duplicated n times and tested against each of the polygons
> which is a waste of time, especially with a large input file and many
> polygons (e.g. split up the US into counties or so).

Just to be clear, the --tee task doesn't duplicate nodes. It simply passes
nodes to multiple downstream tasks. The problem with the --bp task is that
when it is used downstream of a --tee task, each instance persists the node
information.
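To make that distinction concrete, here is a rough Python sketch. It is not Osmosis code: the class names `Tee` and `BBoxFilter`, the dict-based node records, and the choice to store only node ids are all assumptions made for illustration. It shows a tee handing the same node object to every downstream task, while each downstream filter keeps its own separate record of the nodes it accepted.

```python
class BBoxFilter:
    """Hypothetical stand-in for a --bp/--bb style task (not the real Osmosis class)."""

    def __init__(self, left, bottom, right, top):
        self.bounds = (left, bottom, right, top)
        # Each filter instance keeps its own record of accepted nodes.
        # Here that is just a set of ids; this per-instance storage is
        # what multiplies when many filters sit behind one tee.
        self.kept_node_ids = set()

    def process(self, node):
        left, bottom, right, top = self.bounds
        if left <= node["lon"] <= right and bottom <= node["lat"] <= top:
            self.kept_node_ids.add(node["id"])


class Tee:
    """Hypothetical stand-in for --tee: forwards each entity to every sink."""

    def __init__(self, sinks):
        self.sinks = sinks

    def process(self, node):
        for sink in self.sinks:
            sink.process(node)  # the same object is passed on, not a copy


# Two overlapping boxes behind one tee.
filters = [BBoxFilter(0, 0, 10, 10), BBoxFilter(5, 5, 15, 15)]
tee = Tee(filters)

nodes = [
    {"id": 1, "lon": 2.0, "lat": 2.0},
    {"id": 2, "lon": 7.0, "lat": 7.0},
    {"id": 3, "lon": 12.0, "lat": 12.0},
]
for node in nodes:
    tee.process(node)

# One independent store per filter instance.
stored_per_filter = [sorted(f.kept_node_ids) for f in filters]
```

With n filters behind the tee there are n independent stores, even though the tee itself never copies a node.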

This exact problem is why I originally created the customdb tasks, which
aimed to create a single random-access dataset (with appropriate indexing)
that could be built once and then queried many times for many bounding
boxes. When that provided dismal performance I created the pgsql tasks
instead. There is a --dataset-bounding-box task which can be used to read a
bbox from a database.

osmosis --read-pgsql --dataset-bounding-box left=xx ..... --write-xml myextract.osm

I've been distracted by other things recently, so I haven't spent much time
on the bounding box implementation for a while. I've been meaning to load up
a full pgsql db to see how it performs for tile cutting.

> Does the task and stream model that osmosis uses theoretically support
> tasks where the number of output streams they create is not fixed, but
> dependent on their parameters? So that e.g. a "bp file=a.poly
> file=b.poly" (or "bp files=a.poly,b.poly") creates two entity streams
> and so on?

Hmm, perhaps this is a better way to do it. I hadn't thought of keeping a
single copy of nodes with references to the polygons they reside in. If
somebody can come up with a faster implementation than pgsql I'll be
ecstatic. I've wasted a lot of time on this one. Note that the pgsql (and
customdb) implementations solve the problem of ways that cross a bounding
box without having any nodes inside it, which might be difficult to solve
using an alternative solution.

Yes, osmosis can support variable numbers of output streams so long as they
are known at startup time. This is pretty much what the --tee task does.

Brett

_______________________________________________
talk mailing list
talk@openstreetmap.org
http://lists.openstreetmap.org/listinfo/talk