leijurv left a comment (osm2pgsql-dev/osm2pgsql#2482)

Okay, I implemented a version that uses exact endpoints, rather than tiles. 
https://github.com/osm2pgsql-dev/osm2pgsql/pull/2482/changes/ddb7de4f121002f6f511b83e20a08a26b2372ad3
 Here is an explanation.

This generalization needs to know which connected components have been changed, 
so that it can re-merge only those. Currently in this PR, it gets expired Z/X/Y 
tiles, and so it must re-walk every road that touches that tile. This is quite 
slow due to the GiST `tile && way` issue that I have mentioned a few times so I 
won't repeat it. And of course, even if you make your tile very small like Z19, 
the vast majority of these re-walks will be useless red herring, where the tile 
was edited for some unrelated reason. And making the tiles very small is itself 
inefficient because so many of them will be expired by OSM edits. But, if we 
instead search by exact endpoint, all these issues are solved. We only search 
from exact node equality, and starting from that point, the search is limited 
to where it matches the highway,ref,name,etc columns that we group by.

A few concerns addressed

Why not do this on the Lua side? Why change core C++? The flex geometry API 
doesn't have a method to get the startpoint/endpoint of a way. Could add one. 
But then still, deleting a way is still a problem, it doesn't call process_way. 
By the time you call process_deleted_way, the exact node locations could be 
gone (?). So, doing this in Lua would need new accessors and a new deleted 
geometry hook. It is just a less natural way to hook into deleted way endpoints.

Why not expire only tiles at the endpoints of the way, not the middle? If I 
edit a long way, maybe we could expire just exactly two tiny tiles, like Z20, 
one at each endpoint? This is still not perfectly quantized, I cannot look up 
the way with `==`, I still need `&&` which has the issues mentioned before. And 
you would still need the Lua endpoint accessor still.

Why add this code to expire output rather than a totally new system? It is true 
that endpoints are not the same as tiles but they are triggered by the same 
event on the same column. It is able to reuse quite a lot of code while staying 
backwards compatible.

Summary: `define_expire_output` will write the exact startpoint,endpoint of 
each changed way as a ST_Point, the generalization consumes those points and 
proceeds through the existing steps.

Performance? I've printed several tables above, but the short version is that 
there is a noise floor around 150ms where even for a single node update where 
the expiry is tiny I can't get it faster because of all the time spent on all 
the unrelated ways and slow GiST lookups. With this version, picking single 
random ways in the same tile ([this 
one](https://tile.openstreetmap.org/13/1405/3271.png)), it takes an average of 
~68ms to fully reexplore the network. Note that this is a dense road network so 
the recursion depth here ranged from 1 to 51, and the total ways looked up 
ranged from 2 to 74. And this parallelizes quite well, if I go from 1 point to 
10 points, it only gets about 2x slower not 10x (it goes to ~125ms). If I 10x 
it again, to 100 random points, it goes to average ~407ms. So osmchanges with 
many edited nodes are not a problem even if they all are touching dense road 
networks. And note every measurement in this paragraph was completely cold - I 
restart Postgres between every query, and I flush my filesystem cache too. If I 
rerun it warm, it's almost 3x faster, to average ~26ms for 1 point.



-- 
Reply to this email directly or view it on GitHub:
https://github.com/osm2pgsql-dev/osm2pgsql/pull/2482#issuecomment-4637669425
You are receiving this because you are subscribed to this thread.

Message ID: <osm2pgsql-dev/osm2pgsql/pull/2482/[email protected]>
_______________________________________________
Tile-serving mailing list
[email protected]
https://lists.openstreetmap.org/listinfo/tile-serving

Reply via email to