# Summary

This PR adds a new generalization strategy, `grouped-linemerge`, to 
`osm2pgsql-gen`. It maintains a table of merged linestrings, the equivalent of:

```sql
SELECT col1, col2, ..., (ST_Dump(ST_LineMerge(ST_Collect(geom)))).geom
FROM src
WHERE condition
GROUP BY col1, col2, ...
```

The key is that this table is global, rather than local to your tile. It is 
kept up to date incrementally as the source data changes (and it never re-scans 
the entire planet to do so).

It works similarly to the existing generalizations `rivers` and `vector-union`. 
You configure it from your flex style with 
`osm2pgsql.run_gen('grouped-linemerge', {...})`, and you run `osm2pgsql-gen` 
the same way (after the import, and again after each update).

# Motivation

The motivation is merging ways in OSM-Carto with the same rendering. Here's the 
relevant issue: 
https://github.com/openstreetmap-carto/openstreetmap-carto/issues/951

OSM ways are often extremely fragmented, only going a block or two at a time in 
urban areas, due to relations and tagging changes. This significantly impairs 
rendering (see images near the bottom of that issue). We could LineMerge these 
ways within each tile as it's rendered, but this makes it really challenging to 
have consistent geometries for long ways that span metatile boundaries.

This PR allows us to have a global LineMerge for all ways, grouped only by the 
columns that actually affect rendering (such as `name`, `ref`, `highway`, and 
`layer`). There are no tile-boundary artifacts, and it can be maintained 
incrementally quite efficiently.

# How it works

## Initial build

The initial build is very simple: we just run the query from the summary to do 
a grouped `ST_LineMerge` across the whole world. This takes only a few minutes, 
because most groups contain relatively few ways. We also build an index on the 
start and end points of each way. It is not strictly required, but it 
significantly helps the performance of incremental updates.
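
To make the semantics concrete, here is a toy Python model of a grouped line merge. It is only a sketch, not the PR's code: the rows and the `line_merge` and `grouped_line_merge` helpers are made up for illustration, and the merge rule is a simplified stand-in for `ST_LineMerge` (lines are joined only where exactly two line ends in the same group meet, and direction is ignored).

```python
from collections import defaultdict

# Hypothetical toy rows: (grouping key, polyline as a list of (x, y) points).
ROWS = [
    (("residential", "Main Street"), [(0, 0), (1, 0)]),
    (("residential", "Main Street"), [(2, 0), (1, 0)]),  # digitized backwards
    (("residential", "Main Street"), [(2, 0), (3, 0)]),
    (("residential", "Oak Avenue"), [(1, 0), (1, 1)]),   # cross street, other group
]

def line_merge(lines):
    """Simplified stand-in for ST_LineMerge: repeatedly join two distinct
    lines that meet at a point where exactly two line ends touch."""
    lines = [list(line) for line in lines]
    while True:
        ends = defaultdict(list)
        for i, line in enumerate(lines):
            ends[line[0]].append(i)
            ends[line[-1]].append(i)
        pair = next(((p, ids) for p, ids in ends.items()
                     if len(ids) == 2 and ids[0] != ids[1]), None)
        if pair is None:
            return lines  # nothing left to join
        point, (i, j) = pair
        a, b = lines[i], lines[j]
        if a[-1] != point:   # orient a so that it ends at the shared point
            a = a[::-1]
        if b[0] != point:    # orient b so that it starts at the shared point
            b = b[::-1]
        rest = [l for k, l in enumerate(lines) if k not in (i, j)]
        lines = rest + [a + b[1:]]

def grouped_line_merge(rows):
    """Group rows by key first, then merge within each group."""
    groups = defaultdict(list)
    for key, line in rows:
        groups[key].append(line)
    return [(key, merged)
            for key, lines in groups.items()
            for merged in line_merge(lines)]

for key, line in grouped_line_merge(ROWS):
    print(key, line)
```

The three "Main Street" fragments come out as a single four-point line even though "Oak Avenue" touches one of their shared nodes, because the grouping happens before the merge.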

## Incremental updates

Generalizations can receive the expire list of tiles whose geometry has 
changed. The expire list includes tiles where something was deleted, changed, 
or added.

One possible approach here would be to select every road on the planet that 
matches any grouping key that was touched, and re-LineMerge them all. I ruled 
this out because some road names are very common. Imagine if moving a node on 
"Main Street" in some town resulted in re-scanning every road named "Main 
Street" in the entire world. I assumed this was not an option from a 
performance perspective.

Instead, I used a recursive CTE to spider through the geometry from the expired 
tile. Each changed area becomes a seed from which the walk proceeds outwards. 
To bound the walk, each step must match an exact endpoint and every one of the 
grouping columns. This finds the full connected component, which replaces the 
old connected component's geometry.
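
The bounded walk can be sketched in Python as a breadth-first search over an endpoint lookup. This is an illustration of the idea only: the PR does this with a recursive CTE in SQL, and the toy `WAYS` data, `build_endpoint_index`, and `connected_component` are hypothetical names invented here.

```python
from collections import defaultdict, deque

# Hypothetical toy data: (way_id, grouping_key, start_point, end_point).
WAYS = [
    (1, ("residential", "Main Street"), (0, 0), (1, 0)),
    (2, ("residential", "Main Street"), (1, 0), (2, 0)),
    (3, ("residential", "Oak Avenue"), (2, 0), (2, 1)),    # touches way 2, other key
    (4, ("residential", "Main Street"), (2, 0), (3, 0)),
    (5, ("residential", "Main Street"), (9, 9), (10, 9)),  # same name, another town
]

def build_endpoint_index(ways):
    """Map (grouping key, exact endpoint) to the ids of ways ending there."""
    index = defaultdict(list)
    for way_id, key, start, end in ways:
        index[(key, start)].append(way_id)
        index[(key, end)].append(way_id)
    return index

def connected_component(seed_id, ways, index):
    """Walk outward from the seed, stepping only to ways that share an
    exact endpoint AND the full grouping key."""
    by_id = {way[0]: way for way in ways}
    seen = {seed_id}
    queue = deque([seed_id])
    while queue:
        _, key, start, end = by_id[queue.popleft()]
        for point in (start, end):
            for other in index[(key, point)]:
                if other not in seen:
                    seen.add(other)
                    queue.append(other)
    return seen

index = build_endpoint_index(WAYS)
component = connected_component(2, WAYS, index)
print(sorted(component))  # [1, 2, 4]
```

Seeding from way 2 reaches ways 1 and 4 but never way 5, the other "Main Street" in a different town, which is exactly the work the rejected whole-planet approach would have repeated.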

## Usage

```lua
-- An expire output on the source geometry feeds the incremental updates:
local exp = osm2pgsql.define_expire_output({ maxzoom = 18, table = 'expire_roads' })
-- ... attach `expire = {{ output = exp }}` to the source table's geometry column ...

-- A destination table holding the grouping columns + the merged geometry
-- (no id column; it's maintained by osm2pgsql-gen):
osm2pgsql.define_table({
  name = 'coalesced_roads',
  columns = {
      { column = 'highway', type = 'text' }, { column = 'name', type = 'text' },
      { column = 'ref', type = 'text' },     { column = 'layer', type = 'int4' },
      -- … any other grouping columns …
      { column = 'way', type = 'linestring', not_null = true },
  }
})

function osm2pgsql.process_gen()
  osm2pgsql.run_gen('grouped-linemerge', {
      src_table        = 'planet_osm_line',
      dest_table       = 'coalesced_roads',
      geom_column      = 'way',
      group_by_columns = 'highway, name, ref, layer',  -- comma-separated
      where            = 'name IS NOT NULL OR ref IS NOT NULL', -- optional pre-filter
      expire_list      = 'expire_roads',  -- } required in
      zoom             = 18,              -- } append mode
  })
end
```

Then:

```sh
osm2pgsql      -O flex -S style.lua --slim … data.osm.pbf   # import (creates the tables)
osm2pgsql-gen          -S style.lua                         # initial build
osm2pgsql   -a -O flex -S style.lua --slim … changes.osc.gz # update
osm2pgsql-gen -a       -S style.lua                         # incremental update
```

## Details

I believe this works in general for incremental updates, but the recursive CTE 
was challenging to get correct and performant. The new index exists so that we 
can look up ways by exact endpoint match. Without it, we'd use the existing 
GiST index to look up the way, which means that even if we ask PostGIS for 
`ST_Intersects` or some such, it will mechanically do `&&` on the way's entire 
bbox. This works, to be clear, but when I want to find a way that ends on an 
exact point, GiST gives me every way whose bbox covers that point, and fewer 
than one percent of those ways are actually eligible. See 
https://github.com/openstreetmap-carto/openstreetmap-carto/issues/951#issuecomment-4524503245 
for discussion about that.
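
A small Python sketch shows why a bbox filter over-selects compared with an exact endpoint lookup. The ways and helper names here are hypothetical, and the dictionary is only a stand-in for the PR's PostgreSQL index, whose exact definition is not shown in this description.

```python
from collections import defaultdict

# Hypothetical toy ways: id -> list of (x, y) vertices.
WAYS = {
    "long_diagonal": [(0, 0), (10, 10)],
    "ring_road": [(0, 10), (0, 0), (10, 0)],
    "side_street": [(4, 6), (4, 7)],
    "far_away": [(20, 20), (21, 20)],
}

def bbox(points):
    """Axis-aligned bounding box as (xmin, ymin, xmax, ymax)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)

def bbox_candidates(point, ways):
    """What a bbox-overlap filter (like `&&`) returns for a point query:
    every way whose bounding box covers the point."""
    px, py = point
    hits = []
    for way_id, points in ways.items():
        xmin, ymin, xmax, ymax = bbox(points)
        if xmin <= px <= xmax and ymin <= py <= ymax:
            hits.append(way_id)
    return hits

def build_endpoint_index(ways):
    """Exact-match lookup from an endpoint to the ways that start or end there."""
    index = defaultdict(list)
    for way_id, points in ways.items():
        index[points[0]].append(way_id)
        if points[-1] != points[0]:
            index[points[-1]].append(way_id)
    return index

query = (4, 6)
endpoint_index = build_endpoint_index(WAYS)
print(bbox_candidates(query, WAYS))   # ['long_diagonal', 'ring_road', 'side_street']
print(endpoint_index.get(query, []))  # ['side_street']
```

Only one of the three bbox candidates actually ends on the query point. On real data, with long ways whose boxes cover whole neighbourhoods, the wasted candidates dominate.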

## Tests

I put a differential fuzz test in `tests/test-gen-grouped-linemerge.cpp`. It 
performs hundreds of random connects and disconnects in a small grid of points. 
I intentionally set it up so the degree of each node can vary from 0 to 4, 
which stresses the fact that `ST_LineMerge` does not merge lines through nodes 
of degree above 2. At all times, the incrementally updated geometry must 
exactly match what `ST_LineMerge` would have done.
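
The shape of such a differential test can be sketched in Python. This is not the C++ test from the PR: it checks a simpler invariant (connected components of edges, not merged linestrings), and `GRID`, `component_of`, `full_rebuild`, `incremental_update`, and `run_fuzz` are all hypothetical names. The point is the structure: apply a random change, update incrementally, and compare against a from-scratch rebuild every time.

```python
import random
from collections import defaultdict

GRID = 4  # hypothetical grid size; interior nodes can reach degree 4

def all_edges(n):
    """Every horizontal and vertical unit edge in an n x n grid of points."""
    edges = []
    for x in range(n):
        for y in range(n):
            if x + 1 < n:
                edges.append(((x, y), (x + 1, y)))
            if y + 1 < n:
                edges.append(((x, y), (x, y + 1)))
    return edges

def component_of(seed_edge, edges):
    """Walk outward from seed_edge through shared endpoints."""
    by_point = defaultdict(set)
    for edge in edges:
        by_point[edge[0]].add(edge)
        by_point[edge[1]].add(edge)
    seen, stack = {seed_edge}, [seed_edge]
    while stack:
        for point in stack.pop():
            for other in by_point[point]:
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
    return frozenset(seen)

def full_rebuild(edges):
    """Reference implementation: recompute every component from scratch."""
    remaining, components = set(edges), set()
    while remaining:
        comp = component_of(next(iter(remaining)), edges)
        components.add(comp)
        remaining -= comp
    return components

def incremental_update(components, edges, toggled):
    """Replace only the components that touch the toggled edge's endpoints."""
    touched = set(toggled)
    kept = {c for c in components
            if not any(p in touched for edge in c for p in edge)}
    seeds = [e for e in edges if e[0] in touched or e[1] in touched]
    return kept | {component_of(seed, edges) for seed in seeds}

def run_fuzz(steps=300, seed=0):
    rng = random.Random(seed)
    candidates = all_edges(GRID)
    edges, components = set(), set()
    for _ in range(steps):
        toggled = rng.choice(candidates)
        edges ^= {toggled}  # connect if absent, disconnect if present
        components = incremental_update(components, edges, toggled)
        assert components == full_rebuild(edges), f"mismatch after {toggled}"
    return len(components)

run_fuzz()
```

Fixing the random seed keeps any failure reproducible, and comparing after every single step pins a mismatch to the exact change that caused it.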

I have run this locally and it works beautifully. Here's the commit to run it: 
https://github.com/leijurv/openstreetmap-carto/commit/b79c0fa3033ed5339643eb1b9590edc7e684dd28 
It does add a bunch of labels (it looks like this: 
https://github.com/openstreetmap-carto/openstreetmap-carto/issues/951#issuecomment-4532110979).

You can view, comment on, or merge this pull request online at:

  https://github.com/osm2pgsql-dev/osm2pgsql/pull/2482

-- Commit Summary --

  * grouped linemerge

-- File Changes --

    M CMakeLists.txt (1)
    A flex-config/gen/grouped-linemerge.lua (106)
    M src/gen/gen-create.cpp (5)
    A src/gen/gen-grouped-linemerge.cpp (358)
    A src/gen/gen-grouped-linemerge.hpp (67)
    M tests/CMakeLists.txt (9)
    A tests/test-gen-grouped-linemerge.cpp (328)

-- Patch Links --

https://github.com/osm2pgsql-dev/osm2pgsql/pull/2482.patch
https://github.com/osm2pgsql-dev/osm2pgsql/pull/2482.diff
