> Well, it *could* be done.  It would all boil down to providing an
> alternative for `ParseOSMData` here:
> That function is responsible for parsing the OSM file, and
> copying/converting the OSM fields into a memory structure called
> `ExtractionContainers`:

Thank you, this is really useful information.

>     0) You'll have to implement it - the core team have other priorities.

That's fair.
I could eventually propose a complete alternative to osrm-extract, so as to
clearly separate responsibilities.

>     1) I suspect you wouldn't see a huge performance improvement - I
> suspect the overhead of querying postgis would dominate the extractor time.

I have about 5M ways and 40M nodes.
Producing XML or PBF takes hours, while querying PostGIS takes about 3 minutes.
I would probably agree that the process of postgis output isn't well

>     2) You'll be on the hook for maintaining this code - the core team
> haven't built this into the core tool because we don't need it, and it's a
> big ask for us to maintain something we don't use.
>   I'd strongly consider trying to optimize your Postgres->OSM extraction
> process.  Consider using `osmium` libraries to write out the data in PBF
> form directly instead of XML - it's significantly smaller, which makes it
> faster to move around and write to disk, and OSRM will import it more
> quickly.

The problem isn't getting data out of PostGIS, but reorganizing it to fit the
XML format:
- Creating node records out of LineString / Polygon geometries.
- Finding and deduplicating them, especially when two or more ways have nodes
located at the same lat/lon point. This is currently done with a GROUP BY on nodes.
- Creating numeric, auto-incremented ids, since we use UUIDs in PostGIS (the
easy part).
- Iterating over all of this to produce the XML with Python. I haven't tried C++
libosmium yet, but I know I should. This is the longest part of the
current process.
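
To make the node creation, deduplication and id assignment steps concrete, here is a minimal Python sketch of what they amount to. It is an illustration only, not my actual pipeline: the function name `build_osm_topology`, the input shape (a dict of UUID to `(lon, lat)` tuples) and rounding to 7 decimals (the precision OSM stores coordinates at) are all assumptions.

```python
from typing import Dict, List, Sequence, Tuple

Coord = Tuple[float, float]  # (lon, lat)


def build_osm_topology(
    ways: Dict[str, Sequence[Coord]], precision: int = 7
) -> Tuple[Dict[Coord, int], Dict[int, List[int]], Dict[str, int]]:
    """Hypothetical helper: derive shared nodes and numeric ids from geometries.

    Returns (node_ids, way_refs, way_ids):
      node_ids: rounded coordinate -> sequential node id
      way_refs: sequential way id -> ordered list of node ids
      way_ids:  original UUID -> sequential way id
    """
    node_ids: Dict[Coord, int] = {}
    way_refs: Dict[int, List[int]] = {}
    way_ids: Dict[str, int] = {}

    for way_id, (way_uuid, coords) in enumerate(ways.items(), start=1):
        way_ids[way_uuid] = way_id
        refs: List[int] = []
        for lon, lat in coords:
            # Points that round to the same coordinate share one node,
            # which is what connects two ways in the routing graph.
            key = (round(lon, precision), round(lat, precision))
            node_id = node_ids.setdefault(key, len(node_ids) + 1)
            refs.append(node_id)
        way_refs[way_id] = refs

    return node_ids, way_refs, way_ids


# Two ways that meet at (2.1, 48.1) end up sharing a single node.
example_ways = {
    "uuid-a": [(2.0, 48.0), (2.1, 48.1)],
    "uuid-b": [(2.1, 48.1), (2.2, 48.2)],
}
node_ids, way_refs, way_ids = build_osm_topology(example_ways)
```

A single pass with a dictionary keeps the deduplication linear in the number of points, at the cost of holding every distinct coordinate in memory (40M entries in my case).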

This takes hours, and I'd be really happy to find a way to feed the
OSRM graph directly without recreating all of this.

Some simple assumptions:
It would be really nice not to have to produce nodes out of geometries. That's
the key point.
I guess you don't keep full records for each node in the .osrm files, do
you?
Once they have gone through the profile's node_process, we only need their
coordinates and not their tags any more.
It would then be great to send only tagged nodes (coming from dedicated
PostGIS tables) to osrm-extract.

