Ah. Well, the step of deduplicating and finding connected nodes will not
go away - OSRM requires a *graph*, not disconnected geometry, so you won't
be able to get away from the problem of turning your LineStrings into a graph.
OSM's data structure (ways & nodes) has this structure built in, so OSRM
doesn't contain any of this kind of logic already.
You might want to look at how pgRouting does this bit (they call it
"building the topology") -
there might be some performance tips you can get from their approach.
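To give an idea of what "building the topology" involves, here is a minimal Python sketch. It is not OSRM or pgRouting code, just an illustration under a few assumptions: vertices that round to the same lon/lat are treated as one shared node, your LineStrings already have a vertex wherever they are meant to connect, and the names `build_topology` and `precision` are made up for this example.

```python
from itertools import count


def build_topology(linestrings, precision=7):
    """Turn LineStrings into OSM-style nodes and ways with shared node ids.

    linestrings: dict mapping a way key (e.g. a UUID string) to a list of
                 (lon, lat) tuples.
    Returns (nodes, ways):
      nodes maps node_id -> (lon, lat)
      ways  maps way_id  -> (original key, [node_id, ...])
    """
    node_ids = {}          # rounded (lon, lat) -> numeric node id
    nodes = {}
    ways = {}
    next_node = count(1)   # auto-incremented numeric ids replace UUIDs
    next_way = count(1)

    for key, coords in linestrings.items():
        refs = []
        for lon, lat in coords:
            # Rounding makes nearly identical coordinates compare equal
            rounded = (round(lon, precision), round(lat, precision))
            node_id = node_ids.get(rounded)
            if node_id is None:
                node_id = next(next_node)
                node_ids[rounded] = node_id
                nodes[node_id] = rounded
            # Skip consecutive duplicates within one way
            if not refs or refs[-1] != node_id:
                refs.append(node_id)
        ways[next(next_way)] = (key, refs)
    return nodes, ways


# Two ways that touch at (2.1, 48.0) end up sharing one node id
lines = {
    "a": [(2.0, 48.0), (2.1, 48.0)],
    "b": [(2.1, 48.0), (2.1, 48.1)],
}
nodes, ways = build_topology(lines)
```

A single pass with a hash map like this is linear in the number of vertices. Whether it beats a GROUP BY inside Postgres for 40M nodes is something you would have to measure.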
On Fri, Mar 2, 2018 at 10:13 AM, François Lacombe <fl.infosrese...@gmail.com> wrote:
> Hi Daniel,
> 2018-03-02 18:31 GMT+01:00 Daniel Patterson <dan...@mapbox.com>:
>> Well, it *could* be done. It would all boil down to providing an
>> alternative for `ParseOSMData` here:
>> That function is responsible for parsing the OSM file, and
>> copying/converting the OSM fields into a memory structure called
> Thank you, this is really useful information
>> 0) You'll have to implement it - the core team have other priorities.
> That's fair
> I could eventually propose a complete alternative to osrm-extract, so as
> to clearly separate responsibilities.
>> 1) I suspect you wouldn't see a huge performance improvement - I
>> suspect the overhead of querying postgis would dominate the extractor time.
> I have about 5M ways and 40M nodes.
> Producing xml or pbf takes hours, while querying postgis takes about 3min.
> I would probably agree that the process of postgis output isn't well
>> 2) You'll be on the hook for maintaining this code - the core team
>> haven't built this into the core tool because we don't need it, and it's a
>> big ask for us to maintain something we don't use.
>> I'd strongly consider trying to optimize your Postgres->OSM extraction
>> process. Consider using `osmium` libraries to write out the data in PBF
>> form directly instead of XML - it's significantly smaller, which makes it
>> faster to move around and write to disk, and OSRM will import it more
>> quickly.
> The problem isn't getting data out of postgis, but organizing it to fit
> in the XML:
> - Creating node records out of LineString / polygon geometries
> - Searching and deduplicating them, especially when 2 or more ways have nodes
> located on the same lat/lon point. Currently done with a GROUP BY on nodes
> - Creating numeric, auto-incremented ids, since we use UUIDs in postgis
> (the easy part)
> - Iterating over all of this to produce an XML file with Python. I haven't
> tried C++ libosmium yet, but I know I should. That's the longest part of the
> current process.
> This takes hours, and I'll be really happy if I find a way to directly
> feed osrm graph without recreating such things.
> Simple suppositions:
> It would be so nice to not have to produce nodes out of geometries. It's
> the key point.
> I guess you don't have proper records for each node in .osrm files, do
> you?
> Once they have gone through the profile's node_process, we only need their
> coordinates and not their tags any more.
> Then it would be great to only send tagged nodes (coming from dedicated
> postgis tables) to osrm-extract.
> Enjoy your weekend,
> OSRM-talk mailing list