Re: [OSM-dev] Slow osmosis import

2019-10-05 Thread merspieler
I did some more testing.
I've taken a smaller area and put everything into a tmpfs but even with
the .pbf as well as the tmp files of osmosis both being stored in ram,
the performance isn't too good. It improved to about 1500 objects/second
but this would still mean that all ways (according to this [1]) need
113 hours which is way too long.
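
For reference, that estimate is plain division of the object count by the
observed rate. A minimal sketch, assuming roughly 610 million ways in the
planet (a round figure back-calculated from the 113 hours quoted above, not
read off the statistics page):

```python
def import_hours(object_count: int, objects_per_second: float) -> float:
    """Return the wall-clock hours needed to process object_count objects."""
    return object_count / objects_per_second / 3600


# Assumption: approximate number of ways in the planet file (see note above).
ASSUMED_WAY_COUNT = 610_000_000

hours = import_hours(ASSUMED_WAY_COUNT, 1500)
print(f"about {hours:.0f} hours")
```

The same function can be reused with any other rate reported by
--log-progress.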

About half a year ago I documented the import of a file of similar size
(about 4GB), running on HDD only, which completed in just over
11 hours.

Now this was a different machine, but I can't explain why it is that
slow. One thing caught my attention though: osmosis seems to use only one
thread... Not sure if that's the bottleneck.

[1] https://taginfo.openstreetmap.org/reports/database_statistics

___
dev mailing list
dev@openstreetmap.org
https://lists.openstreetmap.org/listinfo/dev


Re: [OSM-dev] Slow osmosis import

2019-10-05 Thread Frederik Ramm
Hi,

the osm2city software should be changed to use an osm2pgsql database
instead of an osmosis database. Not only can a planet be imported in
less than a day with osm2pgsql (if you have SSDs), but also the
osm2pgsql database already has correctly built geometries for all
objects, whereas osm2city has to make an effort to build these
geometries from raw OSM data, thereby re-inventing the wheel when it
comes to the interpretation of multipolygon relations, the treatment of
way-based vs. relation-based polygons, etc.

osm2city does not seem to use anything that could *not* be found in an
osm2pgsql import.

If you insist on continuing down your current path then you must either
equip your computer with fast SSDs, or temporarily rent a large-SSD
Amazon instance on which you can do your import and then copy over the
resulting database (if you choose a setup where the importing instance
has the same CPU architecture, as well as exactly the same OS and
PostgreSQL/PostGIS versions, then you can copy over the raw database
directory). But even this is likely to take at least a week if not
several for the import - osmosis imports are just not something people
do normally on a planet scale.

I have only cursorily looked at the osm2city source code and it seems
that it uses most of OSM's data (buildings, roads, landuse). If you
should be in a situation where you only need some of OSM's data then a
speedup could be gained by first running "osmium tags-filter" to extract
the data you really need from the planet file. But if the list of "data
you need" contains roads and buildings and landuse then you might as
well not filter, since those categories make up the bulk of OSM data.
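
As an illustration of the filtering step mentioned above, here is a minimal
sketch that only assembles the "osmium tags-filter" command line. The file
names and the choice of filter expressions (ways tagged highway or railway)
are hypothetical; as noted, filtering pays off only if the needed data is a
small part of the planet.

```python
import shlex
from typing import List


def build_tags_filter_cmd(src: str, dst: str, filters: List[str]) -> List[str]:
    """Build an 'osmium tags-filter' argument list.

    Each filter is an osmium filter expression such as 'w/highway'
    (object type prefix, slash, tag key).
    """
    if not filters:
        raise ValueError("at least one filter expression is required")
    return ["osmium", "tags-filter", src, *filters, "-o", dst]


# Hypothetical file names and filters, for illustration only.
cmd = build_tags_filter_cmd(
    "planet.osm.pbf", "filtered.osm.pbf", ["w/highway", "w/railway"]
)
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```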

Bye
Frederik

-- 
Frederik Ramm  ##  eMail frede...@remote.org  ##  N49°00'09" E008°23'33"


Re: [OSM-dev] Slow osmosis import

2019-10-05 Thread Marco Boeringa

Hi,

Building ways and relations requires fast random access, not sequential 
read / write speed. I think it likely your HDD raid is the culprit, as 
the 96 GB of RAM won't allow you to process everything in RAM. All of the 
recent osm2pgsql benchmarks with high throughput for building ways and 
relations I have seen assume SSDs, and preferably even NVMe if you can 
afford it.


It is likely even a basic USB 3 connected 4TB SATA SSD will give you 
better results.


Marco

On 5-10-2019 at 00:47, merspieler wrote:
> I wanted to use osm2pgsql, but its schema is a different one.
> The software [1] I'm going to use the db with only supports the osmosis one.
>
> As for the hardware:
> 2x Xeon E5 8 cores/16 threads
> 96GB ram
> 5x 4TB HDD in a RAIDZ2
>
> I've done some benchmarking of the raid and osmosis doesn't even reach
> 5% of what was possible with the benchmark.
>
> I haven't tried Imposm yet... does it work with the osmosis schema?
>
> [1] https://gitlab.com/fg-radi/osm2city
>
> Frederik Ramm:
>> Hi,
>>
>> first question: are you absolutely sure you need an Osmosis import -
>> does your use case not work with an osm2pgsql import?
>>
>> Best
>> Frederik


Re: [OSM-dev] Slow osmosis import

2019-10-04 Thread merspieler
The same problem applies to a 3.4 GB .pbf file.
The nodes were done quickly but as soon as it started processing the
ways, it got super slow.

merspieler:
> I've imported small extracts in the past but I've never actually
> monitored the performance of these as they were done in reasonable time.
> I'll try a smaller area again...
> 
> Yves:
>> No, Imposm has its own schema.
>> I never used Osmosis to import a complete planet file, but I would find it
>> reasonable to start with a small extract, as stated in the osm2city
>> documentation.
>> Yves

Re: [OSM-dev] Slow osmosis import

2019-10-04 Thread Yves
No, Imposm has its own schema.
I never used Osmosis to import a complete planet file, but I would find it
reasonable to start with a small extract, as stated in the osm2city
documentation.
Yves


Re: [OSM-dev] Slow osmosis import

2019-10-04 Thread merspieler
I wanted to use osm2pgsql, but its schema is a different one.
The software [1] I'm going to use the db with only supports the osmosis one.

As for the hardware:
2x Xeon E5 8 cores/16 threads
96GB ram
5x 4TB HDD in a RAIDZ2

I've done some benchmarking of the raid and osmosis doesn't even reach
5% of what was possible with the benchmark.

I haven't tried Imposm yet... does it work with the osmosis schema?

[1] https://gitlab.com/fg-radi/osm2city

Frederik Ramm:
> Hi,
> 
> first question: are you absolutely sure you need an Osmosis import -
> does your use case not work with an osm2pgsql import?
> 
> Best
> Frederik
> 


Re: [OSM-dev] Slow osmosis import

2019-10-04 Thread Yves
Same as Frederik, but I'd also suggest Imposm, which is also quite fast.
A brief hardware description would help rule out some bottlenecks.
Yves


Re: [OSM-dev] Slow osmosis import

2019-10-04 Thread Frederik Ramm
Hi,

first question: are you absolutely sure you need an Osmosis import -
does your use case not work with an osm2pgsql import?

Best
Frederik


[OSM-dev] Slow osmosis import

2019-10-04 Thread merspieler
I'm currently trying to import the planet.osm.pbf file with osmosis.
While it's quite fast with nodes (took about 7h) it massively slows down
when it comes to the ways.

INFO: Processing Node 6814667967, 307145.5708858228 objects/second.
Oct 04, 2019 9:13:37 AM
org.openstreetmap.osmosis.core.progress.v0_6.EntityProgressLogger process
INFO: Processing Node 6816253174, 296264.7470505899 objects/second.
Oct 04, 2019 9:13:43 AM
org.openstreetmap.osmosis.core.progress.v0_6.EntityProgressLogger process
INFO: Processing Way 92, 5606.095551894564 objects/second.
Oct 04, 2019 9:13:48 AM
org.openstreetmap.osmosis.core.progress.v0_6.EntityProgressLogger process
INFO: Processing Way 111, 2.3990403838464616 objects/second.
Oct 04, 2019 9:13:54 AM
org.openstreetmap.osmosis.core.progress.v0_6.EntityProgressLogger process
INFO: Processing Way 135, 2.727272727272727 objects/second.

It would take years to import it at this speed.
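
To compare the phases, the rates can be pulled out of the --log-progress
output. A minimal sketch, assuming the log lines keep exactly the format
shown above; the function name and the use of the median are illustrative
choices, not part of osmosis:

```python
import re
from collections import defaultdict
from statistics import median
from typing import Dict, List

# Matches lines like: "INFO: Processing Way 111, 2.399 objects/second."
LINE_RE = re.compile(
    r"INFO: Processing (Node|Way|Relation) (\d+), ([0-9.]+) objects/second\."
)


def rates_by_type(log_text: str) -> Dict[str, List[float]]:
    """Group the reported objects/second figures by entity type."""
    rates = defaultdict(list)
    for match in LINE_RE.finditer(log_text):
        rates[match.group(1)].append(float(match.group(3)))
    return dict(rates)


SAMPLE = """\
INFO: Processing Node 6814667967, 307145.5708858228 objects/second.
INFO: Processing Node 6816253174, 296264.7470505899 objects/second.
INFO: Processing Way 92, 5606.095551894564 objects/second.
INFO: Processing Way 111, 2.3990403838464616 objects/second.
INFO: Processing Way 135, 2.727272727272727 objects/second.
"""

for entity, values in rates_by_type(SAMPLE).items():
    print(f"{entity}: median {median(values):.1f} objects/second")
```

The first Way figure is far higher than the following ones, presumably
because its reporting interval still covers the end of the node phase, so
the median is less misleading here than the mean.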

I run it like this:
osmosis --read-pbf file=path/to/planet.osm.pbf --log-progress \
    --write-pgsql database=osm_world password="not your business"

In $JAVACMD_OPTIONS, -Djava.io.tmpdir is set to use the same RAID as the
.pbf file and database. I've also added -Xmx40G to see if it has
something to do with the RAM, but that didn't change anything.

osmosis never gets to the limits of the RAID, so I think I can rule out
IO performance as well.

How do I get osmosis up to speed so it will finish in a matter of days?
