[OSM-dev] disk size for planet osm import into PostGIS (on an SSD)?

2013-06-27 Thread Akos Maroy

Hi,

I'd like to inquire about the estimated disk size needed to import the 
planet osm file into PostGIS?



What I wanted to do is import planet osm into a PostGIS database that
lives on a 512GB SSD (well, formatted and reported size: 459G, 480720592
1K-blocks). I did the import using osm2pgsql, using:


/usr/local/bin/osm2pgsql -d osm_world -s -C 5800 --hstore-all -K -v -G -m planet-130620.osm.bz2



The end of the import output is the following:

Completed planet_osm_roads
Creating osm_id index on  planet_osm_point
Creating indexes on  planet_osm_point finished
All indexes on  planet_osm_point created  in 10171s
Completed planet_osm_point
CREATE INDEX planet_osm_ways_nodes ON planet_osm_ways USING gin (nodes) WITH (FASTUPDATE=OFF);
 failed: ERROR:  could not extend file base/1602600/4948340.12: No space left on device

HINT:  Check free disk space.

Error occurred, cleaning up




Of course I have more space on traditional HDDs, but my main point is to
have the I/O intensive parts of the PostGIS database on the SSD. The OSM
file to be imported is located on a traditional HDD (not on the SSD),
and the osm2pgsql command is also issued from a directory that resides
on the HDD. The /tmp directory is not on the SSD either.


The SSD is formatted as ext4.

Would the planet OSM in general fit on such a disk?

Are there optimization possibilities to decrease the required disk size
for the import? Maybe:


 * some temporary disk space used during the import could be
   moved from the SSD to the HDD?
 * some parts of the database files that are not I/O intensive could be
   relocated to the HDD (and symlinked)?
 * some parts of the database could be omitted (for example, I don't
   need the history)?
 * maybe some tune2fs magic?


Any pointers, ideas, etc. are welcome.


Akos

___
dev mailing list
dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/dev


Re: [OSM-dev] disk size for planet osm import into PostGIS (on an SSD)?

2013-06-27 Thread Frederik Ramm

Hi,

On 06/27/2013 09:33 AM, Akos Maroy wrote:

> I'd like to inquire about the estimated disk size needed to import the
> planet osm file into PostGIS?


320 GB on a machine I am running.


> /usr/local/bin/osm2pgsql -d osm_world -s -C 5800 --hstore-all -K -v -G -m planet-130620.osm.bz2


I use --flat-nodes which saves more than 50 GB and is quicker. I don't 
use -K, and I don't use --hstore-all either; both will certainly blow up 
the space needed. I recommend using a shape file for coastlines like 
everyone else does, and to have a look at --hstore-match-only (you 
really don't need stuff *twice* in your database) and use a stripped 
style.xml that ensures you don't store tons and tons of note and 
source tags and other import side products. You don't want to burden 
your SSD with tags like gnis:Class, NHD:FType, tiger:PCICBSA, or 
canvec:UUID ;)
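
Taken together, the suggestions above could translate into a command along the following lines. This is only a sketch: the flat-nodes path, the style file name and the helper function `build_import_cmd` are hypothetical, and the remaining flags are carried over from the original command. The function prints the command instead of running it, so it can be reviewed first.

```shell
#!/bin/sh
# Print (do not run) an osm2pgsql command that follows the advice above:
# no -K, no --hstore-all, with --flat-nodes and --hstore-match-only.
# Arguments (all illustrative): database, flat-nodes file, style file, input file.
build_import_cmd() {
    db=$1; flat=$2; style=$3; input=$4
    echo "osm2pgsql -d $db -s -C 5800 --flat-nodes $flat --hstore --hstore-match-only -S $style -v -G -m $input"
}

# Example call with made-up paths
build_import_cmd osm_world /ssd/flat-nodes.bin stripped.style planet-130620.osm.bz2
```

As far as I understand the option, --hstore-match-only modifies --hstore rather than replacing it, which is why both appear here.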


Bye
Frederik

--
Frederik Ramm  ##  eMail frede...@remote.org  ##  N49°00'09 E008°23'33



Re: [OSM-dev] disk size for planet osm import into PostGIS (on an SSD)?

2013-06-27 Thread Paul Norman
> From: Akos Maroy [mailto:a...@maroy.hu]
> Subject: [OSM-dev] disk size for planet osm import into PostGIS (on an
> SSD)?
>
> Hi,
>
> I'd like to inquire about the estimated disk size needed to import the
> planet osm file into PostGIS?

For osm2pgsql, my recollection from my last tests is a final size of about
240GB for the database plus 17GB for flat-nodes.
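
A rough check of that estimate against the reported disk size can be sketched as below. Several things are assumptions here: that the 480720592 figure is in 1 KiB blocks (the default unit of `df` on Linux), that GB and GiB can be treated as interchangeable at this precision, and the 100 GiB headroom for temporary space during indexing, which is a made-up allowance rather than a measured value. The function names are hypothetical.

```shell
#!/bin/sh
# Convert a size given in 1 KiB blocks to whole GiB (rounded down).
kib_to_gib() {
    echo $(( $1 / 1024 / 1024 ))
}

# Succeed if needed + headroom (both in GiB) fits within capacity (GiB).
fits() {
    [ $(( $1 + $2 )) -le "$3" ]
}

capacity=$(kib_to_gib 480720592)   # assumption: the reported figure is in 1 KiB blocks
needed=$(( 240 + 17 ))             # database plus flat-nodes file, per the estimate above
headroom=100                       # hypothetical allowance for temporary index builds

if fits "$needed" "$headroom" "$capacity"; then
    echo "estimate fits: $needed + $headroom <= $capacity"
else
    echo "estimate does not fit: $needed + $headroom > $capacity"
fi
```

The headroom term matters because the failure reported in this thread happened during index creation, not while loading the tables.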
 
> What I wanted to do is import planet osm into a PostGIS database that
> lives on a 512GB SSD (well, formatted and reported size: 459G, 480720592
> 1K-blocks). I did the import using osm2pgsql, using:
>
> /usr/local/bin/osm2pgsql -d osm_world -s -C 5800 --hstore-all -K -v -G -m planet-130620.osm.bz2
>
> The end of the import output is the following:
>
> ...
>  failed: ERROR:  could not extend file base/1602600/4948340.12: No space left on device
> HINT:  Check free disk space.
>
> Error occurred, cleaning up

There are a couple of ways to reduce the disk space. The first is to use
flat-nodes. This turns what was an 80GB+ nodes table for a full planet into a
17GB flat file.

One other optimization is if you're not planning on doing updates and don't
need the slim tables you can use --drop to get rid of them.

By default osm2pgsql does indexing and clustering of the tables in parallel.
This is fastest, but it results in a big spike of disk usage while
rearranging and indexing as it happens on all the tables at once. I believe
--disable-parallel-indexing will fix this.
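
The two options just mentioned could be chosen as in the sketch below. The helper name `space_saving_flags` and its "updates" argument are hypothetical; the flags themselves are the ones named above. The printed flags would be appended to an osm2pgsql command that already uses slim mode.

```shell
#!/bin/sh
# Print extra osm2pgsql flags aimed at lowering disk usage.
# $1 = "updates" if the database will be updated with diffs later,
#      anything else for a one-off import.
space_saving_flags() {
    # Index tables one after another to avoid a spike in disk usage
    flags="--disable-parallel-indexing"
    if [ "$1" != "updates" ]; then
        # --drop removes the slim tables after import, so updates are no longer possible
        flags="$flags --drop"
    fi
    echo "$flags"
}

# Example: a one-off import that will never be updated
space_saving_flags one-off
```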

I am quite surprised you ran into problems on a 512GB SSD. I've imported
recent planets on smaller volumes. On the other hand, I don't know anyone
who's done a full planet import without --flat-nodes lately and that
probably helps lots.

Two other tips for the next time you try are to use the .osm.pbf file
instead of .osm.bz2, and that there was a new planet file generated about
two hours ago.

Another general tip is that if your planet file is more than a day or so
old, use osmupdate (https://wiki.openstreetmap.org/wiki/Osmupdate) to update
your planet file before importing. It only takes about an hour even if it's
a week old, and that's way faster than importing diffs after.
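
The "update first if older than a day" rule can be sketched as a small wrapper. The function names are hypothetical, and the sketch assumes that osmupdate takes the old file and the new file as its two arguments and that the installed version accepts the file format in use; check the wiki page above before relying on it.

```shell
#!/bin/sh
# Succeed if the file was last modified at least 24 hours ago.
is_stale() {
    [ -n "$(find "$1" -mtime +0 2>/dev/null)" ]
}

# Refresh the planet file with osmupdate only when it is stale.
# $1 = existing planet file, $2 = name for the updated file.
refresh_planet() {
    old=$1; new=$2
    if is_stale "$old"; then
        osmupdate "$old" "$new"
    else
        echo "$old is recent enough; skipping osmupdate"
    fi
}
```

It would be called as `refresh_planet planet-old.osm.pbf planet-new.osm.pbf` (file names made up), and the import would then use whichever file is newest.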

Because osm2pgsql has *so* many options the help text will now suggest a
reasonable command for importing data, see
https://github.com/openstreetmap/osm2pgsql/commit/34a30092d6






Re: [OSM-dev] disk size for planet osm import into PostGIS (on an SSD)?

2013-06-27 Thread Akos Maroy

Frederik,




>> I'd like to inquire about the estimated disk size needed to import the
>> planet osm file into PostGIS?
>
> 320 GB on a machine I am running.

reassuring to hear :)

>> /usr/local/bin/osm2pgsql -d osm_world -s -C 5800 --hstore-all -K -v -G -m planet-130620.osm.bz2
>
> I use --flat-nodes which saves more than 50 GB and is quicker. I don't
> use -K, and I don't use --hstore-all either; both will certainly blow
> up the space needed. I recommend using a shape file for coastlines
> like everyone else does, and to have a look at --hstore-match-only

would this be --hstore --hstore-match-only ?

> (you really don't need stuff *twice* in your database) and use a
> stripped style.xml that ensures you don't store tons and tons of
> note and source tags and other import side products. You don't
> want to burden your SSD with tags like gnis:Class, NHD:FType,
> tiger:PCICBSA, or canvec:UUID ;)

indeed

thanks for the ideas


Akos




Re: [OSM-dev] disk size for planet osm import into PostGIS (on an SSD)?

2013-06-27 Thread Akos Maroy

Paul,



> There are a couple of ways to reduce the disk space. The first is to use
> flat-nodes. This turns what was an 80GB+ nodes table for a full planet into a
> 17GB flat file.
>
> One other optimization is if you're not planning on doing updates and don't
> need the slim tables you can use --drop to get rid of them.
>
> By default osm2pgsql does indexing and clustering of the tables in parallel.
> This is fastest, but it results in a big spike of disk usage while
> rearranging and indexing as it happens on all the tables at once. I believe
> --disable-parallel-indexing will fix this.

thanks, trying --flat-nodes flat-nodes --hstore --hstore-match-only
--disable-parallel-indexing now


> I am quite surprised you ran into problems on a 512GB SSD. I've imported
> recent planets on smaller volumes. On the other hand, I don't know anyone
> who's done a full planet import without --flat-nodes lately and that
> probably helps lots.
>
> Two other tips for the next time you try are to use the .osm.pbf file
> instead of .osm.bz2, and that there was a new planet file generated about
> two hours ago.

the .bz2 file seems to work fine with bzcat


> Another general tip is that if your planet file is more than a day or so
> old, use osmupdate (https://wiki.openstreetmap.org/wiki/Osmupdate) to update
> your planet file before importing. It only takes about an hour even if it's
> a week old, and that's way faster than importing diffs after.

will look into it, thanks


Akos




Re: [OSM-dev] disk size for planet osm import into PostGIS (on an SSD)?

2013-06-27 Thread Sven Geggus
Frederik Ramm frede...@remote.org wrote:

> and use a stripped style.xml

I assume you meant osm2pgsql style.

> that ensures you don't store tons and tons of note and source tags and
> other import side products. You don't want to burden your SSD with tags
> like gnis:Class, NHD:FType, tiger:PCICBSA, or canvec:UUID ;)

Here is the one we use on the German tile server:
http://svn.openstreetmap.org/applications/rendering/mapnik-german/views/default.style
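
For anyone who prefers to trim an existing style file rather than adopt a different one, a filter like the following is one possible approach. It assumes each non-comment line of the style file has the tag key as its second whitespace-separated field; the function name `strip_tags` and the file names are hypothetical.

```shell
#!/bin/sh
# Print a style file with the lines for the given tag keys removed.
# $1 = style file; remaining arguments = tag keys to drop.
strip_tags() {
    style=$1; shift
    drop=" $* "
    awk -v drop="$drop" '
        /^[ \t]*#/ || NF < 2 { print; next }     # keep comments and blank lines as they are
        index(drop, " " $2 " ") == 0 { print }   # keep lines whose tag key is not in the drop list
    ' "$style"
}
```

A call such as `strip_tags default.style note source > stripped.style` would write a copy without columns for those two keys. Note that when --hstore is used, keys without a column may still end up in the hstore column, so removing lines alone may not be enough in that setup.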

Sven

-- 
The Internet is not a lawless space, but neither is it a space without
civil rights. (Wolfgang Wieland, Bündnis 90/Die Grünen)

/me is giggls@ircnet, http://sven.gegg.us/ on the Web
