On Fri, Sep 30, 2011 at 07:21:22PM +0200, Igor Podolskiy wrote:
> On 29.09.2011 15:28, Gubler, Ruediger wrote:
> >I have to merge a very large number of files. Doing this in one
> >osmosis call creates thousands of threads and keeps the rest of the
> >system from working well. Is it possible and efficient to split the
> >giant merge into smaller pieces? What is the best strategy for merging
> >a huge number of files (e.g. a 100x100 matrix) with a minimum of
> >memory? Must the whole dataset fit in memory?
> Memory isn't the problem with merges; the only things worth mentioning
> that --merge keeps in memory are its buffers. Those are either very
> small (20 entities in the 0.39 release) or can be set to a more
> appropriate value on the command line (in current trunk, or HEAD now
> that it's in git ;)). Other than that, --merge just looks at the next
> entity on each input stream and chooses one of them to pass through
> downstream.
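
For example, with current trunk a merge with a larger buffer should look
something like this (the option name bufferCapacity is my assumption
based on the patch; check "osmosis --help" for the exact spelling):

    osmosis --read-pbf a.osm.pbf --read-pbf b.osm.pbf \
            --merge bufferCapacity=10000 --write-pbf ab.osm.pbf

A larger buffer only smooths out bursts between the reader and merge
threads; it doesn't change the pick-the-next-entity logic Igor describes.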

There can be a memory problem with PBF files, though. PBF files contain
blocks of entities, typically 8000 at a time. A PBF reader (or writer)
needs to allocate buffers for those blocks, and those can be quite
large: the PBF spec says an uncompressed block can be at most 32 MB, and
that's not the only buffer you need. I'd have to look at the source code
to check the overall buffer size required, but in any case it's not
negligible if you have many files open.
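
As a rough upper bound (a back-of-envelope sketch, not measured numbers;
it assumes one compressed and one uncompressed buffer per open stream,
both pessimistically sized at the spec's 32 MiB block ceiling):

    // Pessimistic estimate of PBF block-buffer memory with many files
    // open at once. The 32 MiB figure is the spec's maximum uncompressed
    // block size; counting the compressed input buffer at the same size
    // is a deliberate worst-case assumption.
    public class PbfBufferEstimate {
        private static final long MAX_BLOCK_BYTES = 32L * 1024 * 1024;

        public static void main(String[] args) {
            int openFiles = 100; // e.g. one row of the 100x100 merge
            long perFile = 2 * MAX_BLOCK_BYTES; // compressed + uncompressed
            long totalMiB = openFiles * perFile / (1024 * 1024);
            System.out.println(openFiles + " open PBF streams -> up to "
                    + totalMiB + " MiB of block buffers");
        }
    }

That prints an upper bound of several gigabytes for 100 open streams, so
merging in smaller batches (closing files between batches) keeps the
worst case bounded.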

Jochen
-- 
Jochen Topf  joc...@remote.org  http://www.remote.org/jochen/  +49-721-388298

