On Fri, Sep 30, 2011 at 07:21:22PM +0200, Igor Podolskiy wrote:
> On 29.09.2011 15:28, Gubler, Ruediger wrote:
> >I have to merge a huge count of files. Doing this in one osmosis call
> >creates thousands of threads and stops the rest of the system working well.
> >Is it possible and efficient to split the giant merge into smaller pieces?
> >What is the best strategy to merge a huge count (e.g. 100x100 matrix)
> >together with a minimum of needed memory?
> >Must the whole dataset fit in the memory?
> Memory isn't the problem with merges, the only thing worth
> mentioning that merge stores in memory are the buffers. Those are
> either very small (20 entities in 0.39 release) or can be set to a
> more appropriate value on the command line (in current trunk, or
> HEAD now that it's in git ;)). Other than that, --merge just looks
> at the next entities on the input stream and chooses one of them to
> pass through downstream.

There can be a memory problem with PBF files, though. PBF files contain blocks
of entities, typically 8000 at a time. A PBF reader (or writer) needs to
allocate buffers for those blocks, and those can be quite large. The PBF spec
says that an uncompressed block can be a maximum of 32 MBytes. That's not the
only buffer you need. I'd have to look at the source code to check the overall
buffer size you need, but in any case it's not negligible if you have many
files open.

Jochen
-- 
Jochen Topf  joc...@remote.org  http://www.remote.org/jochen/  +49-721-388298

_______________________________________________
osmosis-dev mailing list
osmosis-dev@openstreetmap.org
http://lists.openstreetmap.org/listinfo/osmosis-dev
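
To put a rough number on the PBF buffer concern, here is a small back-of-the-envelope sketch. It assumes one uncompressed block buffer per open PBF file and takes the 32 MBytes limit as 32 * 1024 * 1024 bytes. The function names are made up for illustration and have nothing to do with the osmosis code. Since the block buffer is not the only buffer a reader needs, and real blocks are usually well below the limit, treat the result as a worst-case figure for this one buffer, not a measurement.

```python
# Worst-case estimate only: one uncompressed block buffer per open PBF file.
# Assumption: the 32 MBytes limit is read as 32 * 1024 * 1024 bytes.
MAX_UNCOMPRESSED_BLOCK_BYTES = 32 * 1024 * 1024


def worst_case_block_buffer_bytes(open_files: int) -> int:
    """Upper bound on memory for one maximum-size block per open file."""
    if open_files < 0:
        raise ValueError("open_files must be non-negative")
    return open_files * MAX_UNCOMPRESSED_BLOCK_BYTES


def to_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB for easier reading."""
    return n_bytes / (1024 ** 3)


if __name__ == "__main__":
    # 100 * 100 corresponds to the "100x100 matrix" from the question.
    for n in (10, 100, 100 * 100):
        gib = to_gib(worst_case_block_buffer_bytes(n))
        print(f"{n:>6} files open: up to {gib:.2f} GiB for block buffers")
```

Under these assumptions, merging in batches keeps the number of simultaneously open files, and with it this bound, proportional to the batch size instead of the total file count.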