So it's holding two tiles, per thread, per open tiled input file! 2 x RGBA half 64^2 tiles -> 64k per thread per file x 1000 files x 16 threads -> 1 GB, just for this source of overhead, not counting anything else like header data or other allocations.
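To spell out that arithmetic, here is a minimal sketch. The function names are hypothetical (they are not OpenEXR or OIIO API), and the inputs are just the numbers assumed above: 64x64 tiles, 4 half channels, 2 tiles held per thread per file, 1000 open files, 16 threads.

```cpp
#include <cstddef>

// Hypothetical helper: bytes for one uncompressed tile,
// width x height x channels x bytes per channel.
constexpr std::size_t tile_bytes(std::size_t width, std::size_t height,
                                 std::size_t nchannels,
                                 std::size_t bytes_per_channel)
{
    return width * height * nchannels * bytes_per_channel;
}

// Hypothetical helper: memory retained if every thread keeps
// `tiles_per_thread` tile buffers alive for every open file.
constexpr std::size_t retained_bytes(std::size_t per_tile,
                                     std::size_t tiles_per_thread,
                                     std::size_t nfiles,
                                     std::size_t nthreads)
{
    return per_tile * tiles_per_thread * nfiles * nthreads;
}

// RGBA half, 64x64: 4 channels x 2 bytes each = 32 KB per tile.
constexpr std::size_t kOneTile = tile_bytes(64, 64, 4, 2);
static_assert(kOneTile == 32768, "one tile is 32 KB");

// Two tiles per thread per file = 64 KB.
static_assert(2 * kOneTile == 65536, "two tiles are 64 KB");

// 64 KB x 1000 files x 16 threads, roughly 1 GB.
constexpr std::size_t kTotal = retained_bytes(kOneTile, 2, 1000, 16);
static_assert(kTotal == 1048576000, "about 1 GB retained");
```

The static_asserts check the numbers at compile time, so the 64k and 1 GB figures are not just back-of-envelope.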
For 64k (two reasonably sized tiles), maybe it would be better to do a stack allocation just when the extra decode buffer is needed, so there would be no call to malloc/free and no retained memory. Switch back to a true malloc only for the rare case of huge tiles where it doesn't seem safe to do a stack allocation.

> On Sep 16, 2016, at 11:45 AM, Karl Rasche <karlras...@gmail.com> wrote:
>
> > But it's not optimal for a use pattern like TextureSystem where the typical
> > request is ONE tile, and the next tile it wants may not even be adjacent.
>
> Whoops. What I pointed at looks like it's only the case if you read through
> Imf::InputFile. If you use Imf::TiledInputFile (like in exrinput.cpp), I
> don't think you hit that buffering.
>
> > Wait, I'm not quite sure how threads play into this. Is this allocated
> > framebuffer part of the ImageInput itself? Do threads lock to use it? Or is
> > this per thread, per file?
>
> I think the per-thread part is around ImfTiledInputFile.cpp::267
> <https://github.com/openexr/openexr/blob/develop/OpenEXR/IlmImf/ImfTiledInputFile.cpp#L267>.
> Each TileBuffer has an uncompressedData ptr which is what the compressor
> fills during decode.
>
> This *should* just be a tile per thread, but it does look like it's held over
> the lifetime of the ImfTiledInputFile.

-- 
Larry Gritz
l...@larrygritz.com
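For illustration, the stack-versus-malloc idea from the top of this message could look roughly like the sketch below. Everything here is an assumption for the sake of the example: `with_scratch_buffer` and `kStackScratchBytes` are made-up names, not anything in OpenEXR, and the 64 KB threshold is just the two-tile figure from above. It also assumes the calling thread's stack is comfortably larger than that threshold.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>

// Hypothetical threshold: two 64x64 RGBA half tiles (64 KB).
constexpr std::size_t kStackScratchBytes = 64 * 1024;

// Hypothetical helper: hands `fn` a temporary decode buffer of `nbytes`.
// Small requests use a buffer on the stack (no malloc/free, nothing
// retained after return); larger requests fall back to the heap and
// are freed on return. Returns true if the stack buffer was used.
template <typename Fn>
bool with_scratch_buffer(std::size_t nbytes, Fn&& fn)
{
    if (nbytes <= kStackScratchBytes) {
        alignas(std::max_align_t) char stack_buf[kStackScratchBytes];
        fn(stack_buf, nbytes);
        return true;
    }
    // Rare case of huge tiles: a true heap allocation, released
    // automatically when heap_buf goes out of scope.
    std::unique_ptr<char[]> heap_buf(new char[nbytes]);
    fn(heap_buf.get(), nbytes);
    return false;
}
```

A caller would pass the decode step as the callable, e.g. `with_scratch_buffer(tile_size, [&](char* buf, std::size_t n) { /* decompress into buf */ });`. Either way nothing stays allocated once the tile has been copied out, which is the point of the exercise.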
_______________________________________________
Openexr-devel mailing list
Openexr-devel@nongnu.org
https://lists.nongnu.org/mailman/listinfo/openexr-devel