So it's holding two tiles, per thread, per open tiled input file!

2 x RGBA half 64^2 tiles -> 64 KB per thread per file
x 1000 files x 16 threads -> 1 GB, just for this source of overhead, not
counting anything else like header data or other allocations.
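
Spelling out that arithmetic as a quick sanity check (the function names here are made up for illustration, not anything in OpenEXR or OIIO, and it assumes 2 bytes per half channel with no padding):

```cpp
#include <cstddef>

// Bytes in one uncompressed tile: width * height * channels * bytes per channel.
// Hypothetical helper for illustration only.
constexpr std::size_t tile_bytes(std::size_t width, std::size_t height,
                                 std::size_t channels,
                                 std::size_t bytes_per_channel)
{
    return width * height * channels * bytes_per_channel;
}

// Memory retained if every thread keeps `tiles_held` tile buffers alive
// for every open file. Hypothetical helper for illustration only.
constexpr std::size_t retained_bytes(std::size_t per_tile,
                                     std::size_t tiles_held,
                                     std::size_t files,
                                     std::size_t threads)
{
    return per_tile * tiles_held * files * threads;
}

// One 64x64 RGBA half tile is 32 KB, so two of them are 64 KB.
static_assert(tile_bytes(64, 64, 4, 2) == 32 * 1024,
              "one RGBA half 64x64 tile is 32 KB");

// 64 KB x 1000 files x 16 threads = 1,048,576,000 bytes, roughly 1 GB.
static_assert(retained_bytes(tile_bytes(64, 64, 4, 2), 2, 1000, 16)
                  == 1048576000u,
              "two tiles per thread per file across 1000 files, 16 threads");
```

So the "1 GB" figure is about 1000 MiB, a hair under 1 GiB.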

For 64 KB (two reasonably sized tiles), maybe it would be better to do a stack
allocation just when the extra decode buffer is needed, so there would be no
call to malloc/free and no retained memory. Switch back to a true malloc only
for the rare case of huge tiles where it doesn't seem safe to do a stack
allocation.
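
A rough sketch of the stack-versus-heap idea, just to make it concrete. This is not OpenEXR code: decode_with_scratch, kMaxStackBytes, and the callback signature are all hypothetical, and the 64 KB cutoff is only an assumption about what is safe on a typical thread stack.

```cpp
#include <cstddef>
#include <memory>

// Hypothetical cutoff: scratch buffers up to this size live on the stack.
// 64 KB matches two RGBA half 64x64 tiles; a real limit would depend on
// how much stack the calling threads actually have.
constexpr std::size_t kMaxStackBytes = 64 * 1024;

// Runs decode(buffer, needed) with a temporary scratch buffer of `needed`
// bytes. Returns true if the buffer was on the stack, false if it had to
// fall back to the heap. Either way nothing is retained after the call.
template <class Decode>
bool decode_with_scratch(std::size_t needed, Decode&& decode)
{
    if (needed <= kMaxStackBytes) {
        char scratch[kMaxStackBytes];  // common case: no malloc/free at all
        decode(scratch, needed);
        return true;
    }
    // Rare case of huge tiles: a true heap allocation, freed on return.
    std::unique_ptr<char[]> scratch(new char[needed]);
    decode(scratch.get(), needed);
    return false;
}
```

The callback stands in for whatever the compressor does to fill the uncompressed data. The point is only that the buffer's lifetime is the single decode call rather than the lifetime of the file.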

> On Sep 16, 2016, at 11:45 AM, Karl Rasche wrote:
> But it's not optimal for a use pattern like TextureSystem where the typical 
> request is ONE tile, and the next tile it wants may not even be adjacent.
> Whoops. What I pointed at looks like it's only the case if you read through
> Imf::InputFile. If you use Imf::TiledInputFile (like in exrinput.cpp), I
> don't think you hit that buffering.
> Wait, I'm not quite sure how threads play into this. Is this allocated
> framebuffer part of the ImageInput itself? Do threads lock to use it? Or is
> this per thread, per file?
> I think the per-thread part is around ImfTiledInputFile.cpp::267.
> Each TileBuffer has an uncompressedData ptr which is what the compressor
> fills during decode.
> This *should* just be a tile per thread, but it does look like it's held over 
> the lifetime of the ImfTiledInputFile. 

Larry Gritz

Openexr-devel mailing list