So it's holding two tiles, per thread, per open tiled input file!

2 x RGBA half 64^2 tiles -> 64k per thread per file
x 1000 files x 16 threads -> 1 GB, just for this source of overhead, not 
counting anything else like header data or other allocations
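
To spell out that arithmetic, here is a rough sketch. The helper names are made up for illustration and are not anything in OpenEXR. It assumes 4 channels (RGBA) of 2-byte half data and 64x64 tiles, as above.

```cpp
#include <cstddef>

// Hypothetical helper: bytes needed for one uncompressed tile.
constexpr std::size_t tile_bytes(std::size_t width, std::size_t height,
                                 std::size_t channels,
                                 std::size_t bytes_per_channel)
{
    return width * height * channels * bytes_per_channel;
}

// Hypothetical helper: total retained buffer memory when every thread
// holds `tiles_held` tile buffers for every open file.
constexpr std::size_t retained_bytes(std::size_t tiles_held,
                                     std::size_t per_tile,
                                     std::size_t files,
                                     std::size_t threads)
{
    return tiles_held * per_tile * files * threads;
}

// One RGBA half 64x64 tile is 32 KB, so two of them are 64 KB.
static_assert(tile_bytes(64, 64, 4, 2) == 32 * 1024, "32 KB per tile");

// 64 KB x 1000 files x 16 threads = 1,048,576,000 bytes, roughly 1 GB.
static_assert(retained_bytes(2, tile_bytes(64, 64, 4, 2), 1000, 16)
                  == 1048576000u,
              "about 1 GB retained");
```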

For 64k (two reasonably sized tiles), maybe it would be better to do a stack 
allocation just when the extra decode buffer is needed, so there would be no 
call to malloc/free and no retained memory. Switch back to a true malloc only 
for the rare case of huge tiles where it doesn't seem safe to do a stack 
allocation?
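
Something like the following is what I have in mind. This is only a sketch, not OpenEXR code: ScratchBuffer and kStackBytes are hypothetical names, and the 64 KB threshold is just the two-tile figure from above. It also assumes each thread's stack is large enough to spare 64 KB.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical threshold: room for two RGBA half 64x64 tiles.
constexpr std::size_t kStackBytes = 64 * 1024;

// Hypothetical scratch buffer. When declared as a local variable, small
// requests are served from storage inside the object itself (so on the
// caller's stack); only oversized requests fall back to malloc.
class ScratchBuffer {
public:
    explicit ScratchBuffer(std::size_t bytes) : size_(bytes), heap_(nullptr)
    {
        if (bytes > kStackBytes) {
            heap_ = static_cast<char*>(std::malloc(bytes));
            if (!heap_)
                throw std::bad_alloc();
        }
    }

    // free(nullptr) is a no-op, so the small case never touches the heap.
    ~ScratchBuffer() { std::free(heap_); }

    ScratchBuffer(const ScratchBuffer&) = delete;
    ScratchBuffer& operator=(const ScratchBuffer&) = delete;

    char* data() { return heap_ ? heap_ : inline_; }
    std::size_t size() const { return size_; }
    bool on_heap() const { return heap_ != nullptr; }

private:
    std::size_t size_;
    char* heap_;
    alignas(std::max_align_t) char inline_[kStackBytes];
};
```

The decode path would declare one of these locally only when the extra buffer is needed, so nothing is retained once the read call returns.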


> On Sep 16, 2016, at 11:45 AM, Karl Rasche <karlras...@gmail.com> wrote:
> 
> 
> 
> But it's not optimal for a use pattern like TextureSystem where the typical 
> request is ONE tile, and the next tile it wants may not even be adjacent.
> 
> Whoops. What I pointed at looks like it's only the case if you read through 
> Imf::InputFile.  If you use Imf::TiledInputFile (like in exrinput.cpp), I 
> don't think you hit that buffering.
> 
> 
> 
> Wait, I'm not quite sure how threads play into this. Is this allocated 
> framebuffer part of the ImageInput itself? Do threads lock to use it? Or is 
> this per thread, per file?
> 
> I think the per-thread part is around ImfTiledInputFile.cpp::267 
> <https://github.com/openexr/openexr/blob/develop/OpenEXR/IlmImf/ImfTiledInputFile.cpp#L267>.
>  Each TileBuffer has an uncompressedData ptr which is what the compressor 
> fills during decode. 
> 
> This *should* just be a tile per thread, but it does look like it's held over 
> the lifetime of the ImfTiledInputFile. 
> 
> 

--
Larry Gritz
l...@larrygritz.com


_______________________________________________
Openexr-devel mailing list
Openexr-devel@nongnu.org
https://lists.nongnu.org/mailman/listinfo/openexr-devel
