I think the TBB allocator is pretty good, though I believe its main goal is to be scalable when many threads are asking from memory at once, which may be at the cost of total memory footprint size.
We've liked jemalloc quite a bit, for a good compromise between speed and memory size. That may be worth trying. When you read the image, you should try this: ImageBuf buf(filename); buf.read(0 /*submage*/, 0 /*miplevel*/, 0 /*chbegin*/, 1 /*chend*/, false /*force*/); Using the ImageBuf::read that lets you specifiy a channel range will at least mean that the ImageBuf, and any subsequent calculation you do on it, will be 1 channel, even if the underlying cache is 3 channels. If you want to get tricky, we can keep the cache size down, too, with this strategy: // Ask the cache for the spec of the file, but doctor to 1 channel ImageSpec spec; imagecache->get_imagespec(filename, spec); spec.nchannels = 1; // Allocate an ImageBuf with local memory exactly the right size, and don't // waste time zeroing out the pixels because we're about to write over them. ImageBuf buf(spec, ImageBuf::InitializePixels::Yes); // Ask the image cache to copy the right channels imagecache->get_pixels(filename /*or handle*/, 0 /*subimage*/, 0 /*miplevel*/, 0, width, 0, height, 0, 1, 0 /*chbegin*/, 1 /*chend*/, TypeUInt8 /*format*/, buf.localpixels(), // <--- read into the buf's memory AutoStride, AutoStride, AutoStride, 0 /*cache_chbegin*/, 1 /*cache_chend*/); Those last params to get_pixels tell the ImageCache that you only want to cache the first channel of the file you're reading from! So now in this iteration, we are doing image processing on one channels (because of the ImageBuf), and we are caching only one channel (because of those cache_chbegin/cache_chend args to get_pixels), but the I/O itself is still kind of expanding the palette scanlines out to 3 channels before copying and storing only one channel into the cache. So the cache *size* is as good as it can be, and so is the I/O, and so is the image processing... but there's a bit of useless copying going on. But maybe that's enough? The only improvement to make on top of that is to fix bmpinput.cpp itself as I described. I'll explain that in the next email, because my burrito delivery just arrived. > On Apr 5, 2021, at 6:49 PM, Phil Miller <philip.mil...@sri.com> wrote: > > Thanks! I greatly appreciate your replies. > > I will put together a small scale test and try things out. So, what is the > right/easy/fast way to convert the 3-channel RGB image that is loaded from > the bmp file into a single channel image? (I want to stay with the underlying > 8-bit per pixel format.) > > As for my use case, yes, my usage will be such that when I return to an image > it will be soon after it is loaded into the cache; and should almost always > be before I will I have filled the cache with so many images it will be > flushed from the cache. > > Regarding memory fragmentation, I have had bad experiences in the past with > fragmentation, but my experience could be dated and no longer relevant for > the current state of compilers/operating systems (with my primary platform > being 64-bit windows 10 with Visual Studio 2019). Also, I am using the [TBB > Scalable Memory > Allocator](https://www.threadingbuildingblocks.org/docs/help/index.htm#tbb_userguide/Scalable_Memory_Allocator.html > > <https://www.threadingbuildingblocks.org/docs/help/index.htm#tbb_userguide/Scalable_Memory_Allocator.html>) > as a drop-in custom allocator injected at the link stage. But, I would > appreciate hearing about other allocators that you would suggest. With my > current tests, I am seeing a few hundred thousand page faults per second, and > this concerns me as my full-up application will need to run for 10-30 days. > > Finally, I would be willing to modify the oiio bmp reader to support a single > channel bmp file. I believe the common interpretation is to have a color > palette with an identity lookup table (256 entries with values 0-255 mapped > to 0-255) is not really a hack – just not your normal 24-bit color bmp. If > you point me in the right direction, I will try to make the change. > > From: Oiio-dev <oiio-dev-boun...@lists.openimageio.org> On Behalf Of Larry > Gritz > Sent: Monday, April 5, 2021 9:12 PM > To: OpenImageIO dev list <oiio-dev@lists.openimageio.org> > Subject: Re: [Oiio-dev] [EXTERNAL] Re: Reading single channel bmp > > ImageCache may still be the right solution. > > How frequently will you reuse the images, and how big is your cache compared > to the image size? > > Let's say that the total cache size you are willing to allocate (with > ImageCache::attribute("max_memory_MB")) can hold N images. And let's also say > that the access pattern is such that after using image A, you will tend to > need to access n different images on average before returning to need A again. > > If n >> N almost always, then the cache is probably not doing you any good, > and you're paying for the overhead of IC without benefit. > > But if with some frequency you return to image A soon enough that you have > touched < n other images in the mean time, so that A is still in the cache, > then you win. It doesn't need to happen all the time. But it's likely that > reading an image from disk is MUCH more expensive than any of the processing > you're doing on each image, so removing the need for even a small percentage > of redundant reads is probably well worth it. > > I think you are worrying needlessly about memory fragmentation. Modern memory > allocators are pretty smart about this. You'll only get fragmentation if you > have an alloc/free pattern where you ask for something big, return it, ask > for a bunch of medium size things that get allocated from the chunk that was > the original allocation, then ask for something big again and need new > memory, and so on. > > If you ask for something big, free it, then ask for another thing of the same > size, then free it, etc. -- memory allocators are really good about that. > Also, they tend to have different arenas for different size ranges, so your > small allocations will never subdivide your very big allocations. If your > default system allocator is being a pig, there are many other choices you can > try that are meant to optimize exactly this kind of thing. Using a custom > memory allocator (I can give you some suggestions) that simply replaces > malloc/free at the link stage without needing any changes to source code at > all, is a much easier solution than tying yourself in knots worrying about > things on the app side. > > I'm also wondering if that n and N relationship we mentioned earlier is such > that worrying about the grayscale trick makes a lot of difference. If it's > right on the cusp, then using 1/3 the cache space could be very helpful, but > if the access pattern is less sensitive, it may be not worth sweating. > > I would try a small scale test of this without thinking about it too hard, > and look at the ImageCache stats output to see how much redundant I/O is > happening (compare bytes of I/O read total, versus bytes in all the unique > images read), and also how many "tiles" are ever allocated versus how many > unique images. If they aren't too out of whack, then just roll with it, > you're already pretty efficient. > > If you *really* think you're right on the edge of having this work well and > are being thwarted only by the grayscale vs RGB thing, I can take a look at > it for you and see if it's quick to implement what I suggested. > > -- lg > > > > On Apr 5, 2021, at 5:46 PM, Phil Miller <philip.mil...@sri.com > <mailto:philip.mil...@sri.com>> wrote: > > Thank you for your reply. > > My access pattern is a combination of the ones you describe. > > I will access all pixels in an image, but I may re-read an image multiple > times before I am finished with it. So I thought the ImageCache would be an > easy way to avoid the cost of the redundant reads. Granted this is overkill, > since I could make my own, separate, whole-image, cache, but I would rather > re-use the proven oiio ImageCache than debug my own. Plus, in the future, I > may have a finer-grained use case where I am using tiled images or want to > access different mipmap levels. > > But, I will need to process all of the images, so there is no way the image > cache can be big enough to hold all of them. What I would like to do is to > have the ImageCache re-use the memory from an expired image rather than free > the old one and alloc a new one. The cost I am concerned about is not the > computation or time cost of the alloc/free cycle, but the memory > fragmentation that may occur. Since I already have a 15 MB block of memory > for an expired image, I would like to re-use that rather than free it and > alloc a new one (that is the exact same size as the one I just freed). This > way, the image reading part of the processing will (eventually) allocate 10 > GB of memory and no more memory allocation will be needed for reading the > rest of the 1 TB of images that I have to process. > > > From: Oiio-dev <oiio-dev-boun...@lists.openimageio.org > <mailto:oiio-dev-boun...@lists.openimageio.org>> On Behalf Of Larry Gritz > Sent: Monday, April 5, 2021 8:09 PM > To: OpenImageIO dev list <oiio-dev@lists.openimageio.org > <mailto:oiio-dev@lists.openimageio.org>> > Subject: Re: [Oiio-dev] [EXTERNAL] Re: Reading single channel bmp > > Whether ImageCache is a good idea depends on what the access pattern is. > > The case that ImageCache is most helpful is for when your access pattern is > visiting and revisiting the same image (out of many many images), and each > time you need to visit the image, you only need to access a few of the > pixels, more or less coherently, and then the next time you need pixels from > that image, it's either likely to be in a short time and needing nearby > pixels, or not for a long time until you need a far-away part of that image. > Also, this whole scheme works best when the images are *tiled*, which BMPs > are not. When the usage patterns align, ImageCache can be used, for example, > for an offline renderer to access 2TB of texture data per frame, spread > across 20,000 texture files, all without using more than a few GB in memory > at any one time (and with maybe only 20% redundant disk reads). > > If you are reading one image at a time, need all the pixels in the image > before moving on to the next image, and then are done with that first image, > then you should not be using ImageCache at all. > > You can still use an ImageBuf if you want, bypassing the ImageCache, by > making sure that when you call ImageBuf::read(), you pass the optional > parameter force=true. But when you free one ImageBuf and make the next one, > it will free and then re-allocate the memory. I assure you, although this > constant alloc/free cycle seems wasteful, it will be a tiny fraction of the > cost of the actual disk I/O and the compute necessary to decompress the pixel > data and you will barely notice it. > > Another route you could take -- again, assuming that all these images are the > same size and you only need one in memory at a time -- is to not use ImageBuf > either. Just allocate your own buffer of the right size, use > ImageInput::read_image to get the pixels into it, process it, and then read > the next image into the same memory buffer without ever freeing it. > > ImageInput, ImageBuf, and ImageCache are different levels of abstraction, > intended for different use cases. > > > > On Apr 5, 2021, at 4:55 PM, Phil Miller <philip.mil...@sri.com > <mailto:philip.mil...@sri.com>> wrote: > > Well, our application is a little newer than horse-drawn wagons, and we do > use bmp files from time to time. This particular usage is for an instrument > that is outputting sensor data at scan locations using bmp images. Bmp images > tend to be simple, easy to use, well supported on both windows and linux, and > (still 😊) popular for certain applications. For my immediate use case, I am > not in control of the format and am just given the images that we need to > process. I have been able to use opencv to load the bmp images, but the > opencv reader is doing a memory allocation for each image that I read. > Consequently, I am churning through way too many memory allocations, as > opencv is doing an alloc, read, and free for each image that I read. What I > am trying to do is have an image cache that resuses images and avoids > unnecessary allocations. So I was looking to see if oiio would be better. > > Since oiio considers bmp files to be an obsolete format (and I concur that > there are many better image formats than bmp), let me ask a different > question: > If I choose to store my images (and their pyramids/mipmaps) in a tif file, > can I use oiio and its ImageCache to re-use a pre-existing memory block to > read the pixel values into? In particular, suppose I am willing to use a 10 > GB image cache. After the cache has filled up and it needs to free an image > before reading another one, will the cache reuse the memory block or free the > existing one and allocate a new one? Does it matter, that is will it help, if > all of my images are the same size? If I can get oiio to reduce the memory > allocations, then I can write a pre-processing application to use opencv to > convert the single channel bmp files to mipmapped tif files and then write > the real image processing app using oiio. > > Thanks for your time! > > From: Larry Gritz <l...@larrygritz.com <mailto:l...@larrygritz.com>> > Sent: Monday, April 5, 2021 6:53 PM > To: OpenImageIO dev list <oiio-dev@lists.openimageio.org > <mailto:oiio-dev@lists.openimageio.org>> > Cc: Phil Miller <philip.mil...@sri.com <mailto:philip.mil...@sri.com>> > Subject: Re: [Oiio-dev] [EXTERNAL] Re: Reading single channel bmp > > Yeah, sorry about that. I fixated so quickly on "is the texture 1->3 channel > conversion somehow going awry" that I failed to notice that you said BMP, > which is an unusual enough file format that there could be something > particular to its reader. > > And indeed, that is the case. > > There's not really such a thing as a BMP "grayscale" image. > > As I understand it, BMP files are either RGB[A] full color per pixel (24 or > 32 bits), or 16 bits per pixel (where those 16 bits encode R, G, and B), or > else are "palette" images. For palette images, the number of bits per pixel > indicates the number of palette entries, e.g., 8 bits means 256 entries in > the color palette, 4 bits means only 16 colors in the palette. But either > way, the palette itself consists of entries of what they call an "RGBQUAD", > so these palette images are reported by OIIO to be RGB images (OIIO hides the > palette itself from you so that apps don't need to worry about it and can > just ask for pixel colors). > > You could make an 8-bit per pixel BMP appear to encode a grayscale image by > making it a palette image in which all the palette entries coincidentally > have their RGB triplets all have equal values. But that is just trickery -- > it's still a color image, just a very, very gray-looking one because of the > special choice of which colors are in the palette. > > I could imagine some surgery of the BMP reader that would make it operate as > follows: The palette is examined for whether every palette entry has the > property that R==G==B, and if so, it could report the file as being a true > 1-channel image and the palette conversion for pixels can just be result = > palette[pixelval][0]. If any of the palette entries have R,G,B not all equal, > then it reports back as if it were a RGB 3-channel image as before. This is > kind of a hack -- it's taking that the BMP space says is a RGB image and > lying to report it back as at 1-channel grayscale, just because "all the RGB > colors in that image look gray enough to me". Still, it's a neat idea, and I > would welcome a patch from anybody who would make this change, but I don't > have time at the moment to do it myself. > > I always thought that BMP was primarily for low-res old Windows images such > as file icons and such, and other stuff from back when images were assembled > by horse-drawn wagons. I'm a little surprised to hear that anybody is dealing > with a MILLION bmp images at 4k resolution. Can I ask what is the application > of this and whether it might be better to use a different file format? Maybe > one that not only has a proper grayscale, but also has better compression > methods than are supported by BMP? > > -- lg > > > > > > On Apr 5, 2021, at 3:11 PM, Larry Gritz <l...@larrygritz.com > <mailto:l...@larrygritz.com>> wrote: > > Oh, bmp specifically, I get it. Hang on, let me look at that case. > > > > > > On April 5, 2021 2:53:56 p.m. PDT, Phil Miller <philip.mil...@sri.com > <mailto:philip.mil...@sri.com>> wrote: > No, when I was debugging, the spec comes out with nchannels=3 instead of 1. > Let me extract some sample reader code that I can share and I will get back > to you. > > From: Oiio-dev <oiio-dev-boun...@lists.openimageio.org > <mailto:oiio-dev-boun...@lists.openimageio.org>> On Behalf Of Larry Gritz > Sent: Monday, April 5, 2021 2:58 PM > To: OpenImageIO dev list <oiio-dev@lists.openimageio.org > <mailto:oiio-dev@lists.openimageio.org>> > Subject: [EXTERNAL] Re: [Oiio-dev] Reading single channel bmp > > There is logic in the TextureSystem so that individual texture lookups will > expand a 1-channel greyscale image to all 3 channels of a 3-channel lookup > (versus using the "fill" color for the second two channels). > > Even when using that TextureSystem option, it's only about what data value it > copies into a user result for that individual texture lookup. It doesn't > actually duplicate any image data in memory in the cache. > > But there is nothing in ImageCache or ImageBuf, as far as I can tell, that > does anything like that. An ImageBuf that references a 1-channel file should > just report as 1 channel. > > Are you doing something that indicates that it's actually replicating the > data? Or are you just worried, based on this option of the texture system, > that something weird might end up happening for an ordinary ImageBuf? > > > > > > > On Apr 5, 2021, at 11:30 AM, Phil Miller <philip.mil...@sri.com > <mailto:philip.mil...@sri.com>> wrote: > > I am having trouble using an ImageBuf with an ImageCache to read a single > channel (grayscale/monochrome) bmp image as a single channel image. The spec > in the internal oiio logic seems to always expand to 3 channels. I can use > the force flag when I read it, but then that bypasses the cache. The images > are fairly large (4k x 3k) and I need to read a million of them, so the > difference between one and three channels is significant. > > I appreciate any help and guidance. > > Sincerely, > Phil > _______________________________________________ > Oiio-dev mailing list > Oiio-dev@lists.openimageio.org <mailto:Oiio-dev@lists.openimageio.org> > http://lists.openimageio.org/listinfo.cgi/oiio-dev-openimageio.org > <https://urldefense.us/v3/__http:/lists.openimageio.org/listinfo.cgi/oiio-dev-openimageio.org__;!!Nv3xtKNH_4uope0!xOYMMZQ_G451K0HfPRe1lqeW_Z4EG0Oep39UNfj0cKUGt0-jxrGFh-9CA29gUrlK$> > > -- > Larry Gritz > l...@larrygritz.com <mailto:l...@larrygritz.com> > > > > > > -- > Larry Gritz > l...@larrygritz.com <mailto:l...@larrygritz.com> > _______________________________________________ > Oiio-dev mailing list > Oiio-dev@lists.openimageio.org <mailto:Oiio-dev@lists.openimageio.org> > http://lists.openimageio.org/listinfo.cgi/oiio-dev-openimageio.org > <https://urldefense.us/v3/__http:/lists.openimageio.org/listinfo.cgi/oiio-dev-openimageio.org__;!!Nv3xtKNH_4uope0!0-H9hVMU1slRpUi7NepKlIBTy-jWTsIhbzrnS5Bn8mOuMUAKBqCUcUgc2CfVaPes$> > > -- > Larry Gritz > l...@larrygritz.com <mailto:l...@larrygritz.com> > > > > > _______________________________________________ > Oiio-dev mailing list > Oiio-dev@lists.openimageio.org <mailto:Oiio-dev@lists.openimageio.org> > http://lists.openimageio.org/listinfo.cgi/oiio-dev-openimageio.org > <https://urldefense.us/v3/__http:/lists.openimageio.org/listinfo.cgi/oiio-dev-openimageio.org__;!!Nv3xtKNH_4uope0!25_cHMQAGvgtRdtA8qrfhZei8-l42yZqB-KFY0uOwXbs-jaYYh-a74e_BAcev3tG$> > > -- > Larry Gritz > l...@larrygritz.com <mailto:l...@larrygritz.com> > > > > > _______________________________________________ > Oiio-dev mailing list > Oiio-dev@lists.openimageio.org <mailto:Oiio-dev@lists.openimageio.org> > http://lists.openimageio.org/listinfo.cgi/oiio-dev-openimageio.org > <https://urldefense.us/v3/__http:/lists.openimageio.org/listinfo.cgi/oiio-dev-openimageio.org__;!!Nv3xtKNH_4uope0!yNF9b4TAIGIMwps14vhz3fxBSAupg4kFzxuetkePBk2nlcH6IUwlXrrZL46wmwI-$> > > -- > Larry Gritz > l...@larrygritz.com <mailto:l...@larrygritz.com> > > > > > _______________________________________________ > Oiio-dev mailing list > Oiio-dev@lists.openimageio.org > http://lists.openimageio.org/listinfo.cgi/oiio-dev-openimageio.org -- Larry Gritz l...@larrygritz.com
_______________________________________________ Oiio-dev mailing list Oiio-dev@lists.openimageio.org http://lists.openimageio.org/listinfo.cgi/oiio-dev-openimageio.org