On Wed, Apr 2, 2014 at 9:08 PM, Pedro Côrte-Real <pe...@pedrocr.net> wrote:
> Having read through the code in more detail, here's a suggestion on how
> to do the minimal thing that may just work:
>
> Leave the DT_MIPMAP_F and DT_MIPMAP_FULL levels just as they are.
> For levels DT_MIPMAP_0 through DT_MIPMAP_3:
> 1) whenever an image is about to be removed from the cache, write it
>    out to disk first
> 2) whenever you have a cache miss, try to see if the image is on disk
>    before recreating it from the original image
> 3) whenever an image gets changed, remove it from the disk
> 4) potentially change the sizes so that DT_MIPMAP_F can be a large
>    size and yet the thumbnail levels be smaller (say 800x600 or lower)
>
> This will make the penalty for a cache miss much lower (a random read
> from disk of a much smaller file than the original image). The cost for
> this is writing the files to disk, but that gets amortized over the life
> of the collection, which should be long, especially if DT_MIPMAP_3 is
> set to a fixed size so that changing the thumbnail sizes doesn't
> invalidate these cached files.
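For concreteness, here is a rough standalone sketch of what (1) and (3) could look like. The file layout, struct and function names are all made up for illustration; this is not the existing dt_mipmap_cache code, just the shape of the idea:

/* Standalone sketch of "spill thumbnails to disk on eviction" (point 1)
 * and "invalidate on change" (point 3). Everything here is hypothetical. */
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

typedef struct thumb_buf_t
{
  int32_t imgid;   /* library image id */
  int mip;         /* mipmap level, 0..3 */
  int width, height;
  uint8_t *pixels; /* width*height*4 bytes, as held by the in-memory cache */
} thumb_buf_t;

/* hypothetical on-disk layout: ~/.cache/darktable/mipmaps/<imgid>-<mip>.raw */
static void thumb_disk_path(char *buf, size_t len, int32_t imgid, int mip)
{
  const char *home = getenv("HOME");
  snprintf(buf, len, "%s/.cache/darktable/mipmaps/%d-%d.raw",
           home ? home : ".", (int)imgid, mip);
}

/* (1) called just before a DT_MIPMAP_0..3 entry is dropped from the cache */
static int thumb_spill_to_disk(const thumb_buf_t *t)
{
  char path[1024];
  thumb_disk_path(path, sizeof(path), t->imgid, t->mip);
  FILE *f = fopen(path, "wb");
  if(!f) return -1;
  /* tiny header so the reader knows the dimensions */
  fwrite(&t->width, sizeof(int), 1, f);
  fwrite(&t->height, sizeof(int), 1, f);
  fwrite(t->pixels, 4, (size_t)t->width * t->height, f);
  fclose(f);
  return 0;
}

/* (3) called whenever the image is edited, so a stale thumbnail never comes back */
static void thumb_invalidate_on_disk(int32_t imgid)
{
  char path[1024];
  for(int mip = 0; mip <= 3; mip++)
  {
    thumb_disk_path(path, sizeof(path), imgid, mip);
    unlink(path); /* a missing file is fine, the error is simply ignored */
  }
}

The write is just a sequential dump of a buffer the in-memory cache already holds, so the eviction path stays cheap.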
To test this out more numerically I created a simple micro-benchmark with darktable-cli. I added a "--loadimages N" option that loads the first N images in the library from the cache, reading them in if needed:

https://github.com/pedrocr/darktable/commit/b23f484fc03e2a0d056c34a81e621675330c46af

I then ran it with these settings:

- Cache size: 200MB
- Max thumbnail size: 800x600
- 1 thread, 500MB tiling memory, 8MB single buffer

With those settings the cache slots come out as:

- DT_MIPMAP_3: 8 slots
- DT_MIPMAP_2: 64 slots
- DT_MIPMAP_1: 256 slots
- DT_MIPMAP_0: 4096 slots

Since DT_MIPMAP_1 is 200x150 I used that level for the test; the idea is to simulate scrolling through a large collection, and that is a reasonable thumbnail size for it.

Cold cache results were taken after deleting ~/.cache/darktable. Hot cache results were run after the cold pass had generated the cache file. All numbers are from "time darktable-cli ..." so they include loading the cache file as well as any disk reads of the images themselves.

Thumbnails computed from the embedded JPEGs:

  10 images -    2.69s cold,    0.491s hot
 100 images -  12.142s cold,    0.707s hot
1000 images - 102.919s cold,  101.166s hot

Thumbnails computed from half-size raws:

  10 images -    7.117s cold,    0.505s hot
 100 images -   74.056s cold,    1.257s hot
1000 images - 1439.562s cold, 1446.595s hot

At 1000 images the cache stops helping. That's because both runs read the images in the same order, so every access on the second run is a cache miss. Reading in reverse order on the second run would give 256 cache hits followed by 744 misses, so the cache would help a little but the run would still be slow. Naturally the penalty for a cache miss is much higher for half-size raws, and as a collection gets modified in darkroom the average user will be closer to the half-size raw results than the embedded JPEG ones.

My full collection right now is 25k images. To fit it all in the in-memory cache at 800x600 max I would have to dedicate 200MB/256*25k = 19.5GB of memory to the cache, which is clearly not feasible. For larger collections, or if I wanted to use my 2K screen (or get a 4K one), it would be even worse.

On the other hand, saving cache levels 0-3 to disk would take about 10GB (not caching the higher levels) for a 300GB collection, so only a 3% overhead. Paging through the whole collection in darktable would then only have to read those 3% from disk, with no reprocessing at all, which should be a 30x or more speedup in the lighttable view. The downside, of course, is having to write those 10GB out to disk, but that happens gradually as files get viewed/changed and only as the in-memory cache spills over.

Comments?

Pedro
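P.S. For completeness, a matching sketch of the miss path in (2), using the same made-up file layout as the eviction sketch above: on a miss the cache first tries one small read from disk and only falls back to reprocessing the original image if that fails.

#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

/* returns a malloc'ed RGBA buffer if the thumbnail was found on disk,
 * or NULL, meaning the caller has to regenerate it from the original image */
static uint8_t *thumb_try_load_from_disk(int32_t imgid, int mip,
                                         int *width, int *height)
{
  char path[1024];
  const char *home = getenv("HOME");
  snprintf(path, sizeof(path), "%s/.cache/darktable/mipmaps/%d-%d.raw",
           home ? home : ".", (int)imgid, mip);
  FILE *f = fopen(path, "rb");
  if(!f) return NULL; /* never spilled (or invalidated) -> recompute as today */
  uint8_t *pixels = NULL;
  if(fread(width, sizeof(int), 1, f) == 1 &&
     fread(height, sizeof(int), 1, f) == 1)
  {
    const size_t bytes = (size_t)(*width) * (*height) * 4;
    pixels = malloc(bytes);
    if(pixels && fread(pixels, 1, bytes, f) != bytes)
    {
      free(pixels);
      pixels = NULL;
    }
  }
  fclose(f);
  return pixels; /* one small random read instead of a full raw decode */
}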