On Thu, Apr 3, 2014 at 3:41 PM, Pedro Côrte-Real <pe...@pedrocr.net> wrote:
> On Wed, Apr 2, 2014 at 9:08 PM, Pedro Côrte-Real <pe...@pedrocr.net> wrote:
>> Having read through the code in more detail here's a possible
>> suggestion on how to do the minimum possible thing that may just work:
>>
>> Leave the DT_MIPMAP_F and DT_MIPMAP_FULL levels just as they are.
>> For levels DT_MIPMAP_0 through DT_MIPMAP_3:
>> 1) whenever an image is about to be removed from the cache write it
>> out to disk before
>> 2) whenever you have a cache miss try to see if the image is on disk
>> before recreating it from the original image
>> 3) whenever an image gets changed remove it from the disk
>> 4) potentially change the sizes so that DT_MIPMAP_F can be a large
>> size and yet the thumbnail levels be smaller (say 800x600 or lower)
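
In C-ish terms, those four steps boil down to something like the sketch
below. This is only a rough illustration of the intended flow: the
DT_MIPMAP_* level names mirror the real ones, but thumb_t and the
disk_cache_* / generate_from_raw helpers are made up for this sketch and
don't exist in the tree.

/* Illustration only: the DT_MIPMAP_* level names mirror the real ones;
   thumb_t and the disk_cache_* / generate_from_raw helpers are made up
   for this sketch and don't exist in the tree. */
#include <stdbool.h>
#include <stdint.h>

typedef struct thumb_t { int width, height; uint8_t *buf; } thumb_t;

enum { DT_MIPMAP_0, DT_MIPMAP_1, DT_MIPMAP_2, DT_MIPMAP_3,
       DT_MIPMAP_F, DT_MIPMAP_FULL };

bool disk_cache_read(int level, uint32_t imgid, thumb_t *out);       /* hypothetical */
void disk_cache_write(int level, uint32_t imgid, const thumb_t *t);  /* hypothetical */
void disk_cache_remove(int level, uint32_t imgid);                   /* hypothetical */
int  generate_from_raw(int level, uint32_t imgid, thumb_t *out);     /* hypothetical */

/* 1) before an entry is evicted from the in-memory cache, write it out */
void evict_thumbnail(int level, uint32_t imgid, const thumb_t *thumb)
{
  if (level <= DT_MIPMAP_3)  /* DT_MIPMAP_F and DT_MIPMAP_FULL stay memory-only */
    disk_cache_write(level, imgid, thumb);
  /* ...then free the in-memory slot as before */
}

/* 2) on a cache miss, try the on-disk copy before recreating from the raw */
int fetch_thumbnail(int level, uint32_t imgid, thumb_t *out)
{
  if (level <= DT_MIPMAP_3 && disk_cache_read(level, imgid, out))
    return 0;                                   /* cheap: read back from disk */
  return generate_from_raw(level, imgid, out);  /* expensive: full reprocess */
}

/* 3) when an image gets changed, drop its stale on-disk thumbnails */
void invalidate_thumbnail(uint32_t imgid)
{
  for (int level = DT_MIPMAP_0; level <= DT_MIPMAP_3; level++)
    disk_cache_remove(level, imgid);
}

Step 4 is just a question of what resolutions the thumbnail levels get, so
it doesn't show up in the sketch.
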
Went ahead and built this:

https://github.com/pedrocr/darktable/tree/diskcache

> Here are the results with thumbnails being calculated from half-size raws
>
> 10 images - 7.117s cold, 0.505s hot
> 100 images - 74.056s cold, 1.257s hot
> 1000 images - 1439.562s cold, 1446.595s hot

Here are the same results now. It's on the same machine, with DT_MIPMAP_1
also at 256 slots but at a slightly lower resolution (I forgot I had capped
the size of DT_MIPMAP_3 to 640x480 vs the 800x600 of the previous test).

10 images - 8.319s cold, 0.833s hot
100 images - 85.701s cold, 1.83s hot
1000 images - 1597.838s cold, 8.637s hot, 1.366s hotter (a third run)

So now the cache stays helpful even with a large number of images. We're
paying a penalty for that in the cold case by spending the time to store
the thumbnails on disk. The "hotter" case is a third run, after the first
two runs have saved all 1000 images to disk. The first run only saves the
first 744 and leaves the other 256 in the memory cache, which is serialized
to disk on exit. On the second run those 256 are loaded from disk and then
slowly evicted and written out as we load the other 744 back from disk. On
the third run all 1000 are already on disk, so no writing is needed.

The gist of it is that we're paying something like 10-15% overhead in the
cold case to get up to a ~1100x speedup in the hot case. This setup has a
USB2-attached disk, so the overhead is potentially overestimated, since
disk writes are slower than they should be. On the other hand the second
test is not completely comparable because of my 640x480 vs 800x600 screwup.

Would love some feedback on the code. In particular, I'm not sure the
locking in dt_cache_read_get is still correct around the calls to
dt_cache_filebacked_tryget. Since I'm changing the bucket data, maybe I
need a dt_cache_bucket_write_lock? Or is the fact that we're still setting
up the bucket at that point enough that we don't need a write lock?

Brian, if you want to test this, the github branch should compile cleanly
and probably won't eat your data. I wouldn't trust it without good backups
though; I've only done very minimal testing. I haven't hooked up cache
invalidation to this either, so if you change the size settings it's
probably wise to do a "rm -fr ~/.cache/darktable/*" before restarting
darktable.

Cheers,

Pedro
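
P.S. In case anyone wants to check the rough arithmetic behind those
figures: comparing against the previous run, the 1000-image cold case goes
from 1439.562s to 1597.838s, about 11% slower (the 10 and 100 image sets
come out around 16-17% slower, though those also changed resolution), while
the 1000-image hot case goes from 1446.595s to 1.366s on the third run, a
factor of roughly 1060.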