I was finally able to do a full import of my photo sets into darktable. I'm importing a bunch of JPG/MRW/ARW files, some with accompanying XMP files, about 25k files in total. What I noticed during the import process was that the time to import each photo grew linearly with the number of photos already imported, so the total time to import N photos grows quadratically. Here are my observations:
 1000 photos: 56s - 0.056 seconds per image
 2000 photos: 2m12s - 0.076 seconds per image
 3000 photos: 3m35s - 0.083 seconds per image
 4000 photos: 5m6s - 0.091 seconds per image
 5000 photos: 6m42s - 0.096 seconds per image
 6000 photos: 8m30s - 0.108 seconds per image
 7700 photos: 13m3s - 0.161 seconds per image
 9000 photos: 16m56s - 0.179 seconds per image
10100 photos: 19m54s - 0.162 seconds per image
11600 photos: 23m52s - 0.159 seconds per image
12700 photos: 27m7s - 0.177 seconds per image
13700 photos: 30m4s - 0.177 seconds per image
15000 photos: 34m11s - 0.19 seconds per image
16000 photos: 37m34s - 0.203 seconds per image
17000 photos: 41m36s - 0.242 seconds per image
18500 photos: 48m57s - 0.294 seconds per image
20000 photos: 55m52s - 0.277 seconds per image
21000 photos: 1h0m42s - 0.29 seconds per image
22100 photos: 1h6m10s - 0.298 seconds per image
23000 photos: 1h10m32s - 0.291 seconds per image
24000 photos: 1h16m52s - 0.38 seconds per image
24500 photos: 1h19m22s - 0.3 seconds per image

The first figure on each row is the elapsed time since the start of the import; the per-image figure is the average over the images imported since the previous row.

A linear regression shows that the time per image is 0.045 + N*1.14e-5 seconds (r^2 = 0.94), growing linearly with the number of images already imported (N).

This was on a Lenovo X220t running Ubuntu 12.04 inside a VirtualBox VM. The host is Windows 7. Darktable was constantly at 100% CPU (single-threaded, not using the second CPU), while its IO struggled to get above 1MB/s even though the disk everything was on can do 30MB/s (USB 2.0 attached, tested with hdparm).

Probably something in the import process is doing a pass over all the already imported images for every new image that shows up. This is incredibly inefficient. As an exercise, here's how long it will take on my machine to import larger collections:

 25000 (mine): 1.3 hours
 50000 (2x): 4.6 hours
150000 (filling 2TB HD): 37.8 hours
300000 (filling 4TB HD): 147.3 hours (6.1 days)

As far as I can tell this only grows with the size of the import job. If you do two 25000-image imports it will only take 1.3x2 hours, not 4.6.
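For anyone who wants to check the arithmetic behind those estimates, here is a minimal sketch. It assumes the total time for one job is the integral of the fitted per-image cost, i.e. 0.045*N + 1.14e-5*N^2/2 seconds, and that the counter N resets at the start of each import job (which is what I observed, not something I verified in the code). The helper names total_seconds and hours are made up for this example. Because it uses the rounded coefficients quoted above, the two largest estimates come out slightly lower than my figures (roughly 37.5 and 146 hours).

```python
# Fitted per-image cost from the regression: t(n) = A + B * n seconds,
# where n is the number of images already imported in the current job.
A = 0.045    # seconds, fixed cost per image (rounded fit coefficient)
B = 1.14e-5  # extra seconds per image already imported (rounded fit coefficient)


def total_seconds(n_images: int) -> float:
    """Total time for one import job: integral of A + B*n for n from 0 to n_images."""
    return A * n_images + B * n_images ** 2 / 2


def hours(seconds: float) -> float:
    """Convert seconds to hours."""
    return seconds / 3600


if __name__ == "__main__":
    # Single-job estimates for the collection sizes discussed above.
    for n in (25_000, 50_000, 150_000, 300_000):
        print(f"{n:>7} images in one job: {hours(total_seconds(n)):.1f} hours")

    # Assumption: the per-image cost depends only on the current job,
    # so two 25000-image jobs cost twice one 25000-image job.
    split = 2 * total_seconds(25_000)
    single = total_seconds(50_000)
    print(f"two jobs of 25000: {hours(split):.1f} hours")
    print(f"one job of 50000:  {hours(single):.1f} hours")
```

The quadratic term dominates quickly, which is why splitting a large import into several smaller jobs looks so much cheaper under this model.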
If that is indeed the case something is wrong in the import process, probably because the normal use case for it is small 100-200 image rolls at a time.

Cheers,
Pedro

_______________________________________________
Darktable-users mailing list
https://lists.sourceforge.net/lists/listinfo/darktable-users
