On Tue, 2009-09-29 at 22:59 +0200, Mark wrote:
> hehe that was the idea indeed ^_^ and i will continue with that.
> I will test the large factors tomorrow.
>
> For now i'm happy with 100% cpu usage on all my cores (4).
> with the code posted in my previous message i only had 70% cpu usage
> so there was a bottleneck and it wasn't the HDD nor the CPU.
> Now that's fixed with giving each thread more then one (5 actually)
> images of it's own before locking and refilling the queue of 5 so now
> there is 100% cpu usage in the multi threaded benchmark.
>
> http://codepad.org/PKnp69qW

Ok, I tested this a bit, and my results are not the same as yours.

I tested on a directory with 1348 jpeg files, each around 5 megapixels,
totalling 3.1 gig of data. Before each test I ran (as root):

sync; echo 3 > /proc/sys/vm/drop_caches

This flushes the caches, for two reasons: to make the tests comparable
(i.e. same cache status), and to make the test realistic (nobody
thumbnails 3 gig of files that are all in the cache).

Your test is scaling to size 200, which is not the thumbnail size (128),
but let's ignore that for now.

// GLib Thumbnailing Benchmark

There is a bug in the benchmark, where it saves the original pixbuf
rather than the thumbnailed one, making this very slow. When I fixed
this I got this timing:

real 3m40.876s
user 3m19.667s
sys 0m2.542s

Same test, but using gnome_desktop_thumbnail_scale_down_pixbuf():

real 3m34.784s
user 3m13.926s
sys 0m2.479s

So, for me gnome_desktop_thumbnail_scale_down_pixbuf() is ~3% faster
(which makes some sense, as it's using a simpler algorithm). Did you
compile your benchmark app with full optimization? (Since you have an
in-line copy of the scale_down_pixbuf function this is required.)

(The rest of the tests are all run with gdk_pixbuf_scale_simple for
easy comparison.)

// Glib more rapid thumbnailing benchmark

real 1m56.650s
user 1m24.030s
sys 0m2.622s

Here we can see that the jpeg loading trick really helps us.

// Glib threaded thumbnailing

My machine has 2 cores, not 4 as yours. With the default 4 threads:

real 2m2.194s
user 1m25.437s
sys 0m2.982s

Changed to use two threads:

real 1m53.783s
user 1m25.948s
sys 0m2.966s

If we use the same number of threads as cpus we go slightly faster
(roughly 2.5% less wall-clock time than the single-threaded run).
However, if we use more threads than that, things are actually slower.
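
For anyone who wants to redo the arithmetic on the timings above, here is a minimal sketch in C. The helper names parse_time_seconds() and percent_saved() are made up for this illustration and are not part of the benchmark or of any library; the sketch only assumes the "XmY.YYYs" format that `time` prints.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: parse a `time`-style duration such as
 * "3m40.876s" into seconds. Returns a negative value if the string
 * does not start with "<minutes>m<seconds>". */
double parse_time_seconds(const char *s)
{
    int minutes;
    double seconds;

    if (sscanf(s, "%dm%lfs", &minutes, &seconds) != 2)
        return -1.0;
    return minutes * 60.0 + seconds;
}

/* Hypothetical helper: percentage of time saved when a run drops
 * from `before` seconds to `after` seconds. */
double percent_saved(double before, double after)
{
    assert(before > 0.0);
    return (before - after) / before * 100.0;
}
```

Fed with the "real" values above, percent_saved() gives about 2.8% for 3m40.876s against 3m34.784s, about 2.5% for 1m56.650s against 1m53.783s, and about 6.9% for 2m2.194s against 1m53.783s.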

I've got 4 gigs of memory, so not everything will fit in the cache, but
the caches would probably help a bit. To verify this I ran the same
two-thread example without blowing the caches first:

real 1m36.681s
user 1m21.610s
sys 0m2.501s

So, slightly better, and we can see that the real time is getting nearer
to the user time, which means that less time was spent waiting on disk.
However, I'm not sure how interesting a cached benchmark is. Nobody will
thumbnail the same files twice.

Now, what does this mean for Nautilus....

Well, nautilus loads the files using gdk-pixbuf io-based resizing, which
is essentially what "Glib more rapid" does, i.e. it uses the jpeg
loading trick and scales using gdk_pixbuf_scale_simple. It calls
gnome_desktop_thumbnail_scale_down_pixbuf() only when an external
thumbnailer returns an oversize result (i.e. very seldom).

Given the above result this is not ideal. The ideal would be to use the
jpeg loader trick but then downscale with
gnome_desktop_thumbnail_scale_down_pixbuf(), although that is hard to
implement given the pixbuf APIs.

Nautilus uses only one thread for thumbnailing, and upping this to the
number of cpus of the machine could gain us a slight advantage, at the
risk of starving the rest of nautilus by the increase in i/o traffic.

_______________________________________________
gtk-devel-list mailing list
gtk-devel-list@gnome.org
http://mail.gnome.org/mailman/listinfo/gtk-devel-list