On Tue, 11 Jan 2011 07:13:47 -0600, Stan Hoeppner wrote:

> Camaleón put forth on 1/10/2011 2:11 PM:
>
>> I used a VM to get the closest environment to what you seem to have
>> (a low resource machine) and the above command (timed) gives:
>
> I'm not sure what you mean by resources in this context. My box has
> plenty of resources for the task we're discussing. Each convert
> process, IIRC, was using 80MB on my system. Only two can run
> simultaneously. So why queue up 4 or more processes? That just eats
> memory uselessly for zero decrease in total run time.
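(Side note: anyone who wants to check that per-process memory figure
on their own box can ask GNU time for it. A quick sketch, assuming
/usr/bin/time from Debian's "time" package and a placeholder file
name "photo.jpg":

  # Run one convert under GNU time's verbose mode; the report
  # includes a "Maximum resident set size (kbytes)" line, which is
  # the peak memory of that single process.
  $ /usr/bin/time -v convert photo.jpg -resize 800 resized.jpg

Multiplying that figure by the -P value gives a rough idea of how
much RAM a given level of parallelism would commit.)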
I supposed you wouldn't care much about getting a script to run faster
with all the available cores "occupied" if you had a modern (<4 years
old) CPU and plenty of speedy RAM, because the routine you wanted to
run should not take much time... unless you were going to process
"thousands" of images :-)

(...)

> I just made two runs on the same set of photos but downsized them to
> 800x600 to keep the run time down. (I had you upscale them to 3072x2048
> as your CPUs are much newer)
>
> $ time for k in *.JPG; do convert $k -resize 800 $k; done
>
> real 1m16.542s
> user 1m11.872s
> sys 0m4.104s
>
> $ time for k in *.JPG; do echo $k; done | xargs -I{} -P2 convert {}
> -resize 800 {}
>
> real 0m41.188s
> user 1m14.837s
> sys 0m4.812s
>
> 41s vs 77s: the parallel run takes 53% of the serial time, a ~47%
> decrease. In this case there is insufficient memory bandwidth as
> well. The Intel BX chipset supports a single channel of PC100 memory
> for a raw bandwidth of 800MB/s. Image manipulation programs will eat
> all available memory bandwidth. On my system, running two such
> processes allows ~400MB/s to each processor socket, starving the
> convert program of memory access.
>
> To get close to _linear_ scaling in this scenario, one would need
> something like an 8 core AMD Magny Cours system with quad memory
> channels, or whatever the Intel platform is with quad channels. One
> would run with xargs -P2, allowing each process ~12GB/s of memory
> bandwidth. This should yield 90-100% scaling efficiency, i.e. close
> to the ideal halving of the run time.
>
>> Running more processes than real cores seems fine, did you try it?
>
> Define "fine".

Fine = the system is not hogging all the resources.

> Please post the specs of your SUT, both CPU/mem subsystem and OS
> environment details (what hypervisor and guest). (SUT is IBM speak
> for System Under Test).

I didn't know the meaning of that "SUT" term...

The test was run on a laptop (Toshiba Tecra A7) with an Intel Core Duo
T2400 (in brief: 2M cache, 1.83 GHz, 667 MHz FSB; full specs¹) and
4 GiB of RAM (DDR2). The VM is VirtualBox (4.0) with Windows XP Pro as
the host and Debian Squeeze as the guest. The VM was set up to use the
2 cores and 1.5 GiB of system RAM. The disk controller is emulated via
ICH6.

>>> Linux is pretty efficient at scheduling multiple processes among cores
>>> in multiprocessor and/or multi-core systems and achieving near linear
>>> performance scaling. This is one reason why "fork and forget" is such
>>> a popular method used for parallel programming. All you have to do is
>>> fork many children and the kernel takes care of scheduling the
>>> processes to run simultaneously.
>>
>> Yep. It handles the processes quite nicely.
>
> Are you "new" to the concept of parallel processing and what CPU process
> scheduling is?

No... I guess this is quite similar to what most daemons do when they
run in the background and launch several instances (like "amavisd-new"
does), but I didn't think there was a direct relation between the
number of running daemons/processes and the cores available in the
CPU. I mean, I thought the kernel would automatically handle all the
available resources as best it can, regardless of the number of cores
in use.

¹ http://ark.intel.com/Product.aspx?id=27235

Greetings,

-- 
Camaleón
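P.S. Since the thread keeps coming back to "fork and forget", here is
a minimal sketch of the idea in plain bash, assuming the same *.JPG
naming as Stan's example. It forks one child per image, so with
"thousands" of images you would want the throttled xargs form instead:

  #!/bin/bash
  # Fork one convert per image and forget about it; the kernel
  # schedules the children across whatever cores are available.
  for k in *.JPG; do
      convert "$k" -resize 800 "$k" &
  done
  wait   # collect every forked child before exiting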
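P.P.S. The xargs variant from Stan's test can also be sized to the
machine automatically instead of hard-coding -P2; nproc is in
coreutils >= 8.1, so Squeeze has it:

  # printf emits one file name per line, which is what -I{} expects;
  # -P"$(nproc)" runs as many converts in parallel as there are cores.
  $ printf '%s\n' *.JPG | xargs -I{} -P"$(nproc)" convert {} -resize 800 {}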