Folks, another heads-up ... I had some old hobby code that would scan folders for image files, extract the details of the file and image and update a SQLite DB to create an "index" of all my images. I haven't run the utility for a few years, but now I have about 4000 images and running it afresh was taking at least 30 minutes to read all the files. It's slow because I have to read all the file bytes to load the Image and take an MD5 hash of newly added files (it's much faster on subsequent scans when it knows most files haven't changed).
Coincidentally, the latest MSDN Magazine has an article titled The Past, Present and Future of Parallelizing .NET Applications <http://msdn.microsoft.com/en-us/magazine/hh335070.aspx>, which reminded me of the System.Threading.Tasks.Parallel class and its many For and ForEach overloads.

To use the Parallel class properly you have to discipline yourself to do two things:

(1) Make sure the items to be worked on are exposed as an IEnumerable<T>, with the heavyweight "work" done per item in the loop body.
(2) Be thread safe by locking whatever the work updates (obviously).

Thing 1 is the important one, because you must adjust your coding style so that heavyweight jobs are expressed as a sequence of independent items. Once you do that you can just call Parallel.ForEach on the sequence and bingo, it runs parallelised. My image scan now takes about 10 minutes instead of 30-plus, and in Task Manager I can see all 6 CPUs pumping electrons.

There are apparently simple techniques for cancelling parallelised work (Parallel.ForEach accepts a ParallelOptions carrying a CancellationToken, for example), but I haven't tried that yet.

Cheers
Greg
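
P.S. A minimal sketch of the pattern described above, for anyone who wants to see its shape. This is not the actual utility: the names ImageScanSketch, FindImages, Md5Hex and HashAll are made up for illustration, the extension list is an assumption, and the results go into an in-memory Dictionary rather than SQLite. It also passes a CancellationToken through ParallelOptions, which is one of the documented ways to cancel a parallel loop.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Security.Cryptography;
using System.Threading;
using System.Threading.Tasks;

public static class ImageScanSketch
{
    // Hypothetical helper: lazily yields image paths under a root folder.
    // The extension list is an assumption, adjust to taste.
    public static IEnumerable<string> FindImages(string root)
    {
        var extensions = new HashSet<string>(StringComparer.OrdinalIgnoreCase)
            { ".jpg", ".jpeg", ".png", ".gif", ".bmp" };
        return Directory.EnumerateFiles(root, "*", SearchOption.AllDirectories)
                        .Where(p => extensions.Contains(Path.GetExtension(p)));
    }

    // The heavyweight per-file work: read every byte and take an MD5 hash.
    public static string Md5Hex(string path)
    {
        using (var md5 = MD5.Create())
        using (var stream = File.OpenRead(path))
        {
            return BitConverter.ToString(md5.ComputeHash(stream)).Replace("-", "");
        }
    }

    // Hashes all paths in parallel and returns path -> hash.
    // Throws OperationCanceledException if the token is cancelled mid-loop.
    public static Dictionary<string, string> HashAll(
        IEnumerable<string> paths, CancellationToken token)
    {
        var results = new Dictionary<string, string>();
        var gate = new object();
        var options = new ParallelOptions { CancellationToken = token };

        Parallel.ForEach(paths, options, path =>
        {
            string hash = Md5Hex(path); // runs on several threads at once

            lock (gate)                 // Dictionary is not thread safe
            {
                results[path] = hash;
            }
        });

        return results;
    }

    public static void Main(string[] args)
    {
        string root = args.Length > 0 ? args[0] : ".";
        using (var cts = new CancellationTokenSource())
        {
            // Calling cts.Cancel() from another thread would stop the loop.
            foreach (var pair in HashAll(FindImages(root), cts.Token))
            {
                Console.WriteLine("{0}  {1}", pair.Value, pair.Key);
            }
        }
    }
}
```

The lock only wraps the dictionary update, so the expensive file read and hash still run concurrently. A ConcurrentDictionary would avoid the explicit lock; the sketch keeps the lock because that is the approach described above.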
