Don - I'm glad someone has looked at these as I'm a firm believer in concrete examples, especially in a field as wide-open as parallelization.
In some sense, though, achieving multi-core use is simpler than you might imagine. Take a look at http://www.jsoftware.com/jwiki/NYCJUG/2010-03-09#ExampleofaSimpleParallelization for a proof-of-concept. Here, I simply broke the input dataset into two pieces by hand and ran a separate J session against each piece. This has used both cores every time I've done it and achieved speed-ups of about 40% on an actual problem.

I have since elaborated this example to parallelize (need a better verb here) my problem automatically without moving the files around - I'll be presenting my results at the next NYCJUG, though I may put something up on the wiki before then.

My experiments so far show that simply running multiple J sessions effectively spreads the computational load across the two cores on my machine - this was my suspicion when I created these examples, so they have that underlying assumption. This is why I see Case 1 as a contention problem: I'm envisioning a solution where you have multiple, independent sessions reading the inputs and writing the outputs simultaneously. I believe this should be a relatively simple contention issue as, initially, you probably only have to worry about two processes working on one or two hundred datasets.

The data movement issues are separate but are a big part of any real-life problem - it's come up as an issue here at my work for a grid-computing effort we are undertaking.

Regards,

Devon

On Wed, Mar 17, 2010 at 9:20 AM, Don Guinn <[email protected]> wrote:

> Must apologize for not looking back at Case 2 when I previously wrote. I
> was remembering the issue of making tab delimited files, not the image
> problem.
>
> On the converting to a tab delimited format, I assumed that even though
> it is very compute intensive that it was still faster than the disk. That
> may not be true. If untrue then multiple cores would speed things up. But
> the system optimization of writes could still mess up any timings.
> A while back I was using some standard software to move files around. It
> started and ran pretty quickly. It ended, but the disk busy light stayed
> on for ten or fifteen seconds after the program ended. How long did it
> really take? Was it the five seconds the program ran or the maybe twenty
> seconds before all the disk operations completed?
>
> So about Case 2. Haven't looked at it too closely yet as I wanted to
> correct my earlier remarks. Transposing and rotating arrays, though
> simple to understand, are very memory intensive. Cache memory doesn't
> help much as any problem of decent size exceeds the size of cache memory.
> The CDC 205 had a special instruction to help. IBM vector processors had
> a stride option to help too.
>
> De-speckling is easy to implement in J. Given the tough part, a verb to
> look at each subarray around each point, compute the new value for the
> point, then use it with outfix (\.).
>
> One major difficulty in writing J code to utilize multiple cores is that
> J must become idle for the application code to see interrupts. The J task
> can start several other cores (J sessions) on various pieces of the
> problem, but in order to accept the completion of a core task and start
> the core on another part, J must become idle. So restructuring a problem
> to utilize multiple cores means a total re-write.
>
> I would like to do multi-core applications something like I saw in Pascal
> years ago. A proposed extension called COBEGIN/COEND.
>
> COBEGIN;
>   TASK1;
>   TASK2;
>   TASK3;
> COEND;
>
> The statements within the block must be mutually independent. The
> statements in the block could be run in any order including overlapping.
> Can't do that in J. But doesn't this resemble J's each, rank and outfix?
> Somehow we need to be able to do something similar in J.
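For comparison, here is a minimal sketch of the COBEGIN/COEND idea described above, written in Python rather than J purely as an illustration. The names cobegin, task1, task2 and task3 are hypothetical, and this is not a claim about how it could or should be done in J. It assumes the tasks are mutually independent, starts them all, and blocks until every one has finished.

```python
from concurrent.futures import ThreadPoolExecutor, wait


def cobegin(*tasks):
    """Run mutually independent zero-argument callables, possibly overlapping.

    Results come back in the order the tasks were given, regardless of the
    order in which they actually finished.
    """
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = [pool.submit(task) for task in tasks]
        wait(futures)  # plays the role of COEND: block until all tasks finish
        return [f.result() for f in futures]


# Hypothetical stand-ins for TASK1, TASK2 and TASK3.
def task1():
    return sum(range(10))


def task2():
    return max(3, 7)


def task3():
    return "done"


results = cobegin(task1, task2, task3)
print(results)  # [45, 7, 'done']
```

One caveat: threads in CPython share a single interpreter lock, so this shows the control structure rather than a multi-core speed-up for compute-bound work. Swapping in concurrent.futures.ProcessPoolExecutor is the usual way to spread such work across cores, at the cost of requiring picklable, module-level task functions.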
>
> I would like to write something like:
>
> rotatepicture=:3 : 0
>   '(rotate fread y)fwrite y' startcore y
> )
>
> events=:rotatepicture each {."1 fdir '*.jpg'
> wait events
>
> where startcore schedules the statement to be run and returns a semaphore
> that is set when the statement completes. Then wait hangs until all the
> semaphores are set. Can't do that in J. Or can we?
> ----------------------------------------------------------------------
> For information about J forums see http://www.jsoftware.com/forums.htm
>

--
Devon McCormick, CFA
^me^ at acm.
org is my preferred e-mail
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
