There have been a number of discussions about J supporting multiple cores or
multiple racks of processors or whatever. A quad core chip is designed to
handle multiple independent or unrelated tasks concurrently. Say you are
working on your PC building a report then send it to a printer. In the
meantime you start a download of something pretty big off the Internet.
While you are waiting for these things to be done you start writing another
report in a word processor. You now have three unrelated tasks being
performed, plus whatever housekeeping the operating system may be doing. The
multiple cores let these tasks run side by side, without the constant
switching between tasks that a single core would need.

The point is that these applications are unrelated to each other. Spreading
a single application over many processors requires effort and care to
identify pieces of the application that are independent and can run in any
order and/or concurrently on multiple cores. This is where J has a big
advantage over other computer languages. It is an array processing language:
things done in loops in other languages are not loops in J. This gives
the J engine the opportunity to easily split the work up. Uses of
conjunctions like & " and @ are obvious candidates for parallel processing.
Scalar verbs like + over large arrays can be distributed as well.
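
To make the splitting concrete, here is a minimal sketch of how an engine
might farm out a scalar verb such as + over a large array. It is written in
Python rather than J so it can be run anywhere. The names split_chunks and
parallel_add, the default worker count and the use of a thread pool are all
assumptions made for illustration; this is not how the J engine works.

```python
from concurrent.futures import ThreadPoolExecutor


def split_chunks(n, parts):
    """Return (start, stop) pairs covering range(n) in contiguous pieces."""
    parts = max(1, min(parts, n)) if n else 1
    base, extra = divmod(n, parts)
    bounds, start = [], 0
    for i in range(parts):
        # The first `extra` chunks get one additional item each.
        stop = start + base + (1 if i < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def parallel_add(x, y, workers=4):
    """Elementwise x + y, computed chunk by chunk on a pool of workers."""
    if len(x) != len(y):
        raise ValueError("length mismatch")

    def add_chunk(bounds):
        start, stop = bounds
        return [a + b for a, b in zip(x[start:stop], y[start:stop])]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        pieces = list(pool.map(add_chunk, split_chunks(len(x), workers)))
    # Chunks are independent, so the results only need joining in order.
    return [v for piece in pieces for v in piece]
```

Threads in CPython will not actually speed up this kind of arithmetic. The
point is only the shape of the decomposition: no chunk depends on any other,
so the pieces can run in any order.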

But there are problems associated with splitting up the work. An automated
process identifying and scheduling the pieces to multiple processors takes
time. If the arrays involved are very large, great! If they are small the
scheduling and coordination overhead is not worth it.
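
A rough way to put numbers on that trade-off is a toy cost model: serial
time is n items times a per-item cost, and parallel time is a fixed
scheduling overhead plus the same work divided across the cores. The sketch
below (Python, with hypothetical function names and made-up timings, not
measurements of J) computes the array size at which splitting starts to pay.

```python
def parallel_time(n, per_item, overhead, cores):
    """Toy model: fixed scheduling overhead plus perfectly divided work."""
    return overhead + n * per_item / cores


def break_even_items(per_item, overhead, cores):
    """Array size above which the toy model says splitting is worthwhile."""
    if cores < 2:
        return float("inf")  # nothing to gain from a single core
    # Solve n * per_item = overhead + n * per_item / cores for n.
    return overhead / (per_item * (1 - 1 / cores))


def should_split(n, per_item, overhead, cores):
    return n > break_even_items(per_item, overhead, cores)


# Assumed figures for illustration only: 1 ns per item, 20 microseconds
# of scheduling overhead, 4 cores.
threshold = break_even_items(1e-9, 20e-6, 4)
```

With those assumed figures the threshold comes out at roughly 27,000 items.
Real overheads and per-item costs would have to be measured, and the model
ignores memory bandwidth and cache effects entirely.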

Processors now have cache memory to reduce accesses to main memory. Keep the
data close and in a small and very fast memory so the processor doesn't have
to wait on the relatively slow main memory. I don't think this cache is
shared between cores of a multi-core processor. Therefore, there has to be
logic to detect updates when one core modifies a part of memory that happens
to reside in another core's cache as well. When such a conflict is detected
the part of cache affected is invalidated and must be refreshed from main
memory. That can't take place until the cached data in the first
core has been stored into main memory, unless there is some logic to
copy directly between cache memories. In any case, such conflicts greatly
reduce throughput.

Take the big tasks mentioned at the start. They each take a long time and
lots of processing. They also have nothing to do with each other, so
conflicts like cache conflicts are minimal. Not so with J. Only operations
on very large arrays take more than a few milliseconds to complete. The
likelihood of conflicts like cache invalidation is very high as we are
dealing with items of data that reside close together and would likely
reside in the cache of several of the cores. In other words, the current
multi-core processors are not designed for the kinds of things that J could
split up.
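
To see how data that sits close together ends up in several caches at once,
the following sketch counts how many cache lines of one array would be
touched by more than one worker. It is Python, not J, and everything in it
is an assumption for illustration: the function name shared_lines, 8-byte
items, 64-byte cache lines, and an array that starts on a line boundary.

```python
def shared_lines(n_items, workers, item_bytes=8, line_bytes=64,
                 interleaved=False):
    """Count cache lines of an array that more than one worker touches."""
    per_line = line_bytes // item_bytes
    block = -(-n_items // workers)  # ceiling division: items per block
    owners = {}
    for i in range(n_items):
        # Interleaved: item i goes to worker i mod workers.
        # Blocked: each worker gets one contiguous run of items.
        worker = i % workers if interleaved else i // block
        owners.setdefault(i // per_line, set()).add(worker)
    return sum(1 for ws in owners.values() if len(ws) > 1)


blocked = shared_lines(1000, 4)
dealt_out = shared_lines(1000, 4, interleaved=True)
```

For 1000 items and 4 workers, contiguous blocks share only the 3 lines at
the block boundaries, while dealing items out one at a time puts every one
of the 125 lines in all four caches. Small arrays are the worst case either
way, since the whole array may fit in a handful of lines.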

Today even most laptops have multi-core chips. There are more and
more reasons for J to exploit that design. But that is not something that
will come quickly, partly because it is a difficult problem to efficiently
split up the work in a generalized way, and the current microprocessors are
flat out not designed for this type of processing. Over time, I suspect that
the chips will start bringing back some of the concepts of the big
convolvers and vector processors that have currently fallen out of favor due
to the cheapness and availability of racks of microprocessors. As that
happens, J should be able to exploit these changes without requiring
rewriting of J applications.

In the meantime, what should we do? We should do as systems like Matlab have
done. Supply a toolbox to make it easy to distribute work. I would suggest a
toolbox using sockets to ship, retrieve and coordinate data between multiple
instances of J. Sockets would work over all platforms where J runs. It could
as easily distribute work over multiple cores in one machine or multiple
machines in a network. For really big problems it could go over the
Internet.
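
As a rough idea of what the ship, retrieve and coordinate cycle could look
like, here is a minimal sketch in Python rather than J. The wire format
(a 4-byte length followed by JSON), every function name, and the use of
threads on the loopback interface as stand-ins for separate J instances are
assumptions made for illustration, not a proposed design.

```python
import json
import socket
import struct
import threading


def send_msg(sock, obj):
    """Send one JSON value, prefixed with its 4-byte big-endian length."""
    payload = json.dumps(obj).encode("utf-8")
    sock.sendall(struct.pack(">I", len(payload)) + payload)


def recv_exact(sock, n):
    """Read exactly n bytes, since recv may return fewer than asked for."""
    buf = b""
    while len(buf) < n:
        part = sock.recv(n - len(buf))
        if not part:
            raise ConnectionError("peer closed the connection early")
        buf += part
    return buf


def recv_msg(sock):
    (size,) = struct.unpack(">I", recv_exact(sock, 4))
    return json.loads(recv_exact(sock, size).decode("utf-8"))


def start_local_worker():
    """Start a stand-in worker on a free loopback port; return its address."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    listener.listen(1)
    address = listener.getsockname()

    def run():
        # Accept one chunk, reply with its sum, then shut down.
        try:
            conn, _ = listener.accept()
            with conn:
                send_msg(conn, sum(recv_msg(conn)))
        finally:
            listener.close()

    threading.Thread(target=run, daemon=True).start()
    return address


def distributed_sum(data, n_workers=2):
    """Ship one chunk to each worker, then collect and combine the replies."""
    addresses = [start_local_worker() for _ in range(n_workers)]
    size = -(-len(data) // n_workers) or 1  # ceiling division, at least 1
    conns = []
    for k, addr in enumerate(addresses):
        conn = socket.create_connection(addr)
        send_msg(conn, data[k * size:(k + 1) * size])
        conns.append(conn)
    total = 0
    for conn in conns:
        with conn:
            total += recv_msg(conn)
    return total
```

Calling distributed_sum(list(range(1, 101)), n_workers=4) ships four chunks
of 25 numbers and adds up the four replies. All chunks are sent before any
reply is read, so the workers can run at the same time. Because only the
address changes, the same coordinator code would work whether a worker is
on another core, another machine on the network, or across the Internet.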

Lots of people have already done this, but for specific applications. What
we need is a generalized addon that anyone could use. I'm game. Any more
takers?
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
