It seems to me that it should be possible to define a simple numerical 
process that would appear, at least at first glance, to benefit from 
parallel operations. We could use that process to examine the actual 
effects of various parallel implementations, and compare execution times 
to a pure single-process J implementation.

The process itself may or may not have any practical value, but it would 
at least be a benchmark to examine how such parallel mechanisms could be 
implemented, as well as allowing timing comparisons between single and 
multi-threaded implementations.

An obvious characteristic of a process that would appear to benefit from 
parallel processing is that its algorithm minimizes movement of data 
between processor cores. A set of in-place operations on a fixed set of 
data would seem to have a good chance of showing significant 
efficiencies using parallel techniques.

So I will make a proposal for such a problem: Create a large vector of 
random integers. Perform a long sequence of operations on each of those 
integers: square them, then take the natural log, then the 
inverse, then the ceiling, etc. Keep going for a while, to give the 
processing units a workout. To simplify result verification, the final 
output of all the manipulations should be the same as the original vector.
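
As a rough illustration of the single-process baseline, here is a minimal 
sketch in Python rather than J. The particular operations, the value range 
1..1000, and the names make_vector and run_pipeline are my assumptions, not 
part of the proposal. Each operation is followed by its inverse so the 
result can be checked against the input; the ceiling step is left out 
because it cannot be undone, and zero is avoided because the log and the 
reciprocal are undefined there.

```python
import math
import random
import time

# Assumed operation chain: every step is followed by its inverse, so the
# whole chain is the identity up to floating-point rounding.
OPS = [
    lambda x: x * x,    # square
    math.sqrt,          # undo the square
    math.log,           # natural log
    math.exp,           # undo the log
    lambda x: 1.0 / x,  # reciprocal
    lambda x: 1.0 / x,  # undo the reciprocal
]

def make_vector(n, low=1, high=1000):
    """Random integers in [low, high]; low >= 1 keeps log and 1/x defined."""
    return [random.randint(low, high) for _ in range(n)]

def run_pipeline(values, repeats=5):
    """Apply each operation to the whole vector, in order, `repeats` times."""
    out = [float(v) for v in values]
    for _ in range(repeats):
        for op in OPS:
            out = [op(x) for x in out]
    # Rounding removes the small floating-point drift before comparison.
    return [round(x) for x in out]

if __name__ == "__main__":
    data = make_vector(100_000)
    start = time.perf_counter()
    result = run_pipeline(data)
    elapsed = time.perf_counter() - start
    print("round trip ok:", result == data, "seconds:", round(elapsed, 3))
```

The timing printed here would be the number to beat for any parallel 
version of the same chain.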

To parallelize this numerical process, the random numbers would be 
divided up equally across the multiple processor cores initially, and no 
more data movement between cores would take place during the processing. 
Subsequent operations on the data items are passed to all of the cores 
sequentially, and the cores each operate on their portions of the data 
using the operations specified. When all the operations are completed, 
the data is re-assembled back to a vector the same size as the original 
vector.
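
The split, process, and re-assemble steps might look something like the 
following Python sketch. The helper names (split, process_chunk, 
run_parallel) and the operation list are assumptions of mine. It uses a 
thread pool only to keep the example self-contained; in CPython, threads 
will not speed up pure arithmetic like this, so a real timing comparison 
would pass a process pool (for example 
concurrent.futures.ProcessPoolExecutor) as executor_cls.

```python
import math
from concurrent.futures import ThreadPoolExecutor

# Assumed operation chain; each step is followed by its inverse.
OPS = [lambda x: x * x, math.sqrt, math.log, math.exp]

def split(values, parts):
    """Split values into `parts` contiguous chunks of near-equal size."""
    size, extra = divmod(len(values), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(values[start:end])
        start = end
    return chunks

def process_chunk(chunk):
    """Run every operation over one chunk; no data leaves the worker."""
    out = [float(v) for v in chunk]
    for op in OPS:
        out = [op(x) for x in out]
    return [round(x) for x in out]

def run_parallel(values, workers=4, executor_cls=ThreadPoolExecutor):
    """Distribute chunks once, process them independently, then re-assemble."""
    chunks = split(values, workers)
    with executor_cls(max_workers=workers) as pool:
        # map() preserves chunk order, so concatenation restores the vector.
        results = list(pool.map(process_chunk, chunks))
    return [x for chunk in results for x in chunk]

if __name__ == "__main__":
    data = list(range(1, 10_001))
    print("round trip ok:", run_parallel(data) == data)
```

Here each worker receives its whole list of operations up front. Sending 
the operations one at a time, as described above, would add a round of 
coordination per operation but would not change how the data is divided.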

In this process, no data movement is made between processor cores. Only 
process commands are passed to each processor, so it can perform the 
next numerical operation on all of its data items. With totally 
independent processors and separate memory, it would seem that at some 
point, parallel processing would beat out single processes.

In reality, whether this process will benefit from parallelization 
depends on the size of the vector, how many processors, how much memory 
is shared between processors, and what the memory bandwidth is for all 
the processors, among other things. My (often faulty) intuition tells me 
that at some vector size and number of in-place operations, parallel 
execution of this process should still start to become more efficient 
than a single-threaded implementation. However, the complexities of 
cache memory, shared memory bandwidth, and many other factors may prove 
my intuition wrong.

In any case, this example would provide a base benchmark for evaluating 
parallel processes on arrays of data. We could try this algorithm on 
various multi-core, multi-threaded, multi-processor schemes with 
various memory architectures, to see what configurations (if any) 
benefit from parallelization. Just my two cents....

Skip Cave

Raul Miller wrote:
> On Mon, Feb 15, 2010 at 1:11 PM, Don Guinn wrote:
>   
>> I don't think it would take much to actually move problems once we have
>> other J sessions available. The socket interface is there. 3!:1 and 3!:2
>> provide the tools to transfer the data in a generalized way. Sockets provide
>> easy notification to coordinate between instances of J. The big problems I
>> see are security and being able to start J remotely. I don't know where to
>> start with them.
>>     
>
> To start J remotely you need to create a "J server" which can
> spawn J sessions on that machine.
>
> To do this securely you need to have some sort of secure connection
> mechanism.  The primary risk you need to defend against is
> unauthorized clients.  The ideal mechanism here involves a
> collection of machines which is not connected to the internet.
>
> A slightly weaker version allows some machines to be connected
> to the internet but blacklists them, so that they are not
> allowed to be clients to the "J server" (they can be servers
> for other J sessions).
>
> Other variations are also possible (for example, giving clients
> secret keys and using them to generate hashes against
> recent timestamps and executable sentences).  In other
> words:
>
> client and server:
>   hash=: md5sum secret,timestamp,sentence
>
> (hash sent from client must match hash generated on server).
> (server also rejects unreasonable timestamps).
>
> Except, I think we have better options than md5sum.
>
> FYI,
>
>   
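
The keyed-hash check quoted above could be sketched as follows, in Python 
rather than J. This is only an illustration under my own assumptions: 
HMAC-SHA256 stands in for the "better options than md5sum", and the names 
sign, verify, and MAX_SKEW_SECONDS, as well as the 30-second window, are 
hypothetical.

```python
import hashlib
import hmac
import time

MAX_SKEW_SECONDS = 30  # hypothetical limit for a "reasonable" timestamp

def sign(secret: bytes, timestamp: int, sentence: str) -> str:
    """Client side: keyed hash over the timestamp and the sentence to run."""
    message = f"{timestamp}\n{sentence}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()

def verify(secret: bytes, timestamp: int, sentence: str, tag: str,
           now: float = None) -> bool:
    """Server side: reject stale timestamps, then compare hashes."""
    now = time.time() if now is None else now
    if abs(now - timestamp) > MAX_SKEW_SECONDS:
        return False
    expected = sign(secret, timestamp, sentence)
    # compare_digest avoids leaking the match position through timing.
    return hmac.compare_digest(expected, tag)

if __name__ == "__main__":
    secret = b"shared-secret"  # placeholder; distribute real keys out of band
    ts = int(time.time())
    tag = sign(secret, ts, "+/ i. 10")
    print("accepted:", verify(secret, ts, "+/ i. 10", tag))
```

A check like this does not by itself stop a captured request from being 
replayed inside the timestamp window; the server would also need to 
remember recently seen hashes to cover that case.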
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
