This touches on the more general area of distributed computing,
which has been addressed in many different ways: custom solutions
(like the one below), COM+ and ORB application servers, message
queues, load balancers, and more recently .NET Workflow Foundation
and BPEL process-management solutions.

A common misconception is to treat the problem as a choice of
tools rather than with a systems approach, which leads to conflicts
of interest and vendor lock-in. As a result, simple and efficient
solutions are overlooked and systems are overdesigned.

For example, many distributed problems can be solved with a very
simple and familiar tool: a web server, which is more widely
available, reliable, well understood, and well instrumented than
any other distributed system. Indeed, it has already been put to
good use in Web Services, which in turn are becoming overgrown
with complicated protocols, sub-standards, and library tools that
drown the simplicity in layers of vendor-based "technologies",
"open" or Java "standards" included.
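To make the "web server as distributed tool" idea concrete, here is a
minimal sketch of a worker endpoint: a tiny HTTP server that runs a
named task and returns the result as JSON. The task registry and the
/run?task=NAME URL scheme are assumptions for illustration only; in
the setup discussed in this thread, each task would invoke a J verb.

```python
# Minimal sketch: a web server as a job runner. Hypothetical task
# names stand in for the independent J verbs being dispatched.
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse

# Hypothetical stand-ins for the real work (e.g. J verbs).
TASKS = {
    "sum100": lambda: sum(range(100)),
    "max3": lambda: max(3, 1, 4),
}

class TaskHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Extract the task name from /run?task=NAME
        name = parse_qs(urlparse(self.path).query).get("task", [""])[0]
        if name not in TASKS:
            self.send_error(404, "unknown task")
            return
        body = json.dumps({"task": name, "result": TASKS[name]()}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *_):  # silence per-request logging
        pass

def serve(port):
    """Start the worker in a background thread; return the server object."""
    server = HTTPServer(("127.0.0.1", port), TaskHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Nothing here is exotic: the standard library's HTTP machinery supplies
the reliability and instrumentation the text alludes to.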

The C++/UDP approach is probably neat, but for tasks that run for
minutes, the subsecond overhead of a J CGI invocation is not a
problem. With asynchronous client web calls (like XMLHttpRequest
as in AJAX, another tool-abused solution), not only can several
instances be launched, but their completion can be synchronized
and their results collected.
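The client side of that pattern, launching several calls concurrently,
waiting for all of them to complete, and collecting the results, can
be sketched as follows. The /run?task= URL scheme in run_remote is an
assumption carried over for illustration; any slow, independent call
fits the run_all pattern.

```python
# Sketch: launch several task invocations in parallel, synchronize on
# completion, and collect results in the original order.
import json
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

def run_remote(base_url, task):
    """Invoke one task on a worker web server and return its result.
    (Assumed URL scheme; hypothetical, for illustration.)"""
    with urlopen(f"{base_url}/run?task={task}") as resp:
        return json.load(resp)["result"]

def run_all(call, tasks):
    """Run call(task) for each task in parallel threads; block until
    all complete and return results in the original task order."""
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        return list(pool.map(call, tasks))
```

Because the calls overlap in time, total latency approaches that of the
slowest task rather than the sum of all of them, which is exactly the
external-I/O compensation the note below describes.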

(Note, the example below is not so much about speeding up J
across CPUs as about compensating for the latencies of external I/O.)

That said, C++/UDP would still be a good approach in a situation
with numerous requests per second.
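For comparison, the UDP one-way ("fire and forget") invocation described
in the quoted message below can be sketched in a few lines: the sender
returns as soon as the datagram leaves, so per-request overhead stays
tiny even at high request rates. The message format (a bare method name)
is an assumption; in the original setup a C++ service on the receiving
end launched a fresh J session per message.

```python
# Sketch: one-way UDP invocation. No reply, no acknowledgement,
# best-effort delivery -- the trade-off for minimal overhead.
import socket

def send_job(method, host="127.0.0.1", port=9999):
    """Fire-and-forget send of a method name as a UDP datagram."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(method.encode(), (host, port))

def receive_jobs(sock, count):
    """Receiver side: read `count` datagrams from a bound socket,
    each naming a method to launch (e.g. a new J session)."""
    return [sock.recv(1024).decode() for _ in range(count)]
```

Note what is lost relative to the HTTP version: no completion signal and
no result channel, which is why it suits high-volume dispatch better
than result collection.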

A final note: although C++ (another tool) bears the spell
of low-level access and high performance, in most cases J
can do the same in a comparable time frame.


--- Alex Rufon <[EMAIL PROTECTED]> wrote:

> I badly needed a "parallel" capability in J a year ago. My MRP
> calculation was taking 40 minutes to complete. The whole process is
> actually 8 verbs that need to be executed. Each of the methods was
> independent of the others and did not depend on any prior method to
> execute.
> 
> My solution to this was sockets using UDP one-way invocation. So I got
> two PCs running (you can do this on the same machine): the web server,
> which holds the "client", and the application server, where I wrote
> a C++ service that accepts UDP messages and creates a new instance of
> J. I basically gave each method its own J session and ran it
> independently. It worked; the whole thing on the server completes in
> a few minutes while the client thinks it's done already. :P
> 
> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Skip Cave
> Sent: Wednesday, February 14, 2007 5:45 AM
> To: General forum
> Subject: Re: [Jgeneral] speeding up J
> 
> Raul Miller wrote:
> > What does "first" mean in the context of "parallel"?  (Does it mean
> > anything different than "randomly choosen"?)
> >   
> Skip says:
> 
> In the case of my "parallel" operation definition, "first" has a very 
> specific meaning. The parallel function I described is defined such that
> it is impossible to start two or more parallel operations at the same 
> time. Of course, many parallel operations can be _executing_ at the same
> time, but each of the operations must start in a sequential manner. By 
> definition, all parallel tasks started by the parallel function will 
> have a well defined, and well known, order of execution start. This 
> establishes a "priority" of parallel tasks, which helps to prevent the 
> types of conflicts that can arise when multiple tasks attempt to work on
> the same data at the same time. So when I say "first", I mean the order 
> of execution start, which defines priority, nothing more. In your
> example:
> 
>    'A B C'=: exp"0(4 4 4)
>    parallel each A;B;C
> 
> The order of starting of the functions A, B, and C must be defined. If 
> we say the order is start A, then start B, then start C, that defines 
> the start order. The actual delays between the consecutive starts could 
> be a few nanoseconds. Thus if A is started first, then any reference to 
> variables accessed or modified in A by B or C would be blocked until A 
> completed operations on that variable. It would be the job of the 
> interpreter to make sure this rule was enforced. If there are common 
> variables, the B & C processes would have to wait on the A process to 
> finish operating on the shared variable before they could operate on it.
> 
> Raul Miller wrote:
> > For that matter, what does "lock" mean?  Does that mean that any attempt
> > to read that variable must wait until the other expression is complete?
> >   
> Skip says:
> The lower-priority parallel process doesn't necessarily have to wait 
> until the higher-priority process has completely finished to access a 
> shared variable, but the lower-priority task at least must wait until 
> the higher-priority process has completely finished with its work on 
> the shared variable before it can proceed.
> 
> Raul Miller wrote:
> > Or does it mean that writes to the variable are atomic?  
> Skip says:
> No. Using atomic writes to prevent conflicts would be a much more
> complex problem.
> 
> Raul Miller wrote:
> 
> > [Do you see how these can be different?]  Or does it mean something
> > else?
> 
> Skip says:
> Yes, they are obviously different, and the second one is prohibited.
> 
> 
> Raul Miller wrote:
> > What happens when thread A is waiting on thread B, but thread B is
> > waiting on thread A?
> Skip says:
> This will never happen with the parallel function, because the "first" 
> started parallel task, in this case A, will always preempt the other 
> started tasks. A will never wait on B, as A has priority on all variable
> operations that may conflict between A & B (or C). B will always wait on
> A if there are shared data operations because of this priority. 
> Similarly, B will never wait on C, as B has priority on all variable 
> operations that may conflict between B & C. C, being the low man on the 
> totem pole, will have to wait on anybody who is working on a variable 
> that they share in common.
> 
> This encourages the programmer to break up the data being worked on in 
> such a way as to prevent shared access to the same data, if the 
> programmer wants 
> to optimize parallelism. Otherwise, the program will run sequentially, 
> with little or no gain in execution speed. I learned this priority trick
> to prevent parallel conflicts years ago, working on  real-time parallel
> schemes for character recognition engines.
> 
> > Raul Miller wrote:
> > Consider:
> >
> >  a=:a,b
> >
> > What happens here in a parallel context?  (Remember that J allocates
> > up to double the strictly necessary space for arrays to make
> > this expression fast in the typical case.)
> >   
> Skip says:
> 
> Since this example does not use the parallel function, there will be no 
> parallel execution of the assignment.
> >>> (Would it be acceptable to make J substantially slower in
> >>> contexts where it's currently fast so that it can run multiple
> >>> threads?)
> >>>       
> >>   
> >> No. The interpreter should be designed to attempt parallelization
> >> only if it will significantly speed up the execution of a
> >> specific primitive. 
> >>     
> >
> > I thought we were discussing your "parallel" operation, hinted at
> > above?  Are you saying that certain primitives would throw an
> > error if they were used in one of these threads?
> >   
> You are justified in being confused here. I was mixing discussions. My 
> response was about the parallelization of primitives. If primitives in J
> were to support parallelism, then they should only resort to the 
> parallelism when it benefited performance, and not impact performance if
> parallelism were not used. The parallelism should be invisible, and no 
> error should be thrown if parallelism was not deemed efficient enough to
> use by the interpreter. A slight degradation in overall J performance (<
> 0.1%) would be acceptable to deal with the overhead in making the 
> parallel/no-parallel decision.
> 
> Moving to the coarse parallelism discussion, the parallel operation 
> should be a separate function in J, and should not impact the execution 
> of any code that was not requested to be run in parallel by the 
> programmer, using the parallel function.
> 
> Skip Cave



 
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
