Toy Testing SCA performance on a Cluster

Giorgio Zoppi Thu, 20 Dec 2007 03:07:46 -0800

Hi,
I ran a small performance test on my SCA application over
binding-axis2-sca. It was a toy test: each job computes sin(x) 6M
times, and the jobs arrive as a stream serialized with XML; next week
I'm going to try with Serializable. The stream length was 1024 jobs,
each one computing sin(x) 6M times.
Computing sin(x) 6M times locally takes 390 ms on my laptop (Core Duo
1.6 GHz, Linux bogomips 3192.21). Now let's look at a workpool with 4
workers in the same cluster. My first attempt resolved a
CallableReference and created a WorkerComponent proxy for every item,
because the WorkpoolService doesn't know how many WorkerComponents are
in the system (WorkerComponents are managed by the WorkpoolManager).
However, resolving a CallableReference and creating a proxy for a
remote component is an expensive operation. It takes:


Get service from reference =181 ms.

So for 1024 jobs you lose about 3 minutes (1024 x 181 ms is roughly
185 s) just resolving CallableReferences. Things get worse with 12
workers (1 worker per host): a 12-worker run that works this way is
slower than an 8-worker run. So the only feasible way is to resolve
once and cache the result. Moreover, a workpool has good locality.
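The "resolve once and cache" idea can be sketched as below. This is only a minimal illustration, not the code used in the test: ReferenceCache and its resolver function are hypothetical names, the resolver stands in for whatever expensive lookup produces the worker proxy, and the sketch assumes Java 8 or later for computeIfAbsent and lambdas.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

/** Resolve-once cache: the expensive resolver runs at most once per key. */
final class ReferenceCache<K, V> {
    private final ConcurrentMap<K, V> cache = new ConcurrentHashMap<>();
    // Hypothetical stand-in for the expensive step (resolving a reference
    // and building a proxy); it is NOT an SCA or Tuscany API.
    private final Function<K, V> resolver;

    ReferenceCache(Function<K, V> resolver) {
        this.resolver = resolver;
    }

    /** Returns the cached value, resolving it only on the first request. */
    V get(K key) {
        return cache.computeIfAbsent(key, resolver);
    }

    /** Drops a cached entry, e.g. when a worker leaves the pool. */
    void invalidate(K key) {
        cache.remove(key);
    }
}
```

With a cache like this, a stream of 1024 jobs over 4 workers would pay the resolution cost 4 times instead of 1024 times, assuming the workers do not change during the run.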
First question: is there a way to improve CallableReferenceImpl?
If I put the 4 workers on the same host (4 different JVMs), using
reference caching with a stream of 1024 jobs, I get an
AverageTime = 193.9258300223905 ms.
The AverageTime is computed in the following way: AverageTime =
(AverageTime + (previousCallTime - actualCallTime)) / #calls (where
*CallTime = System.currentTimeMillis()), before fetching a new job
from a LinkedBlockingQueue. When the stream ends, the Elapsed Time is
198969 ms. In the cluster things go a bit slower, about 220 ms/job.
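The kind of bookkeeping described above can be sketched as follows. This is not the code used in the test: it keeps a standard incremental mean of the interval between successive job fetches, which is only one possible reading of the formula above, and the class name IntervalAverager and its methods are hypothetical. Timestamps are passed in explicitly so the logic can be checked without a real clock.

```java
/** Running mean of the time between successive job fetches (hypothetical helper). */
final class IntervalAverager {
    private long previous = -1;    // timestamp of the previous fetch, -1 before the first one
    private long intervals = 0;    // number of intervals observed so far
    private double average = 0.0;  // running mean of the intervals, in ms

    /** Record the timestamp (ms) taken just before fetching the next job. */
    void record(long nowMillis) {
        if (previous >= 0) {
            intervals++;
            // Incremental mean: avg_n = avg_(n-1) + (x_n - avg_(n-1)) / n
            average += ((nowMillis - previous) - average) / intervals;
        }
        previous = nowMillis;
    }

    double averageMillis() {
        return average;
    }

    long count() {
        return intervals;
    }
}
```

In a worker loop one would call record(System.currentTimeMillis()) just before each take() on the LinkedBlockingQueue.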
In the cluster, when we move from 4 to 8 workers, we see an
AverageTime of about 120 ms/job. So I suppose that with a run of 8M
sin(x) iterations per job (about 500 ms), the clustered environment
would do better than a single JVM.
On a single box, using Java 5's Executors.newFixedThreadPool(4) and
CompletionService, I get an Elapsed Time = 183104 ms. Doing the same
thing with a different number of workers, I get (Linux
2.6.20-16-generic-SMP):
4 workers --> compute time = 183104 ms
8 workers --> compute time = 177182 ms
12 workers --> compute time = 181378 ms
16 workers --> compute time = 180121 ms
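
A single-box run of this kind might look like the sketch below. It is an assumption-laden illustration rather than the benchmark itself: SinPool, sinJob and run are hypothetical names, the job sums sin(x + i) so that the result is actually used, and the lambda syntax assumes Java 8 or later (with Java 5 an anonymous Callable would be needed).

```java
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class SinPool {
    /** One job: evaluate sin `iterations` times and return the sum so the work is used. */
    static double sinJob(double x, int iterations) {
        double acc = 0.0;
        for (int i = 0; i < iterations; i++) {
            acc += Math.sin(x + i);
        }
        return acc;
    }

    /** Runs `jobs` jobs on `workers` threads and returns the elapsed wall-clock time in ms. */
    static long run(int workers, int jobs, int iterations) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        CompletionService<Double> completion = new ExecutorCompletionService<>(pool);
        long start = System.currentTimeMillis();
        try {
            for (int j = 0; j < jobs; j++) {
                final double x = j;
                completion.submit(() -> sinJob(x, iterations));
            }
            for (int j = 0; j < jobs; j++) {
                completion.take().get(); // blocks until the next job has finished
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        } catch (ExecutionException e) {
            throw new RuntimeException(e.getCause());
        } finally {
            pool.shutdown();
        }
        return System.currentTimeMillis() - start;
    }

    public static void main(String[] args) {
        // Small sizes so the sketch finishes quickly; the test above used
        // 1024 jobs of 6M iterations each.
        System.out.println("elapsed ms = " + run(4, 16, 100000));
    }
}
```

On a two-core laptop, going past 2 or 4 threads should not be expected to help much for a CPU-bound job like this, which is consistent with the nearly flat compute times listed above.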

In a cluster environment I get:
4 workers (one JVM per node) --> Elapsed Time = 222043 ms
8 workers (one JVM per node, with cached callable references) --> Elapsed Time = 119140 ms
12 workers (one JVM per node, with cached callable references) --> Elapsed Time = 90148 ms
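
As a quick sanity check on these figures, the elapsed times can be turned into per-job times and ratios against the 4-worker run. The sketch below uses only the numbers listed above and the 1024-job stream length; ClusterStats and its methods are hypothetical names. It gives roughly 217, 116 and 88 ms per job, in line with the "about 220 ms/job" and "about 120 ms/job" figures. Since the 4-worker row is not labeled as using cached references, the ratios may mix the effect of caching with the effect of adding workers.

```java
final class ClusterStats {
    static final int JOBS = 1024; // stream length reported in the test

    /** Average wall-clock time per job, in ms. */
    static double perJobMillis(long elapsedMillis) {
        return (double) elapsedMillis / JOBS;
    }

    /** How many times faster a run is than the baseline run. */
    static double ratio(long baselineMillis, long elapsedMillis) {
        return (double) baselineMillis / elapsedMillis;
    }

    public static void main(String[] args) {
        int[] workers = {4, 8, 12};
        long[] elapsed = {222043L, 119140L, 90148L}; // ms, from the list above
        for (int i = 0; i < workers.length; i++) {
            System.out.printf("%d workers: %.1f ms/job, %.2fx vs 4 workers%n",
                    workers[i], perJobMillis(elapsed[i]), ratio(elapsed[0], elapsed[i]));
        }
    }
}
```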
In my SCA runtime, I have a cache in order to avoid the costs of
the domain's pull model.
Cheers,
Giorgio.
