Hi all,

I am Thamayanthy Sripalan, a third-year undergraduate at the University of
Moratuwa.


I am interested in taking on this project [1] for GSoC, as I am keen to learn
about clustering and I have a basic understanding of Apache Axis2, Apache ODE,
WS-BPEL and BPEL4WS. As far as I can tell, three features need to be
implemented in order to cluster the ODE engine:

1. Having and managing a common database/process store for all the nodes


   -

   If a single database is shared among all the nodes, the same process
   definitions must be shared among all the nodes too. If we allow every node
   to deploy processes, running instances might hit a null pointer exception
   because of versioning: when the same process is deployed again, the
   previous version goes to the retired state and the newly deployed one
   becomes the active process. Previously created process instances will
   then be unable to find their process definition, because the deployed
   process name changes when the new version is deployed.


   -

   As a solution, we need to allow only one node (the master node) to
   deploy processes; the other nodes will only read/refer to the deployed
   processes.
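The master-only deployment rule above could be sketched roughly as follows. This is only an illustration under my own assumptions: SharedProcessStore and ClusterNode are hypothetical names, not ODE classes, and the in-memory map stands in for the common database.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the common database/process store shared by all nodes. */
class SharedProcessStore {
    // process name -> currently active version; earlier versions count as retired
    private final Map<String, Integer> activeVersion = new HashMap<>();

    /** Registers a new version of the process and makes it the active one. */
    synchronized int deploy(String processName) {
        int next = activeVersion.getOrDefault(processName, 0) + 1;
        activeVersion.put(processName, next);
        return next;
    }

    /** Returns the active version, or null if the process was never deployed. */
    synchronized Integer activeVersionOf(String processName) {
        return activeVersion.get(processName);
    }
}

/** Hypothetical cluster node: only the master may deploy, every node may read. */
class ClusterNode {
    private final String nodeId;
    private final boolean master;
    private final SharedProcessStore store;

    ClusterNode(String nodeId, boolean master, SharedProcessStore store) {
        this.nodeId = nodeId;
        this.master = master;
        this.store = store;
    }

    int deploy(String processName) {
        if (!master) {
            throw new IllegalStateException(nodeId + " is not the master node and cannot deploy");
        }
        return store.deploy(processName);
    }

    Integer lookup(String processName) {
        return store.activeVersionOf(processName);
    }
}
```

With this shape, a deployment attempt on a non-master node fails fast, while every node sees the same active version through the shared store.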


2. Handling assignments of jobs among the nodes


   -

   While one node is running an instance, another node must not access that
   instance: if two threads/process instances try to access the same job
   entry in the database, there will be a consistency problem.


   -

   To solve this, we can add another database table that holds the
   instance_id and the node that can handle/execute that job, so that the
   jobs assigned to each node can be tracked. If the node a job is assigned
   to fails, there should be a mechanism to distribute that node's jobs
   among the remaining live nodes. I guess we can use Hazelcast to handle
   this.
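A minimal sketch of the assignment table and the failover step, assuming an in-memory map in place of the real database table. JobAssignmentTable and its methods are hypothetical names. How a failed node is detected (for example through cluster membership events) is left out, and the round-robin redistribution is just one possible policy.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-in for the (instance_id, node) table. */
class JobAssignmentTable {
    // instance_id -> id of the node allowed to execute that instance's jobs
    private final Map<Long, String> ownerByInstance = new TreeMap<>();

    synchronized void assign(long instanceId, String nodeId) {
        ownerByInstance.put(instanceId, nodeId);
    }

    synchronized String ownerOf(long instanceId) {
        return ownerByInstance.get(instanceId);
    }

    /**
     * Hands every instance owned by the failed node to the live nodes,
     * round-robin, and returns the ids of the instances that were moved.
     */
    synchronized List<Long> reassignFrom(String failedNode, List<String> aliveNodes) {
        if (aliveNodes.isEmpty()) {
            throw new IllegalArgumentException("no live nodes to take over the jobs");
        }
        List<Long> moved = new ArrayList<>();
        int next = 0;
        for (Map.Entry<Long, String> entry : ownerByInstance.entrySet()) {
            if (entry.getValue().equals(failedNode)) {
                entry.setValue(aliveNodes.get(next % aliveNodes.size()));
                next++;
                moved.add(entry.getKey());
            }
        }
        return moved;
    }
}
```

In a real cluster the reassignment would have to run as a single database transaction on exactly one surviving node, so that two nodes do not both take over the same instance.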


3. Handling a multi-threaded environment


   -

   If one process instance has more than one service to be invoked
   sequentially, those services can be executed on different nodes, so we
   need to allow multiple nodes to access the same process instance's entry
   in the database. In this case we cannot impose the restriction that only
   one node can execute/perform the job.


   -

   To handle this, we can use a distributed lock when executing a job, so
   that only one thread at a time has access to a particular job entry. I
   think this locking mechanism is already implemented in single-node ODE.
   We need to make sure that the distributed lock mechanism still works
   when we do clustering.
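The "only one thread per job entry" rule could look roughly like the sketch below. It is an in-process illustration only: JobLockTable is a hypothetical name, and in a cluster the map would have to be replaced by something visible to all nodes (a database row or a cluster-wide map, for example). It does not describe ODE's existing mechanism.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical per-job lock table: at most one holder per job entry at a time. */
class JobLockTable {
    // job id -> id of the node/thread currently holding the lock
    private final ConcurrentMap<String, String> holderByJob = new ConcurrentHashMap<>();

    /** Returns true if the caller now holds (or already held) the lock for the job. */
    boolean tryAcquire(String jobId, String holderId) {
        String current = holderByJob.putIfAbsent(jobId, holderId);
        return current == null || current.equals(holderId);
    }

    /** Releases the lock only if the caller is the current holder. */
    boolean release(String jobId, String holderId) {
        return holderByJob.remove(jobId, holderId);
    }
}
```

A node would call tryAcquire before touching a job entry and skip the job if it returns false. A real implementation would also need lock expiry, so that locks held by a crashed node do not block the job forever.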


I have also commented on the JIRA issue about these features.

I believe it is feasible to implement these within the given time frame, and I
would be glad to be selected to do this project for GSoC this year.


Can I take the above-mentioned features as the tasks to be achieved in my project?

links:
[1] https://issues.apache.org/jira/browse/ODE-563

Thank you
-- 
Thamayanthy Sripalan
Undergraduate
Department of Computer Science and Engineering
University of Moratuwa.
