Hi Sudharma/Tammo,

Kindly review the attached document on my proposed approach. Let me know if you have any concerns or doubts about it.

regards,
sathwik

On Fri, Jun 5, 2015 at 6:56 PM, Sathwik B P <sathwik...@gmail.com> wrote:

> find response inline
>
> On Wed, Jun 3, 2015 at 10:22 PM, sudharma subasinghe <suba...@cse.mrt.ac.lk> wrote:
>
>> Hi,
>>
>> Sorry for the late reply. I was trying to achieve a proper solution. Following is my approach.
>>
>> 1) I elected a master node for deployment purposes. So when a master node goes down, hazelcast will elect the next oldest node as master.
>
> Perfect
>
>> 2) ODEServer can identify whether clustering is enabled or not by getting the property value in the ode-axis2.properties file. So I introduced a new property called "ode-axis2.hazelcast.clustering.enabled". If clustering is not enabled, the server will work as it is. If clustering is enabled, the cluster will be initialized.
>
> Perfect
>
>> 3) In manual deployment, the responsibility is taken by the DeploymentPoller. So I give the deployment capability to the master node's poller by setting "isDeploymentFromODEFileSystemAllowed()" to true. So others will not be able to go into the check() method in DeploymentPoller. So deployment will be done by only the master node.
>
> Perfect
>
>> 4) In DeploymentWebService, I had to consider a few cases. If the deploy request goes to the master node, it will deploy the process through the web service. Other pollers will not go into the check() method as they are not masters. So the master can continue without any involvement of others.
>
> Perfect
>
>> 5) If the deploy request goes to a slave node, it will do up to file creation in the file system. The slave will be stopped at that point. As only the master poller is checking, the master can continue from the created files in the file system.
>
> DeploymentWebService provides synchronous operations. The status of the operation should be communicated to the calling client in the same call.
>
> DeploymentPoller is a backend thread that goes over each and every directory under the Deployment directory, checking for any changes to deploy.xml in existing processes and deploying newly added processes. This process is sequential and time consuming. As the number of process directories grows, so does the time taken for execution of the thread.
>
> Since the request is on a slave node and the processing is done on the master node, how do you check for the completion of the deployment/undeployment of processes and respond back to the client, since the web service call is a synchronous operation? As DeploymentPoller is taking a lot of time in processing, your request will time out, right?
>
>> 6) But there was a problem with _deploymentUnits in ProcessStoreImpl. Each _deploymentUnits stores only what its server has deployed. So think: a master node goes down and another master node appears. But its _deploymentUnits does not have the dus which were deployed by the earlier master node. Hence it will not be able to retire the earlier version of the process which was deployed by the previous master. So there will be two processes which are in "ACTIVE" state.
>>
>> 7) To avoid this, I add the ODEServer as an Observer to check when a new master is elected, then load all the deployment units out of the store. So the new master node can have all the dus and can retire the appropriate version. Usually loadAll() is called at server start-up time. But there is no other way to solve this. I tried to use a Hazelcast IMap to store all dus among all nodes. But it wasn't a success as du is not a serializable object.
>>
>> 8) I figured out that we do not need to send a cluster message to others as all the dus' data are persisted to the shared DB. So each node can take the du and retrieve the necessary data using already implemented methods in Process Store.
>>
>> 9) But there is another problem. The axis2 service corresponding to a deployed process does not appear on all nodes of the cluster. That is because each server adds the du which is deployed by it to the process store. That is why I had to use loadAll() when masters are changing. How to solve this?
>
> I do appreciate your efforts in understanding the implementation and the changes that need to be done. You are bang on it.
> fireEvent(..) is the method that triggers process activation and necessary service creation.
>
> But with this given approach from steps 6 to 8, ODE cannot have at least 2 Active servers for load balancing. You are concentrating on only one active node that will do deployments and cater to process invocations.
> We should also think about scaling ODE to multiple servers to handle load.
>
> What do you think?
>
>> Thank you,
>> Sudharma
>>
>> On 2 June 2015 at 08:51, Sathwik B P <sathwik...@gmail.com> wrote:
>>
>> > Sudharma,
>> >
>> > Any updates?
>> >
>> > regards,
>> > sathwik
>> >
>> > On Fri, May 29, 2015 at 5:26 PM, Sathwik B P <sathwik...@gmail.com> wrote:
>> >
>> > > Sudharma,
>> > >
>> > > Can you elaborate on your option 1).
>> > >
>> > > Response to your option 2).
>> > >
>> > > Process Store is the component that handles process metadata, compilation and deployment in ODE. Integration layers in ODE (Axis2, JBI) use the process store.
>> > > Future implementations of IL for ODE will also use the process store.
>> > > We should not be thinking of moving the process store functionality to the integration layers.
>> > >
>> > > On Thu, May 28, 2015 at 9:33 PM, sudharma subasinghe <suba...@cse.mrt.ac.lk> wrote:
>> > >
>> > >> Hi,
>> > >>
>> > >> I understood the problem within dynamic master/slave configuration. In my approach, when a deployment request is routed to a slave node there will not be a deployment.
>> > >> I suggest two options to avoid it.
>> > >> 1) Have static master/slave configuration only for the deploy process
>> > >> 2) Modify the deployment web service to compile and verify the process and then copy it to the deploy folder irrespective of whether it's a master or slave, then the deployment poller should take care of the deployment
>> > >>
>> > >> On 28 May 2015 at 14:43, Sathwik B P <sathwik...@gmail.com> wrote:
>> > >>
>> > >> > Sudharma,
>> > >> >
>> > >> > We definitely need a master/slave in the hazelcast cluster. This is probably needed for the job migration in the Scheduler to migrate the jobs associated with a down node. Let's hold on this topic for future discussion.
>> > >> >
>> > >> > Going by the explanation where the master/slave nodes have certain predefined tasks to perform is perfectly fine.
>> > >> >
>> > >> > I have this scenario,
>> > >> >
>> > >> > I am using HAProxy as my load balancer and configured 3 nodes in the cluster.
>> > >> >
>> > >> > Node1 - Active
>> > >> > Node2 - Active
>> > >> > Node3 - Backup
>> > >> >
>> > >> > Load balancing algorithm: RoundRobin
>> > >> >
>> > >> > A Backup node (Node3) is one which the load balancer will not route requests to, until one of the Active nodes, i.e. either Node1 or Node2, has gone down.
>> > >> >
>> > >> > All these 3 nodes are part of the hazelcast cluster as well.
>> > >> >
>> > >> > In the hazelcast cluster, assume Node1 is elected as the leader/master and Node2, Node3 as slaves.
>> > >> >
>> > >> > I initiate the deploy operation on the DeploymentWebService, which the load balancer routes to one of the Active nodes in the cluster, let's say it's Node1. Since Node1 is also the master in the hazelcast cluster, deployment is a success.
>> > >> >
>> > >> > I initiate another deploy operation on the DeploymentWebService, which the load balancer routes to the next active node, which is Node2. Since Node2 is a slave in the Hazelcast cluster, what happens to the deployment?
>> > >> >
>> > >> > regards,
>> > >> > sathwik
>> > >> >
>> > >> > On Wed, May 27, 2015 at 10:55 PM, sudharma subasinghe <suba...@cse.mrt.ac.lk> wrote:
>> > >> >
>> > >> > > Hi,
>> > >> > >
>> > >> > > I will explain my approach as much as possible. The oldest node in the hazelcast cluster is elected as the master node. On failure of the master node, the next oldest node will be elected as the master node. This master-slave configuration is just for deployment. When the hazelcast cluster elects the master node, that node becomes the master node for deploying processes. So it will do the deploying of artifacts. If you want to get the idea of electing the master node, please refer to the code which I have put on github. (https://github.com/Subasinghe/ode/tree/ode_clustering)
>> > >> > >
>> > >> > > I identified separate actions which should be followed by the master and slave nodes.
>> > >> > > Actions which are followed by the master node only
>> > >> > > 1) create deployment unit
>> > >> > > 2) set the version number on the deployment unit
>> > >> > > 3) compile deployment unit
>> > >> > > 4) scan deployment unit
>> > >> > > 5) retire previous versions
>> > >> > > Master node and slave nodes should create _processes which stores ProcessConfImpl
>> > >> > > Only the master node will write the version number to the database and create the .deployed file
>> > >> > >
>> > >> > > So there are some actions which should be followed only by the master node while other actions should be followed by all the nodes. The idea of having a master node is to deploy artifacts and prevent others from writing the version number to the database.
>> > >> > > Whether a node is active or passive, all nodes should do the deployment. Master and slaves will follow the necessary actions as above.
>> > >> > >
>> > >> > > On 27 May 2015 at 15:49, Sathwik B P <sathwik...@gmail.com> wrote:
>> > >> > >
>> > >> > > > Nandika,
>> > >> > > >
>> > >> > > > I very well understand what you have put across, but it's secondary to me now.
>> > >> > > >
>> > >> > > > Sudharma,
>> > >> > > > My primary concern is to understand at a high level the deployment architecture and how the master-slave configuration would fit in. Are there any restrictions imposed by the in-progress design?
>> > >> > > >
>> > >> > > > Firstly, how would ODE process deployment work under these cluster configurations?
>> > >> > > >
>> > >> > > > Sample Cluster configurations: A load balancer is frontending the servers.
>> > >> > > > 1) Cluster consisting of 2 nodes all Active-Active.
>> > >> > > > 2) Cluster consisting of 2 nodes Active-Passive.
>> > >> > > > 3) Cluster with 2+ nodes with additional nodes either in Active or Passive.
>> > >> > > >
>> > >> > > > regards,
>> > >> > > > sathwik
>> > >> > > >
>> > >> > > > On Wed, May 27, 2015 at 3:04 PM, Nandika Jayawardana <jayaw...@gmail.com> wrote:
>> > >> > > >
>> > >> > > > > Hi Sathwik,
>> > >> > > > >
>> > >> > > > > According to my understanding, in the clustering scenario, the master node should perform all the deployment actions and the slave nodes also need to perform some deployment actions. For example, the slave nodes also should handle the process ACTIVATED event so that the process configuration is added to the engine and the necessary web services are created, so that when the load balancer sends requests to any node in the cluster, it is ready to accept those requests.
>> > >> > > > >
>> > >> > > > > Regards
>> > >> > > > > Nandika
>> > >> > > > >
>> > >> > > > > On Wed, May 27, 2015 at 12:30 PM, Sathwik B P <sathwik...@gmail.com> wrote:
>> > >> > > > >
>> > >> > > > > > Sudharma,
>> > >> > > > > >
>> > >> > > > > > Where are you going to configure the master-slaves, is it at the web application level or at the load balancer?
>> > >> > > > > >
>> > >> > > > > > regards,
>> > >> > > > > > sathwik
>> > >> > > > > >
>> > >> > > > > > On Tue, May 26, 2015 at 7:42 PM, sudharma subasinghe <suba...@cse.mrt.ac.lk> wrote:
>> > >> > > > > >
>> > >> > > > > > > Hi Tammo,
>> > >> > > > > > >
>> > >> > > > > > > Can you suggest the best method from these to implement? As I first suggested the master-slaves scenario, I think it is easier to implement than the distributed lock scenario. However if you can suggest one from these two, then I can think about it.
>> > >> > > > > > >
>> > >> > > > > > > Thank you
>> > >> > > > > > >
>> > >> > > > > > > On 21 May 2015 at 12:40, Sathwik B P <sathwik...@gmail.com> wrote:
>> > >> > > > > > >
>> > >> > > > > > > > With respect to the hotdeployment,
>> > >> > > > > > > >
>> > >> > > > > > > > We can drop the deployment archive onto the deployment folder. Since the DeploymentPollers are acquiring the distributed lock for the DeploymentUnit, only one of the nodes will get the lock and initiate the deployment. DeploymentPollers on other nodes will fail in acquiring the lock and hence will silently ignore it.
>> > >> > > > > > > >
>> > >> > > > > > > > On Thu, May 21, 2015 at 12:30 PM, Sathwik B P <sathwik...@gmail.com> wrote:
>> > >> > > > > > > >
>> > >> > > > > > > > > Hi Tammo,
>> > >> > > > > > > > >
>> > >> > > > > > > > > The distributed lock acquisition on the DeploymentUnit should be added to both DeploymentWebService and DeploymentPoller.
>> > >> > > > > > > > >
>> > >> > > > > > > > > When a deployment operation is initiated through the DeploymentWebService, the load balancer routes it to any of the available nodes.
>> > >> > > > > > > > >
>> > >> > > > > > > > > On the routed node, the DeploymentWebService acquires the distributed lock. On the remaining nodes the DeploymentPoller will try to acquire the distributed lock and will not get it and hence will silently ignore it.
>> > >> > > > > > > > >
>> > >> > > > > > > > > Once the routed node completes the deployment, it will release the lock. This way we don't have to stall the DeploymentPoller in other nodes.
>> > >> > > > > > > > >
>> > >> > > > > > > > > Does it answer the concerns?
>> > >> > > > > > > > >
>> > >> > > > > > > > > Now, if we give the responsibility of identifying the master node to hazelcast, how do we plan to intimate the load balancer to change its configuration about the master node?
>> > >> > > > > > > > > Assuming there are 3 nodes in the cluster,
>> > >> > > > > > > > > node1 - master
>> > >> > > > > > > > > node2 - slave
>> > >> > > > > > > > > node3 - slave
>> > >> > > > > > > > >
>> > >> > > > > > > > > Node1 goes down, the LB will promote Node2 as master node, but hazelcast might promote Node3 as master node. They are out of sync.
>> > >> > > > > > > > >
>> > >> > > > > > > > > Is this argument valid?
>> > >> > > > > > > > >
>> > >> > > > > > > > > regards,
>> > >> > > > > > > > > sathwik
>> > >> > > > > > > > >
>> > >> > > > > > > > > On Wed, May 20, 2015 at 1:51 PM, Tammo van Lessen <tvanles...@gmail.com> wrote:
>> > >> > > > > > > > >
>> > >> > > > > > > > >> Hi Sudharma,
>> > >> > > > > > > > >>
>> > >> > > > > > > > >> what do you expect from the "other nodes deployment"? Compilation is not needed since the CBP file is written to the (shared) FS. Registration is also not needed, since it is done via the shared database. So the only thing that might be needed is to tell the engine that there is a new deployment. I'd need to check that. If this is needed, I revert my last statement, then it is perhaps better to just send an event over Hazelcast to all nodes that the deployment has changed.
>> > >> > > > > > > > >>
>> > >> > > > > > > > >> Best,
>> > >> > > > > > > > >> Tammo
>> > >> > > > > > > > >>
>> > >> > > > > > > > >> On Wed, May 20, 2015 at 10:13 AM, sudharma subasinghe <suba...@cse.mrt.ac.lk> wrote:
>> > >> > > > > > > > >>
>> > >> > > > > > > > >> > Hi Tammo,
>> > >> > > > > > > > >> >
>> > >> > > > > > > > >> > The master node writes meta data. But runtime information must be available in all nodes. Since the folder is shared, all nodes will see the availability of a new process.
>> > >> > > > > > > > >> > My idea is for the master node to write the meta data and other nodes to just read the meta data and load the process. So we need a small delay between master node deployment and other nodes' deployment.
>> > >> > > > > > > > >> >
>> > >> > > > > > > > >> > Is there any way to set the delay between master node and slaves until the master node finishes the deployment?
>> > >> > > > > > > > >> >
>> > >> > > > > > > > >> > Thank you
>> > >> > > > > > > > >> > Sudharma
>> > >> > > > > > > > >> >
>> > >> > > > > > > > >> > On 20 May 2015 at 13:01, Tammo van Lessen <tvanles...@gmail.com> wrote:
>> > >> > > > > > > > >> >
>> > >> > > > > > > > >> > > Hi Sathwik,
>> > >> > > > > > > > >> > >
>> > >> > > > > > > > >> > > On Wed, May 20, 2015 at 6:40 AM, Sathwik B P <sathwik...@gmail.com> wrote:
>> > >> > > > > > > > >> > >
>> > >> > > > > > > > >> > > > Sudharma/Tammo,
>> > >> > > > > > > > >> > > >
>> > >> > > > > > > > >> > > > 1) How do we plan to decide which is the master node in the cluster?
>> > >> > > > > > > > >> > >
>> > >> > > > > > > > >> > > I think the easiest approach is to always elect the oldest node in the cluster to be the master. AFAIK Hazelcast can easily be asked for this information.
>> > >> > > > > > > > >> > >
>> > >> > > > > > > > >> > > > 2) Don't we need to stall the Deployment Pollers in the slave nodes?
>> > >> > > > > > > > >> > >
>> > >> > > > > > > > >> > > Absolutely.
>> > >> > > > > > > > >> > >
>> > >> > > > > > > > >> > > > Suggestion:
>> > >> > > > > > > > >> > > > I am not sure whether we need Master-Slaves. Why not give every node in the cluster the same status (Active-Active).
>> > >> > > > > > > > >> > > >
>> > >> > > > > > > > >> > > > When a new deployment is made, the load balancer can push it to any of the available nodes. That node will probably acquire a distributed lock on the deployment unit and act as master for that deployment. This ensures optimum usage of the cluster nodes. Probably no static configuration of Master-Slave in the load balancer nor in hazelcast.
>> > >> > > > > > > > >> > >
>> > >> > > > > > > > >> > > But this would not allow us to keep the hotdeployment via filesystem enabled, right?
>> > >> > > > > > > > >> > >
>> > >> > > > > > > > >> > > Best,
>> > >> > > > > > > > >> > > Tammo
>> > >> > > > > > > > >> > >
>> > >> > > > > > > > >> > > --
>> > >> > > > > > > > >> > > Tammo van Lessen - http://www.taval.de
>> > >> > > > > > > > >>
>> > >> > > > > > > > >> --
>> > >> > > > > > > > >> Tammo van Lessen - http://www.taval.de
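
For reference, the "lock per deployment unit" idea from the May 21 messages can be sketched roughly as below. This is only an illustration under stated assumptions: the class name DeploymentLockSketch and the method tryDeploy are hypothetical, nothing here is ODE or Hazelcast code, and an in-memory set stands in for the cluster-wide distributed lock.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical sketch, not ODE or Hazelcast code.
 * A local set stands in for a cluster-wide lock keyed by deployment unit name.
 */
public class DeploymentLockSketch {

    // Names of deployment units that some node is currently deploying.
    private final Set<String> lockedUnits = ConcurrentHashMap.newKeySet();

    /**
     * Tries to take the lock for the given deployment unit and, if successful,
     * runs the deployment action and releases the lock afterwards.
     *
     * @return true if this caller got the lock and ran the action,
     *         false if another caller held the lock (the request is ignored)
     */
    public boolean tryDeploy(String deploymentUnit, Runnable deployAction) {
        // Set.add is atomic here: only one caller can add a given name.
        if (!lockedUnits.add(deploymentUnit)) {
            // Lock not acquired: silently ignore, as the pollers on other nodes would.
            return false;
        }
        try {
            deployAction.run();
            return true;
        } finally {
            // Release the lock once the deployment has completed or failed.
            lockedUnits.remove(deploymentUnit);
        }
    }
}
```

In a real cluster the local set would have to be replaced by a distributed lock provided by Hazelcast; that wiring is not shown. Note also that once the lock is released another node could acquire it, so a real implementation would still need a check that the unit is already deployed (for example the .deployed marker file) before deploying again.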