Hi Sathwik,

I modified the code as you explained. You can go through the code at the following link:
https://github.com/Subasinghe/ode/tree/ODECluster
It would be helpful if you could provide feedback on this. Thank you.

On 10 June 2015 at 12:39, sudharma subasinghe <suba...@cse.mrt.ac.lk> wrote:
> Hi Sathwik,
>
> I tried to implement your logic as well. Here is the link to the code committed so far:
> https://github.com/Subasinghe/ode/tree/ODECluster
>
> I used INFO messages for debugging, as they can be observed easily in the console. When the lock acquired by the poller or the web service is released, the state is set to false in my code. But sometimes it is reported as true, and at that time the second server's value is also true. The related log for each server follows.
>
> Server 1
> 02:12:02,957 INFO [DeploymentPoller] Trying to access the lock for MagicSession
> 02:12:02,962 INFO [HazelcastClusterImpl] ThreadID:63 duLocked value for MagicSession file after locking: true
> 02:12:03,838 INFO [BpelServerImpl] Registered process {http://ode/bpel/unit-test}MagicSessionMain-3.
> 02:12:04,015 INFO [BpelServerImpl] Registered process {http://ode/bpel/responder}MagicSessionResponder-3.
> 02:12:04,015 INFO [DeploymentPoller] Deployment of artifact MagicSession successful: [{http://ode/bpel/unit-test}MagicSessionMain-3, {http://ode/bpel/responder}MagicSessionResponder-3]
> 02:12:04,015 INFO [DeploymentPoller] Trying to release the lock for MagicSession
> 02:12:04,017 INFO [HazelcastClusterImpl] ThreadID:63 duLocked value for MagicSession file after unlocking: true
>
> Server 2
> 02:12:03,119 INFO [DeploymentPoller] Trying to access the lock for MagicSession
> 02:12:04,017 INFO [HazelcastClusterImpl] ThreadID:61 duLocked value for MagicSession file after locking: true
> 02:12:04,017 INFO [DeploymentPoller] Trying to release the lock for MagicSession
> 02:12:04,019 INFO [HazelcastClusterImpl] ThreadID:61 duLocked value for MagicSession file after unlocking: false
>
> Is this possible? There were no conflicts while deploying, though; it worked perfectly.
>
> Thank you
>
> On 7 June 2015 at 11:10, sudharma subasinghe <suba...@cse.mrt.ac.lk> wrote:
>
>> Hi Sathwik,
>>
>> I will consider your approach. Thank you for your effort.
>>
>> On 6 June 2015 at 18:43, Sathwik B P <sathwik...@gmail.com> wrote:
>>
>>> I have attached it to the JIRA:
>>> https://issues.apache.org/jira/browse/ODE-563
>>>
>>> On Sat, Jun 6, 2015 at 5:55 PM, sudharma subasinghe <suba...@cse.mrt.ac.lk> wrote:
>>>
>>> > Hi Sathwik,
>>> >
>>> > I can't find the attached document. Please be kind enough to resend it.
>>> >
>>> > On 6 June 2015 at 15:42, Sathwik B P <sathwik...@gmail.com> wrote:
>>> >
>>> > > Refer to this one, as I have corrected the numbering of the steps.
>>> > >
>>> > > On Sat, Jun 6, 2015 at 3:33 PM, Sathwik B P <sathwik...@gmail.com> wrote:
>>> > >
>>> > >> Hi Sudharma/Tammo,
>>> > >>
>>> > >> Kindly review the attached document on my proposed approach. Let me know if you have any concerns or doubts about it.
>>> > >>
>>> > >> regards,
>>> > >> sathwik
>>> > >>
>>> > >> On Fri, Jun 5, 2015 at 6:56 PM, Sathwik B P <sathwik...@gmail.com> wrote:
>>> > >>
>>> > >>> Find my responses inline.
>>> > >>>
>>> > >>> On Wed, Jun 3, 2015 at 10:22 PM, sudharma subasinghe <suba...@cse.mrt.ac.lk> wrote:
>>> > >>>
>>> > >>>> Hi,
>>> > >>>>
>>> > >>>> Sorry for the late reply. I was trying to work out a proper solution. The following is my approach.
>>> > >>>>
>>> > >>>> 1) I elect a master node for deployment purposes, so when the master node goes down, Hazelcast will elect the next oldest node as master.
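Step 1 above relies on a Hazelcast property worth making explicit: the cluster member list is ordered by join time, with the oldest member first, so every node can decide locally whether it is the master and re-evaluate on membership changes. A minimal sketch against the Hazelcast 3.x API; the class and its wiring are illustrative, not ODE's actual code:

    import com.hazelcast.core.Hazelcast;
    import com.hazelcast.core.HazelcastInstance;
    import com.hazelcast.core.Member;
    import com.hazelcast.core.MemberAttributeEvent;
    import com.hazelcast.core.MembershipEvent;
    import com.hazelcast.core.MembershipListener;

    public class MasterElectionSketch {

        private final HazelcastInstance hz;
        private volatile boolean master;

        public MasterElectionSketch(HazelcastInstance hz) {
            this.hz = hz;
            recompute();
            // Re-evaluate whenever a node joins or leaves the cluster.
            hz.getCluster().addMembershipListener(new MembershipListener() {
                public void memberAdded(MembershipEvent e) { recompute(); }
                public void memberRemoved(MembershipEvent e) { recompute(); }
                public void memberAttributeChanged(MemberAttributeEvent e) { }
            });
        }

        private void recompute() {
            // getMembers() is ordered by age; the first entry is the oldest node.
            Member oldest = hz.getCluster().getMembers().iterator().next();
            master = oldest.localMember();
        }

        public boolean isMaster() {
            return master;
        }

        public static void main(String[] args) {
            HazelcastInstance hz = Hazelcast.newHazelcastInstance();
            System.out.println("master? " + new MasterElectionSketch(hz).isMaster());
        }
    }

When the oldest node dies, the next oldest automatically becomes the first entry on every surviving node, which matches the failover behaviour described in step 1.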
>>> > >>>
>>> > >>> Perfect
>>> > >>>
>>> > >>>> 2) The ODEServer can identify whether clustering is enabled by reading the property value in the ode-axis2.properties file. So I introduced a new property called "ode-axis2.hazelcast.clustering.enabled". If clustering is not enabled, the server works as it does today. If clustering is enabled, the cluster is initialized.
>>> > >>>
>>> > >>> Perfect
>>> > >>>
>>> > >>>> 3) In manual deployment, the responsibility is taken by the DeploymentPoller. So I give the deployment capability to the master node's poller by setting "isDeploymentFromODEFileSystemAllowed()" to true. The others will not be able to enter the check() method in DeploymentPoller, so deployment will be done only by the master node.
>>> > >>>
>>> > >>> Perfect
>>> > >>>
>>> > >>>> 4) In the DeploymentWebService, I had to consider a few cases. If the deploy request goes to the master node, it will deploy the process through the web service. The other pollers will not enter the check() method as they are not masters, so the master can continue without any involvement from the others.
>>> > >>>
>>> > >>> Perfect
>>> > >>>
>>> > >>>> 5) If the deploy request goes to a slave node, it will only go as far as creating the files in the file system; the slave stops at that point. As only the master's poller is checking, the master can continue from the files created in the file system.
>>> > >>>
>>> > >>> The DeploymentWebService provides synchronous operations. The status of the operation should be communicated to the calling client in the same call. The DeploymentPoller is a backend thread that goes over each and every directory under the deployment directory, checking for changes to deploy.xml in existing processes and deploying newly added processes. This process is sequential and time consuming. As the number of process directories grows, so does the execution time of the thread.
>>> > >>>
>>> > >>> Since the request is on a slave node and the processing is done on the master node, how do you check for the completion of the deployment/undeployment of processes and respond back to the client, given that the web service call is a synchronous operation? As the DeploymentPoller takes a lot of time in processing, your request will time out, right?
>>> > >>>
>>> > >>>> 6) But there was a problem with _deploymentUnits in ProcessStoreImpl. Each _deploymentUnits stores only what its own server has deployed. So suppose a master node goes down and another master node appears: its _deploymentUnits does not contain the DUs deployed by the earlier master node. Hence it will not be able to retire the earlier version of a process deployed by the previous master, and there will be two processes in the "ACTIVE" state.
>>> > >>>>
>>> > >>>> 7) To avoid this, I added the ODEServer as an Observer to detect when a new master is being elected, and then load all the deployment units out of the store.
>>> > >>>> So the new master node can have all the DUs and can retire the appropriate version. Usually loadAll() is called only at server start-up time, but there is no other way to solve this. I tried to use a Hazelcast IMap to store all the DUs across all nodes, but that did not work, as a DU is not a serializable object.
>>> > >>>>
>>> > >>>> 8) I figured out that we do not need to send a cluster message to the others, as all the DUs' data is persisted to the shared DB. So each node can take the DU and retrieve the necessary data using the methods already implemented in the Process Store.
>>> > >>>>
>>> > >>>> 9) But there is another problem. The axis2 service corresponding to a deployed process does not appear on all nodes of the cluster. That is because each server adds only the DUs it deployed itself to the process store. That is why I had to use loadAll() when masters change. How can this be solved?
>>> > >>>
>>> > >>> I do appreciate your efforts in understanding the implementation and the changes that need to be done. You are bang on it. fireEvent(..) is the method that triggers process activation and the necessary service creation.
>>> > >>>
>>> > >>> But with the given approach from steps 6 to 8, ODE cannot have at least 2 active servers for load balancing. You are concentrating on only one active node that does deployments and caters to process invocations. We should also think about scaling ODE to multiple servers to handle load.
>>> > >>>
>>> > >>> What do you think?
>>> > >>>
>>> > >>>> Thank you,
>>> > >>>> Sudharma
>>> > >>>>
>>> > >>>> On 2 June 2015 at 08:51, Sathwik B P <sathwik...@gmail.com> wrote:
>>> > >>>>
>>> > >>>> > Sudharma,
>>> > >>>> >
>>> > >>>> > Any updates?
>>> > >>>> >
>>> > >>>> > regards,
>>> > >>>> > sathwik
>>> > >>>> >
>>> > >>>> > On Fri, May 29, 2015 at 5:26 PM, Sathwik B P <sathwik...@gmail.com> wrote:
>>> > >>>> >
>>> > >>>> > > Sudharma,
>>> > >>>> > >
>>> > >>>> > > Can you elaborate on your option 1)?
>>> > >>>> > >
>>> > >>>> > > Response to your option 2):
>>> > >>>> > >
>>> > >>>> > > The Process Store is the component that handles process metadata, compilation and deployment in ODE. The integration layers in ODE (Axis2, JBI) use the process store, and future IL implementations for ODE will also use it. We should not be thinking of moving the process store functionality into the integration layers.
>>> > >>>> > >
>>> > >>>> > > On Thu, May 28, 2015 at 9:33 PM, sudharma subasinghe <suba...@cse.mrt.ac.lk> wrote:
>>> > >>>> > >
>>> > >>>> > >> Hi,
>>> > >>>> > >>
>>> > >>>> > >> I understood the problem with the dynamic master/slave configuration. In my approach, when a deployment request is routed to a slave node, there will not be a deployment. I suggest two options to avoid it:
>>> > >>>> > >> 1) Have a static master/slave configuration only for the deployment process.
>>> > >>>> > >>
>>> > >>>> > >> 2) Modify the deployment web service to compile and verify the process and then copy it to the deploy folder, irrespective of whether it is a master or a slave; the deployment poller should then take care of the deployment.
>>> > >>>> > >>
>>> > >>>> > >> On 28 May 2015 at 14:43, Sathwik B P <sathwik...@gmail.com> wrote:
>>> > >>>> > >>
>>> > >>>> > >> > Sudharma,
>>> > >>>> > >> >
>>> > >>>> > >> > We definitely need a master/slave in the hazelcast cluster. This is probably needed for the job migration in the Scheduler, to migrate the jobs associated with a node that has gone down. Let's hold this topic for future discussion.
>>> > >>>> > >> >
>>> > >>>> > >> > Going by the explanation where the master/slave nodes have certain predefined tasks to perform is perfectly fine.
>>> > >>>> > >> >
>>> > >>>> > >> > I have this scenario:
>>> > >>>> > >> >
>>> > >>>> > >> > I am using HAProxy as my load balancer and have configured 3 nodes in the cluster.
>>> > >>>> > >> >
>>> > >>>> > >> > Node1 - Active
>>> > >>>> > >> > Node2 - Active
>>> > >>>> > >> > Node3 - Backup
>>> > >>>> > >> >
>>> > >>>> > >> > Load balancing algorithm: RoundRobin
>>> > >>>> > >> >
>>> > >>>> > >> > A Backup node (Node3) is one that the load balancer will not route requests to until one of the Active nodes, i.e. either Node1 or Node2, has gone down.
>>> > >>>> > >> >
>>> > >>>> > >> > All 3 of these nodes are part of the hazelcast cluster as well.
>>> > >>>> > >> >
>>> > >>>> > >> > In the hazelcast cluster, assume Node1 is elected as the leader/master and Node2, Node3 as slaves.
>>> > >>>> > >> >
>>> > >>>> > >> > I initiate a deploy operation on the DeploymentWebService, which the load balancer routes to one of the Active nodes in the cluster; let's say it's Node1. Since Node1 is also the master in the hazelcast cluster, the deployment is a success.
>>> > >>>> > >> >
>>> > >>>> > >> > I initiate another deploy operation on the DeploymentWebService, which the load balancer routes to the next active node, which is Node2. Since Node2 is a slave in the Hazelcast cluster, what happens to the deployment?
>>> > >>>> > >> >
>>> > >>>> > >> > regards,
>>> > >>>> > >> > sathwik
>>> > >>>> > >> >
>>> > >>>> > >> > On Wed, May 27, 2015 at 10:55 PM, sudharma subasinghe <suba...@cse.mrt.ac.lk> wrote:
>>> > >>>> > >> >
>>> > >>>> > >> > > Hi,
>>> > >>>> > >> > >
>>> > >>>> > >> > > I will explain my approach as fully as possible. The oldest node in the hazelcast cluster is elected as the master node. On failure of the master node, the next oldest node will be elected as the master node.
>>> > >>>> > >> > > This master-slave configuration is just for deployment. When the hazelcast cluster elects the master node, that node becomes the master node for the deployment process, so it will deploy the artifacts. If you want to get an idea of how the master node is elected, please refer to the code I have put on github (https://github.com/Subasinghe/ode/tree/ode_clustering).
>>> > >>>> > >> > >
>>> > >>>> > >> > > I identified separate actions to be performed by the master and slave nodes.
>>> > >>>> > >> > > Actions performed by the master node only:
>>> > >>>> > >> > > 1) create the deployment unit
>>> > >>>> > >> > > 2) set the version number on the deployment unit
>>> > >>>> > >> > > 3) compile the deployment unit
>>> > >>>> > >> > > 4) scan the deployment unit
>>> > >>>> > >> > > 5) retire previous versions
>>> > >>>> > >> > > Both the master node and the slave nodes should create _processes, which stores the ProcessConfImpl. Only the master node will write the version number to the database and create the .deployed file.
>>> > >>>> > >> > >
>>> > >>>> > >> > > So there are some actions to be performed only by the master node, while the other actions should be performed by all the nodes. The idea of having a master node is to deploy the artifacts and to prevent the others from writing the version number to the database. Whether a node is active or passive, all nodes should do the deployment; master and slaves will perform the necessary actions as above.
>>> > >>>> > >> > >
>>> > >>>> > >> > > On 27 May 2015 at 15:49, Sathwik B P <sathwik...@gmail.com> wrote:
>>> > >>>> > >> > >
>>> > >>>> > >> > > > Nandika,
>>> > >>>> > >> > > >
>>> > >>>> > >> > > > I understand very well what you have put across, but it's secondary to me now.
>>> > >>>> > >> > > >
>>> > >>>> > >> > > > Sudharma,
>>> > >>>> > >> > > > My primary concern is to understand, at a high level, the deployment architecture and how a master-slave configuration would fit in. Are there any restrictions imposed by the in-progress design?
>>> > >>>> > >> > > >
>>> > >>>> > >> > > > Firstly, how would ODE process deployment work under these cluster configurations?
>>> > >>>> > >> > > >
>>> > >>>> > >> > > > Sample cluster configurations (a load balancer is fronting the servers):
>>> > >>>> > >> > > > 1) Cluster consisting of 2 nodes, all Active-Active.
>>> > >>>> > >> > > > 2) Cluster consisting of 2 nodes, Active-Passive.
>>> > >>>> > >> > > > 3) Cluster with 2+ nodes, with additional nodes either Active or Passive.
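The master/slave action split Sudharma describes above reduces, on the poller side, to a guard at the top of each polling cycle. A minimal sketch under assumed names: ClusterView and PollerGuard are illustrative, and ODE's actual DeploymentPoller and its isDeploymentFromODEFileSystemAllowed() hook may be shaped differently:

    // Illustrative only: ClusterView stands in for the Hazelcast-backed
    // election (see the MasterElectionSketch earlier in this thread).
    interface ClusterView {
        boolean isMaster();
    }

    class PollerGuard {
        private final ClusterView cluster;

        PollerGuard(ClusterView cluster) {
            this.cluster = cluster;
        }

        // Called on every polling cycle, on every node.
        void poll() {
            // Slaves return immediately; only the elected master scans the
            // deployment directory, writes the version number to the DB and
            // creates the .deployed marker.
            if (!cluster.isMaster()) {
                return;
            }
            check();
        }

        private void check() {
            // scan the deployment directory, deploy new/changed units ...
        }
    }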
>>> > >>>> > >> > > >
>>> > >>>> > >> > > > regards,
>>> > >>>> > >> > > > sathwik
>>> > >>>> > >> > > >
>>> > >>>> > >> > > > On Wed, May 27, 2015 at 3:04 PM, Nandika Jayawardana <jayaw...@gmail.com> wrote:
>>> > >>>> > >> > > >
>>> > >>>> > >> > > > > Hi Sathwik,
>>> > >>>> > >> > > > >
>>> > >>>> > >> > > > > According to my understanding, in the clustering scenario the master node should perform all the deployment actions, and the slave nodes also need to perform some deployment actions. For example, the slave nodes should also handle the process ACTIVATED event, so that the process configuration is added to the engine and the necessary web services are created; then, when the load balancer sends requests to any node in the cluster, it is ready to accept those requests.
>>> > >>>> > >> > > > >
>>> > >>>> > >> > > > > Regards
>>> > >>>> > >> > > > > Nandika
>>> > >>>> > >> > > > >
>>> > >>>> > >> > > > > On Wed, May 27, 2015 at 12:30 PM, Sathwik B P <sathwik...@gmail.com> wrote:
>>> > >>>> > >> > > > >
>>> > >>>> > >> > > > > > Sudharma,
>>> > >>>> > >> > > > > >
>>> > >>>> > >> > > > > > Where are you going to configure the master-slaves: at the web application level or at the load balancer?
>>> > >>>> > >> > > > > >
>>> > >>>> > >> > > > > > regards,
>>> > >>>> > >> > > > > > sathwik
>>> > >>>> > >> > > > > >
>>> > >>>> > >> > > > > > On Tue, May 26, 2015 at 7:42 PM, sudharma subasinghe <suba...@cse.mrt.ac.lk> wrote:
>>> > >>>> > >> > > > > >
>>> > >>>> > >> > > > > > > Hi Tammo,
>>> > >>>> > >> > > > > > >
>>> > >>>> > >> > > > > > > Can you suggest which of these methods is best to implement? As I suggested the master-slaves scenario first, I think it is easier to implement than the distributed lock scenario. However, if you can suggest one of the two, then I can think about it.
>>> > >>>> > >> > > > > > >
>>> > >>>> > >> > > > > > > Thank you
>>> > >>>> > >> > > > > > >
>>> > >>>> > >> > > > > > > On 21 May 2015 at 12:40, Sathwik B P <sathwik...@gmail.com> wrote:
>>> > >>>> > >> > > > > > >
>>> > >>>> > >> > > > > > > > With respect to hotdeployment:
>>> > >>>> > >> > > > > > > >
>>> > >>>> > >> > > > > > > > We can drop the deployment archive onto the deployment folder.
>>> > >>>> > >> > > > > > > > Since the DeploymentPollers acquire the distributed lock for the DeploymentUnit, only one of the nodes will get the lock and initiate the deployment. The DeploymentPollers on the other nodes will fail to acquire the lock and hence will silently ignore it.
>>> > >>>> > >> > > > > > > >
>>> > >>>> > >> > > > > > > > On Thu, May 21, 2015 at 12:30 PM, Sathwik B P <sathwik...@gmail.com> wrote:
>>> > >>>> > >> > > > > > > >
>>> > >>>> > >> > > > > > > > > Hi Tammo,
>>> > >>>> > >> > > > > > > > >
>>> > >>>> > >> > > > > > > > > The distributed lock acquisition on the DeploymentUnit should be added to both the DeploymentWebService and the DeploymentPoller.
>>> > >>>> > >> > > > > > > > >
>>> > >>>> > >> > > > > > > > > When a deployment operation is initiated through the DeploymentWebService, the load balancer routes it to any of the available nodes.
>>> > >>>> > >> > > > > > > > >
>>> > >>>> > >> > > > > > > > > On the routed node, the DeploymentWebService acquires the distributed lock. On the remaining nodes, the DeploymentPoller will try to acquire the distributed lock, will not get it, and hence will silently ignore it.
>>> > >>>> > >> > > > > > > > >
>>> > >>>> > >> > > > > > > > > Once the routed node completes the deployment, it will release the lock. This way we don't have to stall the DeploymentPoller on the other nodes.
>>> > >>>> > >> > > > > > > > >
>>> > >>>> > >> > > > > > > > > Does that answer the concerns?
>>> > >>>> > >> > > > > > > > >
>>> > >>>> > >> > > > > > > > > Now, if we give the responsibility of identifying the master node to hazelcast, how do we plan to tell the load balancer to change its configuration regarding the master node?
>>> > >>>> > >> > > > > > > > > Assuming there are 3 nodes in the cluster:
>>> > >>>> > >> > > > > > > > > node1 - master
>>> > >>>> > >> > > > > > > > > node2 - slave
>>> > >>>> > >> > > > > > > > > node3 - slave
>>> > >>>> > >> > > > > > > > >
>>> > >>>> > >> > > > > > > > > Node1 goes down; the LB will promote Node2 as the master node, but hazelcast might promote Node3 as the master node. They are out of sync.
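The "silently ignore" behaviour described here maps naturally onto Hazelcast's non-blocking tryLock. A minimal sketch against the Hazelcast 3.x ILock API; the lock name scheme and the deploy() hook are illustrative assumptions, not ODE's actual code:

    import com.hazelcast.core.Hazelcast;
    import com.hazelcast.core.HazelcastInstance;
    import com.hazelcast.core.ILock;

    public class DeploymentLockSketch {

        // One distributed lock per deployment unit; the name scheme is an
        // illustrative assumption, not ODE's actual convention.
        static boolean tryDeploy(HazelcastInstance hz, String duName, Runnable deploy) {
            ILock lock = hz.getLock("ode:deploy:" + duName);
            // Non-blocking attempt: whoever gets the lock deploys;
            // every other node silently ignores the unit.
            if (!lock.tryLock()) {
                return false;
            }
            try {
                deploy.run();
                return true;
            } finally {
                lock.unlock();
                // Caution when logging lock.isLocked() after unlock(): it
                // reports cluster-wide state, so another node may have
                // re-acquired the lock in the meantime. That would likely
                // explain the "after unlocking: true" log lines at the top
                // of this thread.
            }
        }

        public static void main(String[] args) {
            HazelcastInstance hz = Hazelcast.newHazelcastInstance();
            boolean deployed = tryDeploy(hz, "MagicSession", new Runnable() {
                public void run() {
                    System.out.println("deploying MagicSession...");
                }
            });
            System.out.println("deployed here: " + deployed);
            hz.shutdown();
        }
    }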
>>> > >>>> > >> > > > > > > > >
>>> > >>>> > >> > > > > > > > > Is this argument valid?
>>> > >>>> > >> > > > > > > > >
>>> > >>>> > >> > > > > > > > > regards,
>>> > >>>> > >> > > > > > > > > sathwik
>>> > >>>> > >> > > > > > > > >
>>> > >>>> > >> > > > > > > > > On Wed, May 20, 2015 at 1:51 PM, Tammo van Lessen <tvanles...@gmail.com> wrote:
>>> > >>>> > >> > > > > > > > >
>>> > >>>> > >> > > > > > > > >> Hi Sudharma,
>>> > >>>> > >> > > > > > > > >>
>>> > >>>> > >> > > > > > > > >> What do you expect from the "other nodes' deployment"? Compilation is not needed, since the CBP file is written to the (shared) FS. Registration is also not needed, since it is done via the shared database. So the only thing that might be needed is to tell the engine that there is a new deployment. I'd need to check that. If it is needed, I revert my last statement; then it is perhaps better to just send an event over Hazelcast to all nodes saying that the deployment has changed.
>>> > >>>> > >> > > > > > > > >>
>>> > >>>> > >> > > > > > > > >> Best,
>>> > >>>> > >> > > > > > > > >> Tammo
>>> > >>>> > >> > > > > > > > >>
>>> > >>>> > >> > > > > > > > >> On Wed, May 20, 2015 at 10:13 AM, sudharma subasinghe <suba...@cse.mrt.ac.lk> wrote:
>>> > >>>> > >> > > > > > > > >>
>>> > >>>> > >> > > > > > > > >> > Hi Tammo,
>>> > >>>> > >> > > > > > > > >> >
>>> > >>>> > >> > > > > > > > >> > The master node writes the meta data, but the runtime information must be available on all nodes. Since the folder is shared, all nodes will see the availability of a new process. My idea is for the master node to write the meta data and for the other nodes to just read the meta data and load the process. So we need a small delay between the master node's deployment and the other nodes' deployment.
>>> > >>>> > >> > > > > > > > >> >
>>> > >>>> > >> > > > > > > > >> > Is there any way to set a delay between the master node and the slaves until the master node finishes the deployment?
>>> > >>>> > >> > > > > > > > >> >
>>> > >>>> > >> > > > > > > > >> > Thank you
>>> > >>>> > >> > > > > > > > >> > Sudharma
>>> > >>>> > >> > > > > > > > >> >
>>> > >>>> > >> > > > > > > > >> > On 20 May 2015 at 13:01, Tammo van Lessen <tvanles...@gmail.com> wrote:
>>> > >>>> > >> > > > > > > > >> >
>>> > >>>> > >> > > > > > > > >> > > Hi Sathwik,
>>> > >>>> > >> > > > > > > > >> > >
>>> > >>>> > >> > > > > > > > >> > > On Wed, May 20, 2015 at 6:40 AM, Sathwik B P <sathwik...@gmail.com> wrote:
>>> > >>>> > >> > > > > > > > >> > >
>>> > >>>> > >> > > > > > > > >> > > > Sudharma/Tammo,
>>> > >>>> > >> > > > > > > > >> > > >
>>> > >>>> > >> > > > > > > > >> > > > 1) How do we plan to decide which is the master node in the cluster?
>>> > >>>> > >> > > > > > > > >> > >
>>> > >>>> > >> > > > > > > > >> > > I think the easiest approach is to always elect the oldest node in the cluster to be the master. AFAIK Hazelcast can easily be asked for this information.
>>> > >>>> > >> > > > > > > > >> > >
>>> > >>>> > >> > > > > > > > >> > > > 2) Don't we need to stall the Deployment Pollers in the slave nodes?
>>> > >>>> > >> > > > > > > > >> > >
>>> > >>>> > >> > > > > > > > >> > > Absolutely.
>>> > >>>> > >> > > > > > > > >> > >
>>> > >>>> > >> > > > > > > > >> > > > Suggestion:
>>> > >>>> > >> > > > > > > > >> > > > I am not sure whether we need Master-Slaves. Why not give every node in the cluster the same status (Active-Active)?
>>> > >>>> > >> > > > > > > > >> > > >
>>> > >>>> > >> > > > > > > > >> > > > When a new deployment is made, the load balancer can push it to any of the available nodes. That node will probably acquire a distributed lock on the deployment unit and act as master for that deployment. This ensures optimum usage of the cluster nodes.
>>> > >>>> > >> > > > > > > > >> > > > Probably no static configuration of Master-Slave, neither in the load balancer nor in hazelcast.
>>> > >>>> > >> > > > > > > > >> > >
>>> > >>>> > >> > > > > > > > >> > > But this would not allow hotdeployment via the filesystem to remain enabled, right?
>>> > >>>> > >> > > > > > > > >> > >
>>> > >>>> > >> > > > > > > > >> > > Best,
>>> > >>>> > >> > > > > > > > >> > > Tammo
>>> > >>>> > >> > > > > > > > >> > >
>>> > >>>> > >> > > > > > > > >> > > --
>>> > >>>> > >> > > > > > > > >> > > Tammo van Lessen - http://www.taval.de
>>> > >>>> > >> > > > > > > > >>
>>> > >>>> > >> > > > > > > > >> --
>>> > >>>> > >> > > > > > > > >> Tammo van Lessen - http://www.taval.de
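Tammo's earlier suggestion, sending an event over Hazelcast to all nodes when a deployment has changed, maps directly onto a Hazelcast ITopic. A minimal sketch against the Hazelcast 3.x API; the topic name and the listener body are illustrative assumptions, not ODE's actual code:

    import com.hazelcast.core.Hazelcast;
    import com.hazelcast.core.HazelcastInstance;
    import com.hazelcast.core.ITopic;
    import com.hazelcast.core.Message;
    import com.hazelcast.core.MessageListener;

    public class DeploymentEventsSketch {

        public static void main(String[] args) {
            HazelcastInstance hz = Hazelcast.newHazelcastInstance();

            // Topic name is an illustrative assumption.
            ITopic<String> topic = hz.getTopic("ode-deployment-events");

            // Every node subscribes; on receipt it would reload the process
            // configuration from the shared database/filesystem and activate
            // the corresponding services.
            topic.addMessageListener(new MessageListener<String>() {
                public void onMessage(Message<String> message) {
                    String duName = message.getMessageObject();
                    System.out.println("Deployment changed: " + duName);
                    // reload + activate duName here
                }
            });

            // The node that performed the deployment publishes the change.
            topic.publish("MagicSession");
        }
    }

Publish-subscribe keeps the other nodes passive: the node that performed the deployment publishes once, and every other node reloads from the shared database, which matches the "tell the engine that there is a new deployment" step Tammo identified.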