Hi Thanuja, Thanks for the update. It seems Stratos Jenkins build has returned to normal with your last commit [1]. So could this duplicate call to sendMemberActivatedEvent() method have been the culprit all this time?
@Reka: Autoscaler is listening to MemberActivated event and executing this logic at [2] for each event. Will it cause an issue if it is executed multiple times? I cannot think of any other place it could have caused any issue. Even TopologyHandler class used by integration tests do not listen to this event. [1] https://builds.apache.org/job/Stratos-41x-Commit-Build/87/ [2] https://github.com/apache/stratos/blob/stratos-4.1.x/components/org.apache.stratos.autoscaler/src/main/java/org/apache/stratos/autoscaler/status/processor/cluster/ClusterStatusProcessorChain.java#L49 Thanks. On Mon, Oct 12, 2015 at 9:28 PM, Thanuja Uruththirakodeeswaran < thanu...@wso2.com> wrote: > Hi All, > > I did several builds with latest changes. Each local build triggered > successfully with no test failures. But there are RuntimeExceptions. > I tried following scenarios > 1. Reverted metering service changes as in commit [1] - build success > 2. Build stratos 4.1.x branch with latest commit [2] - build success > 3. Build stratos 4.1.x branch without latest commit [2] - tests failed > > For the 3rd scenario, I got test failures as below and I've attached part > of test log here: > > > Tests run: 21, Failures: 4, Errors: 0, Skipped: 0, Time elapsed: 1,499.716 > sec <<< FAILURE! - in TestSuite > testTerminationBehavior(org.apache.stratos.integration.tests.application.GroupStartupOrderTestCase) > Time elapsed: 312.479 sec <<< FAILURE! > java.lang.AssertionError: > at org.testng.AssertJUnit.fail(AssertJUnit.java:59) > at org.testng.AssertJUnit.assertTrue(AssertJUnit.java:24) > at org.testng.AssertJUnit.assertTrue(AssertJUnit.java:33) > at > org.apache.stratos.integration.tests.application.GroupStartupOrderTestCase.assertCreationOfNodes(GroupStartupOrderTestCase.java:339) > at > org.apache.stratos.integration.tests.application.GroupStartupOrderTestCase.testTerminationBehavior(GroupStartupOrderTestCase.java:149) > > testApplication(org.apache.stratos.integration.tests.application.SampleApplicationStartupTestCase) > Time elapsed: 0.351 sec <<< FAILURE! > java.lang.AssertionError: > at org.testng.AssertJUnit.fail(AssertJUnit.java:59) > at org.testng.AssertJUnit.assertTrue(AssertJUnit.java:24) > at org.testng.AssertJUnit.assertFalse(AssertJUnit.java:41) > at org.testng.AssertJUnit.assertFalse(AssertJUnit.java:49) > at > org.apache.stratos.integration.tests.application.SampleApplicationStartupTestCase.testApplication(SampleApplicationStartupTestCase.java:105) > > testApplicationList(org.apache.stratos.integration.tests.application.SampleApplicationsTestCase) > Time elapsed: 1.032 sec <<< FAILURE! > java.lang.AssertionError: > at org.testng.AssertJUnit.fail(AssertJUnit.java:59) > at org.testng.AssertJUnit.assertTrue(AssertJUnit.java:24) > at org.testng.AssertJUnit.assertFalse(AssertJUnit.java:41) > at org.testng.AssertJUnit.assertFalse(AssertJUnit.java:49) > at > org.apache.stratos.integration.tests.application.SampleApplicationsTestCase.testApplicationList(SampleApplicationsTestCase.java:350) > > testApplicationBusting(org.apache.stratos.integration.tests.application.ApplicationBurstingTestCase) > Time elapsed: 32.926 sec <<< FAILURE! > java.lang.AssertionError: > at org.testng.AssertJUnit.fail(AssertJUnit.java:59) > at org.testng.AssertJUnit.assertTrue(AssertJUnit.java:24) > at org.testng.AssertJUnit.assertFalse(AssertJUnit.java:41) > at org.testng.AssertJUnit.assertFalse(AssertJUnit.java:49) > at > org.apache.stratos.integration.tests.application.ApplicationBurstingTestCase.testApplicationBusting(ApplicationBurstingTestCase.java:138) > > > Results : > > Failed tests: > ApplicationBurstingTestCase.testApplicationBusting:138 > > GroupStartupOrderTestCase.testTerminationBehavior:149->assertCreationOfNodes:339 > SampleApplicationStartupTestCase.testApplication:105 > SampleApplicationsTestCase.testApplicationList:350 > > > > Tests run: 21, Failures: 4, Errors: 0, Skipped: 0 > > [INFO] > ------------------------------------------------------------------------ > [INFO] Reactor Summary: > [INFO] > [INFO] Apache Stratos - Integration Aggregation Module ... SUCCESS [1.360s] > [INFO] Apache Stratos - Integration Common Module ........ SUCCESS [6.356s] > [INFO] Apache Stratos - Integration Tests ................ FAILURE > [25:05.695s] > [INFO] > ------------------------------------------------------------------------ > > I continuously ran more integration test on above scenarios and if I get > any different result, I'll update in this thread. > > [1]. > https://github.com/Thanu/stratos/commit/631d1a1cc71ade793a6a6c59edabb34bd1594747 > [2]. > https://github.com/apache/stratos/commit/4bc27f29cb84d86fdaac554ca5a426cb0eb3776f > > Thanks. > > On Sun, Oct 11, 2015 at 6:29 PM, Gayan Gunarathne <gay...@wso2.com> wrote: > >> Seems even application is terminated, some of the group monitors are >> still running. Mostly in the nested group scenario. Those group monitors >> looking for the application monitor that didn't exist at that time.So I >> think we need to recheck the logic of group monitor termination logic in >> the application un-deployment. Will check further on this. >> >> 2015-10-11 18:15:07 ERROR >> {org.wso2.carbon.automation.extensions.servers.utils.ServerLogReader}:73 - >> Exception in thread "pool-39-thread-36" java.lang.NullPointerException >> 2015-10-11 18:15:07 ERROR >> {org.wso2.carbon.automation.extensions.servers.utils.ServerLogReader}:73 - >> at >> org.apache.stratos.autoscaler.monitor.component.GroupMonitor$1.run(GroupMonitor.java:163) >> 2015-10-11 18:15:07 ERROR >> {org.wso2.carbon.automation.extensions.servers.utils.ServerLogReader}:73 - >> at >> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) >> 2015-10-11 18:15:07 ERROR >> {org.wso2.carbon.automation.extensions.servers.utils.ServerLogReader}:73 - >> at >> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) >> 2015-10-11 18:15:07 ERROR >> {org.wso2.carbon.automation.extensions.servers.utils.ServerLogReader}:73 - >> at >> java.lang.Thread.run(Thread.java:744) >> 2015-10-11 18:15:07 INFO >> {org.wso2.carbon.automation.extensions.servers.utils.ServerLogReader}:70 - >> [2015-10-11 18:15:07,285] DEBUG >> {org.apache.stratos.autoscaler.monitor.component.GroupMonitor} - Group >> monitor is running: [group] my-group6-group-startup-order-test >> >> Thanks, >> Gayan >> >> On Sat, Oct 10, 2015 at 3:35 PM, Akila Ravihansa Perera < >> raviha...@wso2.com> wrote: >> >>> This is occurring frequently even in local build. I think this is a >>> blocker for 4.1.4 release. >>> >>> On Fri, Oct 9, 2015 at 9:20 PM, Imesh Gunaratne <im...@apache.org> >>> wrote: >>> >>>> Thanks Reka! It is important to identify the root cause of this issue. >>>> I'm also looking in. >>>> >>>> On Thu, Oct 8, 2015 at 6:13 PM, Reka Thirunavukkarasu <r...@wso2.com> >>>> wrote: >>>> >>>>> Hi >>>>> >>>>> After going through the logs of integration, please find the break >>>>> down as below. For my-group6-group-tom2-group-startup-order-test, it >>>>> seems that the Adder got triggered twice or somehow monitor creation got >>>>> triggered twice which cause the memory to inconsistent state. Hence the >>>>> particular GroupMonitor got removed somehow. So that the application >>>>> hierarchy was incomplete and application was unable to become active. I'm >>>>> still investigating on it to find out the root cause. >>>>> >>>>> 2015-10-08 13:49:52 INFO >>>>> {org.wso2.carbon.automation.extensions.servers.utils.ServerLogReader}:70 - >>>>> [2015-10-08 13:49:52,504] INFO >>>>> {org.apache.stratos.autoscaler.monitor.component.ParentComponentMonitor} - >>>>> Monitor scheduled: [type] cluster [component] >>>>> group-startup-order-test.my-group-tom2-tomcat2-group-startup-order-test.tomcat2-group-startup-order-test.domain >>>>> 2015-10-08 13:49:52 INFO >>>>> {org.wso2.carbon.automation.extensions.servers.utils.ServerLogReader}:70 - >>>>> [2015-10-08 13:49:52,504] DEBUG >>>>> {org.apache.stratos.autoscaler.monitor.cluster.ClusterMonitor} - >>>>> ClusterMonitor task interval set to : [application-id] null [cluster] >>>>> group-startup-order-test.my-group7-tomcat-group-startup-order-test.tomcat-group-startup-order-test.domain >>>>> [monitor-interval] 90000 >>>>> 2015-10-08 13:49:52 INFO >>>>> {org.wso2.carbon.automation.extensions.servers.utils.ServerLogReader}:70 - >>>>> [2015-10-08 13:49:52,504] INFO >>>>> {org.apache.stratos.autoscaler.monitor.component.ParentComponentMonitor} - >>>>> Starting monitor: [type] cluster [component] >>>>> group-startup-order-test.my-group-tom2-tomcat2-group-startup-order-test.tomcat2-group-startup-order-test.domain >>>>> 2015-10-08 13:49:52 INFO >>>>> {org.wso2.carbon.automation.extensions.servers.utils.ServerLogReader}:70 - >>>>> [2015-10-08 13:49:52,504] ERROR >>>>> {org.apache.stratos.autoscaler.monitor.component.ParentComponentMonitor} >>>>> - An >>>>> error occurred while starting monitor: [type] group [component] >>>>> my-group6-group-tom2-group-startup-order-test >>>>> 2015-10-08 13:49:52 INFO >>>>> {org.wso2.carbon.automation.extensions.servers.utils.ServerLogReader}:70 - >>>>> java.lang.NullPointerException >>>>> 2015-10-08 13:49:52 INFO >>>>> {org.wso2.carbon.automation.extensions.servers.utils.ServerLogReader}:70 - >>>>> at >>>>> org.apache.stratos.autoscaler.monitor.Monitor.addInstance(Monitor.java:149) >>>>> 2015-10-08 13:49:52 INFO >>>>> {org.wso2.carbon.automation.extensions.servers.utils.ServerLogReader}:70 - >>>>> at >>>>> org.apache.stratos.autoscaler.monitor.component.GroupMonitor.createGroupInstanceAndAddToMonitor(GroupMonitor.java:816) >>>>> 2015-10-08 13:49:52 INFO >>>>> {org.wso2.carbon.automation.extensions.servers.utils.ServerLogReader}:70 - >>>>> at >>>>> org.apache.stratos.autoscaler.monitor.component.GroupMonitor.createInstanceAndStartDependencyAtStartup(GroupMonitor.java:898) >>>>> 2015-10-08 13:49:52 INFO >>>>> {org.wso2.carbon.automation.extensions.servers.utils.ServerLogReader}:70 - >>>>> at >>>>> org.apache.stratos.autoscaler.monitor.MonitorFactory.getGroupMonitor(MonitorFactory.java:161) >>>>> 2015-10-08 13:49:52 INFO >>>>> {org.wso2.carbon.automation.extensions.servers.utils.ServerLogReader}:70 - >>>>> at >>>>> org.apache.stratos.autoscaler.monitor.MonitorFactory.getMonitor(MonitorFactory.java:81) >>>>> 2015-10-08 13:49:52 INFO >>>>> {org.wso2.carbon.automation.extensions.servers.utils.ServerLogReader}:70 - >>>>> at >>>>> org.apache.stratos.autoscaler.monitor.component.ParentComponentMonitor$MonitorAdder.run(ParentComponentMonitor.java:1001) >>>>> 2015-10-08 13:49:52 INFO >>>>> {org.wso2.carbon.automation.extensions.servers.utils.ServerLogReader}:70 - >>>>> at >>>>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) >>>>> 2015-10-08 13:49:52 INFO >>>>> {org.wso2.carbon.automation.extensions.servers.utils.ServerLogReader}:70 - >>>>> at java.util.concurrent.FutureTask.run(FutureTask.java:262) >>>>> 2015-10-08 13:49:52 INFO >>>>> {org.wso2.carbon.automation.extensions.servers.utils.ServerLogReader}:70 - >>>>> at >>>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) >>>>> 2015-10-08 13:49:52 INFO >>>>> {org.wso2.carbon.automation.extensions.servers.utils.ServerLogReader}:70 - >>>>> at >>>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) >>>>> 2015-10-08 13:49:52 INFO >>>>> {org.wso2.carbon.automation.extensions.servers.utils.ServerLogReader}:70 - >>>>> at java.lang.Thread.run(Thread.java:745) >>>>> 2015-10-08 13:49:52 INFO >>>>> {org.wso2.carbon.automation.extensions.servers.utils.ServerLogReader}:70 - >>>>> [2015-10-08 13:49:52,505] DEBUG >>>>> {org.apache.stratos.autoscaler.monitor.component.GroupMonitor} - Group >>>>> monitor is running: [group] my-group6-group-tom2-group-startup-order-test >>>>> 2015-10-08 13:49:52 INFO >>>>> {org.wso2.carbon.automation.extensions.servers.utils.ServerLogReader}:70 - >>>>> [2015-10-08 13:49:52,504] DEBUG >>>>> {org.apache.stratos.autoscaler.context.partition.ParentLevelPartitionContext} >>>>> - Instance expiry time is set to: 900000 >>>>> 2015-10-08 13:49:52 INFO >>>>> {org.wso2.carbon.automation.extensions.servers.utils.ServerLogReader}:70 - >>>>> [2015-10-08 13:49:52,505] DEBUG >>>>> {org.apache.stratos.autoscaler.context.partition.ParentLevelPartitionContext} >>>>> - Instance obsolete expiry time is set to: 86400000 >>>>> 2015-10-08 13:49:52 INFO >>>>> {org.wso2.carbon.automation.extensions.servers.utils.ServerLogReader}:70 - >>>>> [2015-10-08 13:49:52,505] DEBUG >>>>> {org.apache.stratos.autoscaler.context.partition.ParentLevelPartitionContext} >>>>> - Instance pending termination expiry time is set to: 1800000 >>>>> 2015-10-08 13:49:52 INFO >>>>> {org.wso2.carbon.automation.extensions.servers.utils.ServerLogReader}:70 - >>>>> [2015-10-08 13:49:52,505] INFO >>>>> {org.apache.stratos.autoscaler.monitor.component.GroupMonitor} - >>>>> [Partition] partition-1has been added for the [Group] >>>>> my-group9-group-startup-order-test >>>>> 2015-10-08 13:49:52 INFO >>>>> {org.wso2.carbon.automation.extensions.servers.utils.ServerLogReader}:70 - >>>>> [2015-10-08 13:49:52,505] INFO >>>>> {org.apache.stratos.autoscaler.monitor.component.ParentComponentMonitor} - >>>>> Monitor started successfully: [type] group [component] >>>>> my-group6-group-tom2-group-startup-order-test [dependents] none >>>>> [startup-time] 0 seconds >>>>> 2015-10-08 13:49:52 INFO >>>>> {org.wso2.carbon.automation.extensions.servers.utils.ServerLogReader}:70 - >>>>> [2015-10-08 13:49:52,505] INFO >>>>> {org.apache.stratos.autoscaler.monitor.component.GroupMonitor} - Group >>>>> [Instance context] group-startup-order-test-1 has been added to [Group] >>>>> my-group9-group-startup-order-test >>>>> 2015-10-08 13:49:52 INFO >>>>> {org.wso2.carbon.automation.extensions.servers.utils.ServerLogReader}:70 - >>>>> [2015-10-08 13:49:52,505] INFO >>>>> {org.apache.stratos.autoscaler.monitor.component.GroupMonitor} - Starting >>>>> the dependencies for the [Group] my-group9-group-startup-order-test >>>>> >>>>> -------------------------------------- >>>>> >>>>> 2015-10-08 13:51:19 INFO >>>>> {org.wso2.carbon.automation.extensions.servers.utils.ServerLogReader}:70 - >>>>> [2015-10-08 13:51:19,100] DEBUG >>>>> {org.apache.stratos.autoscaler.status.processor.group.GroupStatusProcessor} >>>>> - Checking the status of cluster >>>>> group-startup-order-test.my-group-tom2-tomcat3-group-startup-order-test.tomcat3-group-startup-order-test.domain >>>>> instance status is: Active >>>>> 2015-10-08 13:51:19 INFO >>>>> {org.wso2.carbon.automation.extensions.servers.utils.ServerLogReader}:70 - >>>>> [2015-10-08 13:51:19,100] DEBUG >>>>> {org.apache.stratos.autoscaler.status.processor.group.GroupStatusProcessor} >>>>> - Checking the status of cluster >>>>> group-startup-order-test.my-group-tom2-tomcat2-group-startup-order-test.tomcat2-group-startup-order-test.domain >>>>> instance status is: Active >>>>> 2015-10-08 13:51:19 INFO >>>>> {org.wso2.carbon.automation.extensions.servers.utils.ServerLogReader}:70 - >>>>> [2015-10-08 13:51:19,100] DEBUG >>>>> {org.apache.stratos.cloud.controller.messaging.topology.TopologyManager} - >>>>> Write lock released >>>>> 2015-10-08 13:51:19 INFO >>>>> {org.wso2.carbon.automation.extensions.servers.utils.ServerLogReader}:70 - >>>>> [2015-10-08 13:51:19,100] INFO >>>>> {org.apache.stratos.autoscaler.status.processor.group.GroupStatusActiveProcessor} >>>>> - Sending group instance active for [group] >>>>> my-group6-group-tom2-group-startup-order-test [instance] >>>>> group-startup-order-test-1 >>>>> 2015-10-08 13:51:19 INFO >>>>> {org.wso2.carbon.automation.extensions.servers.utils.ServerLogReader}:70 - >>>>> [2015-10-08 13:51:19,100] DEBUG >>>>> {org.apache.stratos.autoscaler.applications.topic.ApplicationBuilder} - >>>>> Handling group activation for the [group-id]: >>>>> my-group6-group-tom2-group-startup-order-test in the [application-id] >>>>> group-startup-order-test [instance] group-startup-order-test-1 >>>>> 2015-10-08 13:51:19 INFO >>>>> {org.wso2.carbon.automation.extensions.servers.utils.ServerLogReader}:70 - >>>>> [2015-10-08 13:51:19,100] DEBUG >>>>> {org.apache.stratos.cloud.controller.messaging.topology.TopologyManager} - >>>>> Write lock acquired >>>>> 2015-10-08 13:51:19 INFO >>>>> {org.wso2.carbon.automation.extensions.servers.utils.ServerLogReader}:70 - >>>>> [2015-10-08 13:51:19,100] WARN >>>>> {org.apache.stratos.autoscaler.applications.topic.ApplicationBuilder} - >>>>> Group monitor cannot be found: [group-id] >>>>> my-group6-group-tom2-group-startup-order-test [application-id] >>>>> group-startup-order-test >>>>> 2015-10-08 13:51:19 INFO >>>>> {org.wso2.carbon.automation.extensions.servers.utils.ServerLogReader}:70 - >>>>> [2015-10-08 13:51:19,100] INFO >>>>> {org.apache.stratos.cloud.controller.messaging.topology.TopologyBuilder} - >>>>> Cluster activated adding status started for >>>>> group-startup-order-test.my-group-tom2-tomcat3-group-startup-order-test.tomcat3-group-startup-order-test.domain >>>>> 2015-10-08 13:51:19 INFO >>>>> {org.wso2.carbon.automation.extensions.servers.utils.ServerLogReader}:70 - >>>>> [2015-10-08 13:51:19,100] DEBUG >>>>> {org.apache.stratos.cloud.controller.messaging.topology.TopologyManager} - >>>>> Updating topology >>>>> >>>>> Thanks, >>>>> Reka >>>>> >>>>> >>>>> On Thu, Oct 8, 2015 at 4:47 PM, Reka Thirunavukkarasu <r...@wso2.com> >>>>> wrote: >>>>> >>>>>> Hi Akila >>>>>> >>>>>> On Thu, Oct 8, 2015 at 2:51 PM, Akila Ravihansa Perera < >>>>>> raviha...@wso2.com> wrote: >>>>>> >>>>>>> Hi Thanuja, >>>>>>> >>>>>>> I was referring to the build fail in [1]. Please see line starting >>>>>>> from: >>>>>>> >>>>>>> 2015-10-02 07:28:32 INFO >>>>>>> >>>>>>> {org.wso2.carbon.automation.extensions.servers.utils.ServerLogReader}:70 >>>>>>> - >>>>>>> [2015-10-02 07:28:32,834] ERROR >>>>>>> {org.apache.stratos.autoscaler.rule.RuleTasksDelegator} - Cannot >>>>>>> terminate >>>>>>> instance >>>>>>> >>>>>>> I think this is a separate issue that occurs very rarely. As you >>>>>>> noticed, NPE in GroupMonitor occurs when undeploying applications. The >>>>>>> problem is group monitors are retrieving the status from other group >>>>>>> monitor objects. >>>>>>> >>>>>>> @Reka: shall we change that logic to read the application status >>>>>>> from topology or ApplicationHolder instead of reading from monitors? >>>>>>> >>>>>> >>>>>> We can't read the Application status from Topology as Application >>>>>> will not contain any status rather only application instance will contain >>>>>> the status. By accessing application instance, we can't identify whether >>>>>> undeployment or force undeployment is in-progress. Hence we will have to >>>>>> maintain these flags to ApplicationMonitor to identity the undeployment >>>>>> or >>>>>> force undeployment. All other status are actually referring from the >>>>>> Topology. >>>>>> >>>>>> Thanks, >>>>>> Reka >>>>>> >>>>>>> >>>>>>> [1] >>>>>>> https://builds.apache.org/job/Stratos-41x-Commit-Build/68/consoleFull >>>>>>> >>>>>>> Thanks. >>>>>>> >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> Reka Thirunavukkarasu >>>>>> Senior Software Engineer, >>>>>> WSO2, Inc.:http://wso2.com, >>>>>> Mobile: +94776442007 >>>>>> >>>>>> >>>>>> >>>>> >>>>> >>>>> -- >>>>> Reka Thirunavukkarasu >>>>> Senior Software Engineer, >>>>> WSO2, Inc.:http://wso2.com, >>>>> Mobile: +94776442007 >>>>> >>>>> >>>>> >>>> >>>> >>>> -- >>>> Imesh Gunaratne >>>> >>>> Senior Technical Lead, WSO2 >>>> Committer & PMC Member, Apache Stratos >>>> >>> >>> >>> >>> -- >>> Akila Ravihansa Perera >>> WSO2 Inc.; http://wso2.com/ >>> >>> Blog: http://ravihansa3000.blogspot.com >>> >> >> >> >> -- >> >> Gayan Gunarathne >> Technical Lead, WSO2 Inc. (http://wso2.com) >> Committer & PMC Member, Apache Stratos >> email : gay...@wso2.com | mobile : +94 775030545 <%2B94%20766819985> >> >> >> > > > > -- > Thanuja Uruththirakodeeswaran > Software Engineer > WSO2 Inc.;http://wso2.com > lean.enterprise.middleware > > mobile: +94 774363167 > -- Akila Ravihansa Perera WSO2 Inc.; http://wso2.com/ Blog: http://ravihansa3000.blogspot.com