[ 
https://issues.apache.org/jira/browse/GEODE-950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruce Schuchardt updated GEODE-950:
-----------------------------------
    Description: 
This test starts locators simultaneously and both are configured to know about 
the other.  In the run below two locators created their own membership views, 
forming a split-brain at start up time instead of forming a single distributed 
system.

Host name: w2-2013-lin-12
OS name: Linux
Architecture: amd64
OS version: 3.10.0-229.el7.x86_64
Java version: 1.8.0_66
Java vm name: Java HotSpot(TM) 64-Bit Server VM
Java vendor: Oracle Corporation
Java home: /export/gcm/where/jdk/1.8.0_66/x86_64.linux/jre

  #####################################################
  
  GemFire Version 9.0.0-SNAPSHOT
  Source Date: 2016-02-03 16:09:18 -0800
  Source Revision: 3f7070f117dbd8f2e5fb436b6aed3469e9fca673
  Source Repository: develop
  
  Build Id: bruces 020416
  Build Date: 2016-02-04 16:02:44 -0800
  Build Version: 9.0.0-SNAPSHOT bruces 020416 2016-02-04 16:02:44 -0800 javac 1.8.0_66
  Build JDK: Java 1.8.0_66
  Build Platform: Linux 2.6.32-122.el6.x86_64 amd64
  
  #####################################################


Test was run from /export/frodo2/users/bruce/devel/gfasf/closed/gemfire-test/build/resources/test/newWan/discovery/newWanDiscovery.bt

Test:
parReg/newWan/parallel/discovery/wanAdminLocatorsPeerHAP2P.conf
   locatorHostsPerSite=4
   locatorThreadsPerVM=1
   locatorVMsPerHost=1
   maxOps=300
   peerHostsPerSite=2
   peerMem=256m
   peerThreadsPerVM=10
   peerVMsPerHost=2
   redundantCopies=1
   resultWaitSec=600
   wanSites=3

Run with local.conf:

hydra.HostPrms-hostNames = w2-2013-lin-12 w1-gst-dev03;

//randomSeed extracted from test:
hydra.Prms-randomSeed=1454836695339;

*** Test failed with this error:
CLIENT vm_17_thr_64_peer_2_1_w1-gst-dev03_3365
INITTASK[2] newWan.WANTest.HydraTask_initPeerTask
HANG a client exceeded max result wait sec: 600

*** Last client logging by hung thread
[info 2016/02/07 01:30:48.650 PST <vm_17_thr_64_peer_2_1_w1-gst-dev03_3365> tid=0x1e] Configured disk store factory: com.gemstone.gemfire.internal.cache.DiskStoreFactoryImpl@16cf1ca8

*** Test declared hung 595996 ms after last client logging
[severe 2016/02/07 01:40:44.646 PST <vm_17_thr_68_peer_2_1_w1-gst-dev03_2152 Dynamic Client VM Stopper> tid=0x274] Result for vm_17_thr_64_peer_2_1_w1-gst-dev03_3365: INITTASK[2] newWan.WANTest.HydraTask_initPeerTask: HANG a client exceeded max result wait sec: 600

*** Hung thread
"vm_17_thr_64_peer_2_1_w1-gst-dev03_3365" #30 daemon prio=5 os_prio=0 tid=0x00007f0ca0026000 nid=0xdd3 waiting on condition [0x00007f0cafffd000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00000000f7429b60> (a java.util.concurrent.CountDownLatch$Sync)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
        at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
        at com.gemstone.gemfire.internal.cache.BucketPersistenceAdvisor.waitForPrimaryPersistentRecovery(BucketPersistenceAdvisor.java:363)
        at com.gemstone.gemfire.internal.cache.ProxyBucketRegion.waitForPrimaryPersistentRecovery(ProxyBucketRegion.java:633)
        at com.gemstone.gemfire.internal.cache.PRHARedundancyProvider.recoverPersistentBuckets(PRHARedundancyProvider.java:1821)
        at com.gemstone.gemfire.internal.cache.PartitionedRegion.initPRInternals(PartitionedRegion.java:1073)
        - locked <0x00000000f567aa10> (a com.gemstone.gemfire.internal.cache.PartitionedRegion)
        at com.gemstone.gemfire.internal.cache.PartitionedRegion.initialize(PartitionedRegion.java:1193)
        at com.gemstone.gemfire.internal.cache.GemFireCacheImpl.createVMRegion(GemFireCacheImpl.java:3171)
        at com.gemstone.gemfire.internal.cache.GemFireCacheImpl.basicCreateRegion(GemFireCacheImpl.java:3063)
        at com.gemstone.gemfire.internal.cache.GemFireCacheImpl.createRegion(GemFireCacheImpl.java:3052)
        at hydra.RegionHelper.createRegion(RegionHelper.java:129)
        - locked <0x00000000f68035b0> (a java.lang.Class for hydra.RegionHelper)
        at hydra.RegionHelper.createRegion(RegionHelper.java:93)
        - locked <0x00000000f68035b0> (a java.lang.Class for hydra.RegionHelper)
        at hydra.RegionHelper.createRegion(RegionHelper.java:80)
        - locked <0x00000000f68035b0> (a java.lang.Class for hydra.RegionHelper)
        at newWan.WANTest.initDatastoreRegion(WANTest.java:439)
        at newWan.WANTest.HydraTask_initPeerTask(WANTest.java:797)
        - locked <0x00000000f58842e8> (a java.lang.Class for newWan.WANTest)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at hydra.MethExecutor.execute(MethExecutor.java:198)
        at hydra.MethExecutor.execute(MethExecutor.java:162)
        at hydra.TestTask.execute(TestTask.java:195)
        at hydra.RemoteTestModule$1.run(RemoteTestModule.java:216)

Stack for hung thread vm_17_thr_64_peer_2_1_w1-gst-dev03_3365 was found 3 times 
and was unchanging.
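
For readers unfamiliar with this kind of hang: the top application frame in the stack is an untimed CountDownLatch.await(), which parks the calling thread until some other thread calls countDown(). If that never happens (for example, because the member expected to release the latch is in a different membership view), the wait never returns. The sketch below is a generic illustration only, not Geode code; the class name LatchWaitSketch and the method awaitRecovery are hypothetical. It uses the timed overload await(long, TimeUnit) so that a missing countDown() shows up as a false return instead of an indefinite park.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration; not part of GemFire/Geode or hydra.
public class LatchWaitSketch {

    // Waits for the latch to reach zero.
    // Returns true if it was released in time, false if the timeout elapsed.
    public static boolean awaitRecovery(CountDownLatch latch, long timeoutMillis)
            throws InterruptedException {
        return latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        // Case 1: nobody ever calls countDown(). An untimed await() would
        // park here forever; the timed variant gives up and reports false.
        CountDownLatch neverReleased = new CountDownLatch(1);
        System.out.println("released=" + awaitRecovery(neverReleased, 100));

        // Case 2: another thread releases the latch, so the waiter proceeds.
        CountDownLatch released = new CountDownLatch(1);
        Thread releaser = new Thread(released::countDown);
        releaser.start();
        System.out.println("released=" + awaitRecovery(released, 5000));
        releaser.join();
    }
}
```

Whether a timeout is appropriate at the real call site is a separate design question; the sketch only shows why the thread dump stays unchanged while the latch is never counted down.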


  was:
This test starts locators simultaneously and both are configured to know about 
the other.  In the run below two locators created their own membership views, 
forming a split-brain at start up time instead of forming a single distributed 
system.

Host name: w2-2013-lin-12
OS name: Linux
Architecture: amd64
OS version: 3.10.0-229.el7.x86_64
Java version: 1.8.0_66
Java vm name: Java HotSpot(TM) 64-Bit Server VM
Java vendor: Oracle Corporation
Java home: /export/gcm/where/jdk/1.8.0_66/x86_64.linux/jre

  #####################################################
  
  GemFire Version 9.0.0-SNAPSHOT
  Source Date: 2016-02-03 16:09:18 -0800
  Source Revision: 3f7070f117dbd8f2e5fb436b6aed3469e9fca673
  Source Repository: develop
  
  Build Id: bruces 020416
  Build Date: 2016-02-04 16:02:44 -0800
  Build Version: 9.0.0-SNAPSHOT bruces 020416 2016-02-04 16:02:44 -0800 javac 1.8.0_66
  Build JDK: Java 1.8.0_66
  Build Platform: Linux 2.6.32-122.el6.x86_64 amd64
  
  #####################################################


Test was run from /export/frodo2/users/bruce/devel/gfasf/closed/gemfire-test/build/resources/test/newWan/discovery/newWanDiscovery.bt

Test:
parReg/newWan/parallel/discovery/wanAdminLocatorsPeerHAP2P.conf
   locatorHostsPerSite=4
   locatorThreadsPerVM=1
   locatorVMsPerHost=1
   maxOps=300
   peerHostsPerSite=2
   peerMem=256m
   peerThreadsPerVM=10
   peerVMsPerHost=2
   redundantCopies=1
   resultWaitSec=600
   wanSites=3

Run with local.conf:

hydra.HostPrms-hostNames = w2-2013-lin-12 w1-gst-dev03;

//randomSeed extracted from test:
hydra.Prms-randomSeed=1454836695339;

*** Test failed with this error:
CLIENT vm_17_thr_64_peer_2_1_w1-gst-dev03_3365
INITTASK[2] newWan.WANTest.HydraTask_initPeerTask
HANG a client exceeded max result wait sec: 600

*** Last client logging by hung thread
[info 2016/02/07 01:30:48.650 PST <vm_17_thr_64_peer_2_1_w1-gst-dev03_3365> tid=0x1e] Configured disk store factory: com.gemstone.gemfire.internal.cache.DiskStoreFactoryImpl@16cf1ca8

*** Test declared hung 595996 ms after last client logging
[severe 2016/02/07 01:40:44.646 PST <vm_17_thr_68_peer_2_1_w1-gst-dev03_2152 Dynamic Client VM Stopper> tid=0x274] Result for vm_17_thr_64_peer_2_1_w1-gst-dev03_3365: INITTASK[2] newWan.WANTest.HydraTask_initPeerTask: HANG a client exceeded max result wait sec: 600

*** Hung thread
"vm_17_thr_64_peer_2_1_w1-gst-dev03_3365" #30 daemon prio=5 os_prio=0 tid=0x00007f0ca0026000 nid=0xdd3 waiting on condition [0x00007f0cafffd000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00000000f7429b60> (a java.util.concurrent.CountDownLatch$Sync)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
        at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
        at com.gemstone.gemfire.internal.cache.BucketPersistenceAdvisor.waitForPrimaryPersistentRecovery(BucketPersistenceAdvisor.java:363)
        at com.gemstone.gemfire.internal.cache.ProxyBucketRegion.waitForPrimaryPersistentRecovery(ProxyBucketRegion.java:633)
        at com.gemstone.gemfire.internal.cache.PRHARedundancyProvider.recoverPersistentBuckets(PRHARedundancyProvider.java:1821)
        at com.gemstone.gemfire.internal.cache.PartitionedRegion.initPRInternals(PartitionedRegion.java:1073)
        - locked <0x00000000f567aa10> (a com.gemstone.gemfire.internal.cache.PartitionedRegion)
        at com.gemstone.gemfire.internal.cache.PartitionedRegion.initialize(PartitionedRegion.java:1193)
        at com.gemstone.gemfire.internal.cache.GemFireCacheImpl.createVMRegion(GemFireCacheImpl.java:3171)
        at com.gemstone.gemfire.internal.cache.GemFireCacheImpl.basicCreateRegion(GemFireCacheImpl.java:3063)
        at com.gemstone.gemfire.internal.cache.GemFireCacheImpl.createRegion(GemFireCacheImpl.java:3052)
        at hydra.RegionHelper.createRegion(RegionHelper.java:129)
        - locked <0x00000000f68035b0> (a java.lang.Class for hydra.RegionHelper)
        at hydra.RegionHelper.createRegion(RegionHelper.java:93)
        - locked <0x00000000f68035b0> (a java.lang.Class for hydra.RegionHelper)
        at hydra.RegionHelper.createRegion(RegionHelper.java:80)
        - locked <0x00000000f68035b0> (a java.lang.Class for hydra.RegionHelper)
        at newWan.WANTest.initDatastoreRegion(WANTest.java:439)
        at newWan.WANTest.HydraTask_initPeerTask(WANTest.java:797)
        - locked <0x00000000f58842e8> (a java.lang.Class for newWan.WANTest)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at hydra.MethExecutor.execute(MethExecutor.java:198)
        at hydra.MethExecutor.execute(MethExecutor.java:162)
        at hydra.TestTask.execute(TestTask.java:195)
        at hydra.RemoteTestModule$1.run(RemoteTestModule.java:216)

Stack for hung thread vm_17_thr_64_peer_2_1_w1-gst-dev03_3365 was found 3 times 
and was unchanging.

See http://hydradb.gemstone.com/hdb/testresult/920073


> split brain in wanAdminLocatorsPeerHAP2P
> ----------------------------------------
>
>                 Key: GEODE-950
>                 URL: https://issues.apache.org/jira/browse/GEODE-950
>             Project: Geode
>          Issue Type: Bug
>          Components: membership
>            Reporter: Bruce Schuchardt
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
