[ https://issues.apache.org/jira/browse/KARAF-1309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alexey Bespaly reopened KARAF-1309:
-----------------------------------
TESB-EE-Runtime 5.1.1 SNAPSHOT #365
Apache Cellar 2.2.4 SNAPSHOT as of 8.05.2012
While retesting, I got the following state on the 4th node (after adding a
network interface and starting all containers):
[ 198] [Active     ] [Created     ] [       ] [   80] Apache Karaf :: Cellar :: Core (2.2.4.SNAPSHOT)
[ 199] [Resolved   ] [            ] [       ] [   80] Apache Karaf :: Cellar :: Hazelcast (2.2.4.SNAPSHOT)
                                       Hosts: 197
[ 200] [Active     ] [Failure     ] [       ] [   80] Apache Karaf :: Cellar :: Config (2.2.4.SNAPSHOT)
[ 201] [Active     ] [Failure     ] [       ] [   80] Apache Karaf :: Cellar :: Features (2.2.4.SNAPSHOT)
[ 202] [Active     ] [GracePeriod ] [       ] [   80] Apache Karaf :: Cellar :: Bundle (2.2.4.SNAPSHOT)
[ 203] [Active     ] [Failure     ] [       ] [   80] Apache Karaf :: Cellar :: DOSGi (2.2.4.SNAPSHOT)
[ 204] [Active     ] [Created     ] [       ] [   80] Apache Karaf :: Cellar :: Utils (2.2.4.SNAPSHOT)
[ 205] [Resolved   ] [            ] [       ] [   80] Apache Karaf :: Cellar :: Shell (2.2.4.SNAPSHOT)
[ 206] [Resolved   ] [            ] [       ] [   80] Apache Karaf :: Cellar :: Management (2.2.4.SNAPSHOT)
tesb.log excerpt:
...
12:44:10,724 | WARN | ol-10-thread-130 | lar.core.event.EventDispatchTask 88 | 198 - org.apache.karaf.cellar.core - 2.2.4.SNAPSHOT | Failed to retrieve handler for event class org.apache.karaf.cellar.features.RemoteFeaturesEvent
12:44:10,732 | WARN | ol-10-thread-131 | lar.core.event.EventDispatchTask 88 | 198 - org.apache.karaf.cellar.core - 2.2.4.SNAPSHOT | Failed to retrieve handler for event class org.apache.karaf.cellar.features.RemoteFeaturesEvent
12:44:10,752 | WARN | ol-10-thread-132 | lar.core.event.EventDispatchTask 88 | 198 - org.apache.karaf.cellar.core - 2.2.4.SNAPSHOT | Failed to retrieve handler for event class org.apache.karaf.cellar.features.RemoteFeaturesEvent
12:44:10,758 | WARN | ol-10-thread-133 | lar.core.event.EventDispatchTask 88 | 198 - org.apache.karaf.cellar.core - 2.2.4.SNAPSHOT | Failed to retrieve handler for event class org.apache.karaf.cellar.features.RemoteFeaturesEvent
...
and finally:
12:48:41,257 | ERROR | rint Extender: 3 | ntainer.BlueprintContainerImpl$1 293 | 10 - org.apache.aries.blueprint - 0.3.1 | Unable to start blueprint container for bundle org.apache.karaf.cellar.config due to unresolved dependencies [(objectClass=org.apache.karaf.cellar.core.GroupManager)]
java.util.concurrent.TimeoutException
    at org.apache.aries.blueprint.container.BlueprintContainerImpl$1.run(BlueprintContainerImpl.java:287)[10:org.apache.aries.blueprint:0.3.1]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)[:1.6.0_30]
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)[:1.6.0_30]
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)[:1.6.0_30]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98)[:1.6.0_30]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:206)[:1.6.0_30]
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)[:1.6.0_30]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)[:1.6.0_30]
    at java.lang.Thread.run(Thread.java:662)[:1.6.0_30]
The same error is logged for every failed bundle.
In the end, the container failed to shut down cleanly (it hung), with the
following exception repeated throughout the log:
13:39:55,322 | WARN | cached.thread-73 | dardLoggerFactory$StandardLogger 51 | - - | 172.27.210.7/172.27.210.7:5704 [cellar] You probably have too long Hazelcast configuration!
java.io.IOException: Invalid argument
    at java.net.PlainDatagramSocketImpl.send(Native Method)[:1.6.0_30]
    at java.net.DatagramSocket.send(DatagramSocket.java:625)[:1.6.0_30]
    at com.hazelcast.impl.MulticastService.send(MulticastService.java:148)[197:hazelcast:1.9.4.8]
    at com.hazelcast.impl.MulticastJoiner.searchForOtherClusters(MulticastJoiner.java:95)[197:hazelcast:1.9.4.8]
    at com.hazelcast.impl.SplitBrainHandler.searchForOtherClusters(SplitBrainHandler.java:58)[197:hazelcast:1.9.4.8]
    at com.hazelcast.impl.SplitBrainHandler.access$000(SplitBrainHandler.java:22)[197:hazelcast:1.9.4.8]
    at com.hazelcast.impl.SplitBrainHandler$1.doRun(SplitBrainHandler.java:46)[197:hazelcast:1.9.4.8]
    at com.hazelcast.impl.FallThroughRunnable.run(FallThroughRunnable.java:23)[197:hazelcast:1.9.4.8]
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)[:1.6.0_30]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)[:1.6.0_30]
    at java.lang.Thread.run(Thread.java:662)[:1.6.0_30]
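The trace above shows the `Invalid argument` error surfacing in `DatagramSocket.send` on Hazelcast's multicast path. When narrowing this down, it may help to see what the JVM itself reports for each interface once tun0 is present. The following is a minimal sketch using only `java.net.NetworkInterface`; the class name `InterfaceReport` and the helper `multicastCandidate` are hypothetical and not part of Karaf, Cellar, or Hazelcast, and the sketch does not reproduce how Hazelcast actually selects its interface.

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.Collections;
import java.util.Enumeration;

public class InterfaceReport {

    // Hypothetical rule of thumb (an assumption, not Hazelcast's logic):
    // treat an interface as a multicast candidate only if it is up,
    // reports multicast support, and is not the loopback interface.
    public static boolean multicastCandidate(boolean up, boolean multicast, boolean loopback) {
        return up && multicast && !loopback;
    }

    public static void main(String[] args) throws SocketException {
        Enumeration<NetworkInterface> all = NetworkInterface.getNetworkInterfaces();
        if (all == null) {
            System.out.println("no network interfaces found");
            return;
        }
        for (NetworkInterface nic : Collections.list(all)) {
            boolean candidate = multicastCandidate(
                    nic.isUp(), nic.supportsMulticast(), nic.isLoopback());
            // One summary line per interface, followed by its addresses.
            System.out.println(nic.getName()
                    + " up=" + nic.isUp()
                    + " multicast=" + nic.supportsMulticast()
                    + " pointToPoint=" + nic.isPointToPoint()
                    + " candidate=" + candidate);
            for (InetAddress addr : Collections.list(nic.getInetAddresses())) {
                System.out.println("    " + addr.getHostAddress());
            }
        }
    }
}
```

Running this once with the VPN client off and once with it on should show whether the address the nodes now report belongs to an interface that advertises multicast support.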
> Cellar causes Karaf container to freeze if system got network interface
> changes between container restarts
> ----------------------------------------------------------------------------------------------------------
>
> Key: KARAF-1309
> URL: https://issues.apache.org/jira/browse/KARAF-1309
> Project: Karaf
> Issue Type: Bug
> Components: cellar-hazelcast
> Affects Versions: cellar-2.2.4
> Environment: Karaf-2.2.6-SNAPSHOT from 20120403,
> Cellar-2.2.4-SNAPSHOT from 20120312
> Reporter: Alexey Bespaly
> Assignee: Jean-Baptiste Onofré
> Fix For: cellar-3.0.0, cellar-2.2.4
>
> Attachments: logs.tgz
>
>
> 1. Started 4 Karaf instances and installed Cellar on each of them
> cluster:node-list
> No. Host Name Port ID
> * 1 opti.local 5701 opti.local:5701
> 2 opti.local 5702 opti.local:5702
> 3 opti.local 5703 opti.local:5703
> 4 opti.local 5704 opti.local:5704
> opti.local = 192.168.1.86
> 2. 1,2 - default group; 3,4 - "group1" (not sure if groups are essential
> here, but were a part of the test case)
> cluster:group-list
> Node Group
> opti.local:5701 default
> opti.local:5702 default
> opti.local:5703 group1
> * opti.local:5704 group1
> 3. Stopped Karaf containers
> 4. Turned the VPN client on, which added the tun0 (172.27.210.11) network interface
> 5. Restarted Karaf containers
> - the 1st and the 2nd ones seemed to be working fine:
> cluster:node-list
> No. Host Name Port ID
> * 1 172.27.210.11 5701 172.27.210.11:5701
> 2 172.27.210.11 5703 172.27.210.11:5703
> 3 172.27.210.11 5702 172.27.210.11:5702
> 4 172.27.210.11 5704 172.27.210.11:5704
> - the 3rd container became unresponsive:
> karaf@trun> osgi:list
> ...no response....
> - the 4th shows the following static picture:
> ...skipped...
> [ 211] [Active     ] [Creating    ] [       ] [   60] hazelcast (1.9.4.6)
>                                        Fragments: 213
> [ 212] [Active     ] [Created     ] [       ] [   60] Apache Karaf :: Cellar :: Core (2.2.4.SNAPSHOT)
> [ 213] [Resolved   ] [            ] [       ] [   60] Apache Karaf :: Cellar :: Hazelcast (2.2.4.SNAPSHOT)
>                                        Hosts: 211
> [ 214] [Active     ] [GracePeriod ] [       ] [   60] Apache Karaf :: Cellar :: Config (2.2.4.SNAPSHOT)
> [ 215] [Active     ] [GracePeriod ] [       ] [   60] Apache Karaf :: Cellar :: Features (2.2.4.SNAPSHOT)
> [ 216] [Active     ] [GracePeriod ] [       ] [   60] Apache Karaf :: Cellar :: Bundle (2.2.4.SNAPSHOT)
> [ 217] [Active     ] [Created     ] [       ] [   60] Apache Karaf :: Cellar :: DOSGi (2.2.4.SNAPSHOT)
> [ 218] [Active     ] [Created     ] [       ] [   60] Apache Karaf :: Cellar :: Utils (2.2.4.SNAPSHOT)
> [ 219] [Active     ] [Created     ] [       ] [   60] Apache Karaf :: Cellar :: Shell (2.2.4.SNAPSHOT)
> [ 220] [Active     ] [Created     ] [       ] [   60] Apache Karaf :: Cellar :: Management (2.2.4.SNAPSHOT)
> karaf@trun>
> ...and the status does not change over time
> Could it be that the stored config conflicts with the newly detected one and
> causes this instability?
> Logs of 4 Karaf instances are attached.
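To make the reporter's hypothesis concrete: the node IDs in the two `cluster:node-list` outputs are `host:port` strings, and they changed from `opti.local:570x` to `172.27.210.11:570x` once tun0 appeared. The sketch below only illustrates how state keyed by the old ID would stop matching after such a change. Whether Cellar actually persists group membership keyed this way is an assumption, and `StaleNodeIds`, `storedGroups`, and `groupOf` are hypothetical names, not Cellar APIs.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class StaleNodeIds {

    // Group membership from step 2 of the report, keyed by the node ID
    // that was visible before the VPN interface was added.
    public static Map<String, String> storedGroups() {
        Map<String, String> groups = new LinkedHashMap<String, String>();
        groups.put("opti.local:5701", "default");
        groups.put("opti.local:5702", "default");
        groups.put("opti.local:5703", "group1");
        groups.put("opti.local:5704", "group1");
        return groups;
    }

    // Plain string lookup: returns null when the node ID is unknown.
    public static String groupOf(Map<String, String> stored, String nodeId) {
        return stored.get(nodeId);
    }

    public static void main(String[] args) {
        Map<String, String> stored = storedGroups();
        // ID as seen before the interface change.
        System.out.println(groupOf(stored, "opti.local:5703"));
        // ID as seen after the restart with tun0 present.
        System.out.println(groupOf(stored, "172.27.210.11:5703"));
    }
}
```

The first lookup prints `group1` and the second prints `null`, which is the kind of mismatch the question above is asking about.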