[
https://issues.apache.org/jira/browse/SLING-3432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14496460#comment-14496460
]
Timothee Maret commented on SLING-3432:
---------------------------------------
bq. The listener is informed that 'something is wrong in the topology' via the
TOPOLOGY_CHANGING event as soon as the discovery notices this.
I agree, though the information carried by the announcement is different.
Without an isolated mode, the instance cannot distinguish between (I) a topology
vote that takes a while and (II) being disconnected from the topology.
Applications may benefit from this distinction, for instance to perform operations
(registering with services, allocating resources, etc.) only when joining or
leaving the topology.
Also, although I guess it was never meant to be implemented with actual
waiting, the topology listeners would not have to wait for any particular event
at any time.
The topology listeners would do something like:
{code}
void handleTopologyEvent(TopologyEvent event) {
    View view = event.getNewView();
    if ("isolated".equals(view.id())) {
        // do things in isolated mode
    } else {
        // do things in connected mode
    }
}
{code}
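To make the join/leave distinction concrete, here is a minimal, self-contained sketch of how a listener could run its setup and teardown work only on transitions rather than on every event. It is only an illustration: the `View` class, its `id()` accessor, the `"isolated"` marker id and the `IsolationAwareListener` name are hypothetical stand-ins for the proposal above, not types from the Sling Discovery API, and treating a missing view as "not connected" is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a topology view; NOT the Sling Discovery API type.
final class View {
    static final String ISOLATED_ID = "isolated"; // assumed marker id from the proposal
    private final String id;

    View(String id) {
        this.id = id;
    }

    String id() {
        return id;
    }
}

// Hypothetical listener that reacts only to isolated <-> connected transitions.
final class IsolationAwareListener {
    private boolean connected = false; // treated as isolated until a real view arrives
    final List<String> log = new ArrayList<>();

    void handleTopologyEvent(View newView) {
        // Assumption: a missing view is handled like the isolated view.
        boolean nowConnected = newView != null && !View.ISOLATED_ID.equals(newView.id());
        if (nowConnected && !connected) {
            log.add("joined:" + newView.id()); // e.g. register with services, allocate resources
        } else if (!nowConnected && connected) {
            log.add("left"); // e.g. unregister, release resources
        }
        connected = nowConnected;
    }

    boolean isConnected() {
        return connected;
    }
}
```

With this shape, a change from one connected view to another does not repeat the join work, and the listener never blocks waiting for a later event.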
> pseudo network partition causes job deserialization issue in a cluster (when
> reading while job is being reassigned)
> -------------------------------------------------------------------------------------------------------------------
>
> Key: SLING-3432
> URL: https://issues.apache.org/jira/browse/SLING-3432
> Project: Sling
> Issue Type: Bug
> Components: Extensions
> Affects Versions: Discovery Impl 1.0.2
> Reporter: Stefan Egli
> Assignee: Stefan Egli
> Fix For: Discovery Impl 1.1.2
>
>
> There is a race condition between two instances in a cluster (e.g. Oak or CRX):
> Instance 1 is writing a job with a binary property while instance 2 is reading the
> job (likely triggered by discovery sending it a TOPOLOGY_CHANGED event). It
> looks like instance 2 reads the job while instance 1 has not yet finished
> writing the job, or at least the binary,
> resulting in the following exception:
> 04.03.2014 06:55:39.667 *WARN* [Apache Sling Job Background Loader] org.apache.sling.event.impl.jobs.JobManagerImpl Unable to read job from /var/eventing/jobs/assigned/e4337f8f-47d2-41df-b3ab-0d40b1b2acd4/slingevent:eventadmin/2014/3/3/8/45/cq.wcm.msm.job.pageEvent_9718d7db-85b4-4930-a2ba-11a80d772970_172
> java.lang.Exception: Unable to deserialize property 'pageEvent'
>     at org.apache.sling.event.impl.support.ResourceHelper.cloneValueMap(ResourceHelper.java:213)
>     at org.apache.sling.event.impl.jobs.JobManagerImpl.readJob(JobManagerImpl.java:538)
>     at org.apache.sling.event.impl.jobs.BackgroundLoader.loadJobInTheBackground(BackgroundLoader.java:318)
>     at org.apache.sling.event.impl.jobs.BackgroundLoader.loadJobsInTheBackground(BackgroundLoader.java:294)
>     at org.apache.sling.event.impl.jobs.BackgroundLoader.run(BackgroundLoader.java:203)
>     at java.lang.Thread.run(Thread.java:662)
> Caused by: java.io.EOFException: null
>     at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2280)
>     at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:2749)
>     at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:779)
>     at java.io.ObjectInputStream.<init>(ObjectInputStream.java:279)
>     at org.apache.sling.event.impl.support.ResourceHelper.cloneValueMap(ResourceHelper.java:208)
>     ... 5 common frames omitted
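
The `Caused by` frames above show the `EOFException` being thrown from the `ObjectInputStream` constructor while it reads the stream header, which is consistent with the reader seeing a binary that is still empty or cut short. The following standalone sketch reproduces that behaviour with plain JDK classes only; `TruncatedStreamDemo` and its methods are hypothetical names for illustration and are not part of the Sling code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

// Hypothetical demo class; uses only java.io, no Sling types.
final class TruncatedStreamDemo {

    // Serializes a value the way a fully written binary property would look.
    static byte[] serialize(Object value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Returns true if merely opening an ObjectInputStream on the data hits end of stream.
    static boolean failsWithEof(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return false; // header was readable
        } catch (EOFException e) {
            return true; // stream ended before the header was complete
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("empty binary:    " + failsWithEof(new byte[0]));
        System.out.println("complete binary: " + failsWithEof(serialize("pageEvent")));
    }
}
```

An empty byte array stands in for a binary that the reader sees before the writer has stored any content; the completely serialized value opens without error.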