Amit Gupta created SLING-2874:
---------------------------------
Summary: JcrResourceProvider.create/JobManagerImpl.writeJob can
cause inconsistent behavior
Key: SLING-2874
URL: https://issues.apache.org/jira/browse/SLING-2874
Project: Sling
Issue Type: Bug
Components: JCR
Affects Versions: JCR Resource 2.2.10
Reporter: Amit Gupta
I have been debugging an issue with Sling jobs and found that the current
implementation of JobManagerImpl.writeJob can cause inconsistent behaviour (the
root cause is JcrResourceProvider.create).
Issue that led to this:
I observed that sometimes not all of my event properties were written to the
job node, although the job node itself was created. The JobManager would then
eventually fail with the following message:
17.05.2013 16:41:57.578 *WARN* [Apache Sling Job Background Loader]
org.apache.sling.event.impl.jobs.JobManagerImpl Discarding job - job topic is
missing :
/var/eventing/jobs/assigned/826cd21a-6a8f-48cb-b112-768b421af572/slingevent:eventadmin/2013/5/17/16/39/com.adobe.cq.collection.update.job_826cd21a-6a8f-48cb-b112-768b421af572_2
Sometimes my job handler was called, but the event did not contain all of the
properties that I had sent to the JobManager.
Problem:
My code was adding a property with an invalid key (/a/b/c/a.txt) to the event,
and JcrResourceProvider cannot persist such a key. That was the trigger, and it
is fine; I can correct it.
The main problem is that this persistence error was never reported in the error
logs, and the job was persisted even though JcrResourceProvider.create threw a
PersistenceException. The job was created with fewer properties than I
intended. As a result, my JobHandler was sometimes called without all of the
properties it needed.
While debugging, I found that JobManagerImpl.writeJob can behave inconsistently
because of the way ResourceUtil.getOrCreateResource and
JcrResourceProvider.create interact.
In this case the following happened:
JcrResourceProvider.create threw a PersistenceException (PE) while persisting
the property, but the node had already been created by that time.
ResourceUtil.getOrCreateResource caught the PE, checked that the resource
existed, and therefore ignored the exception.
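
The sequence above can be illustrated with a small in-memory model. This is
only a sketch of the failure mode, not the Sling code: Store, StoreException,
create and getOrCreate are hypothetical stand-ins for the JCR-backed provider,
PersistenceException, JcrResourceProvider.create and
ResourceUtil.getOrCreateResource, and the exception is unchecked here only to
keep the example short.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PartialCreateDemo {

    /** Hypothetical stand-in for a persistence failure (not Sling's PersistenceException). */
    static class StoreException extends RuntimeException {
        StoreException(String message) {
            super(message);
        }
    }

    /** Hypothetical in-memory store mapping a path to its properties. */
    static class Store {
        final Map<String, Map<String, Object>> nodes = new LinkedHashMap<>();

        /** Non-atomic create: the node is added first, then properties one by one. */
        void create(String path, Map<String, Object> props) {
            Map<String, Object> node = new LinkedHashMap<>();
            nodes.put(path, node);
            for (Map.Entry<String, Object> e : props.entrySet()) {
                // Assumption for this sketch: a name containing '/' cannot be persisted.
                if (e.getKey().contains("/")) {
                    throw new StoreException("invalid property name: " + e.getKey());
                }
                node.put(e.getKey(), e.getValue());
            }
        }

        /** Swallows the failure whenever the node exists afterwards. */
        Map<String, Object> getOrCreate(String path, Map<String, Object> props) {
            if (nodes.containsKey(path)) {
                return nodes.get(path);
            }
            try {
                create(path, props);
            } catch (StoreException e) {
                if (!nodes.containsKey(path)) {
                    throw e;
                }
                // The node exists, so the error is silently dropped here.
            }
            return nodes.get(path);
        }
    }

    public static void main(String[] args) {
        Store store = new Store();
        Map<String, Object> props = new LinkedHashMap<>();
        props.put("payload", "x");
        props.put("/a/b/c/a.txt", "bad");       // invalid key, as in the report
        props.put("topic", "com/example/job");  // never written
        Map<String, Object> written = store.getOrCreate("/jobs/1", props);
        System.out.println(written.keySet());   // prints [payload]
    }
}
```

In this model the caller gets a node back without any error, yet every
property after the invalid key (including the topic) is missing.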
This behaviour is wrong: either JcrResourceProvider should ensure that the
operation is atomic, or ResourceUtil.getOrCreateResource should be changed to
revert the changes when an exception occurs.
I think that JcrResourceProvider should remove the node if addition of
properties fails.
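
A minimal sketch of that suggestion, again using a hypothetical in-memory
model rather than the real JcrResourceProvider: Store, StoreException and
createAtomically are illustrative names, and the rule that a property name
containing '/' is rejected is an assumption of the sketch. The point is only
that the half-created node is removed before the exception is rethrown.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class AtomicCreateDemo {

    /** Hypothetical stand-in for a persistence failure (not Sling's PersistenceException). */
    static class StoreException extends RuntimeException {
        StoreException(String message) {
            super(message);
        }
    }

    /** Hypothetical in-memory store mapping a path to its properties. */
    static class Store {
        final Map<String, Map<String, Object>> nodes = new LinkedHashMap<>();

        /** Creates the node and its properties, or leaves no trace on failure. */
        void createAtomically(String path, Map<String, Object> props) {
            Map<String, Object> node = new LinkedHashMap<>();
            nodes.put(path, node);
            try {
                for (Map.Entry<String, Object> e : props.entrySet()) {
                    if (e.getKey().contains("/")) {
                        throw new StoreException("invalid property name: " + e.getKey());
                    }
                    node.put(e.getKey(), e.getValue());
                }
            } catch (StoreException e) {
                nodes.remove(path); // roll back the partially created node
                throw e;            // and still report the failure to the caller
            }
        }
    }

    public static void main(String[] args) {
        Store store = new Store();
        Map<String, Object> props = new LinkedHashMap<>();
        props.put("payload", "x");
        props.put("/a/b/c/a.txt", "bad"); // invalid key, as in the report
        try {
            store.createAtomically("/jobs/1", props);
        } catch (StoreException e) {
            System.out.println("create failed: " + e.getMessage());
        }
        System.out.println(store.nodes.containsKey("/jobs/1")); // prints false
    }
}
```

With this shape, a caller that checks for the resource after a failed create
no longer finds a partial node, so the exception cannot be mistaken for a
harmless race and ignored.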