Amit Gupta created SLING-2874:
---------------------------------

             Summary: JcrResourceProvider.create/JobManagerImpl.writeJob can 
cause inconsistent behavior
                 Key: SLING-2874
                 URL: https://issues.apache.org/jira/browse/SLING-2874
             Project: Sling
          Issue Type: Bug
          Components: JCR
    Affects Versions: JCR Resource 2.2.10
            Reporter: Amit Gupta


I have been debugging an issue with Sling jobs and finally found that the 
current implementation of JobManagerImpl.writeJob can cause inconsistent 
behaviour (the root cause is JcrResourceProvider.create).

Issue that led to this:
I observed that sometimes not all of my event properties were written to the 
job node, even though the job node itself was created. Ultimately the 
JobManager would error out with the following message:
17.05.2013 16:41:57.578 *WARN* [Apache Sling Job Background Loader] 
org.apache.sling.event.impl.jobs.JobManagerImpl Discarding job - job topic is 
missing : 
/var/eventing/jobs/assigned/826cd21a-6a8f-48cb-b112-768b421af572/slingevent:eventadmin/2013/5/17/16/39/com.adobe.cq.collection.update.job_826cd21a-6a8f-48cb-b112-768b421af572_2
Sometimes my job handler would be called, but the event would be missing some 
of the properties that I had sent to the JobManager.
Problem:
There was an issue in my code: it was adding a property to the event with an 
invalid key, i.e. /a/b/c/a.txt, and JcrResourceProvider cannot persist such a 
key. That triggered the issue. This part is fine, I can correct it.

But the main problem is that this persistence error was never reported in the 
error logs, and the job got persisted even though JcrResourceProvider.create 
threw a PersistenceException. The job was created with fewer properties than I 
intended. As a result, my JobHandler was sometimes called without all the 
properties it needed.

While debugging, I found that JobManagerImpl.writeJob can behave 
inconsistently because of the way ResourceUtil.getOrCreateResource and 
JcrResourceProvider.create interact.

In this case the following happened:
JcrResourceProvider.create threw a PersistenceException (PE) while persisting 
the property, but the node had already been created by this time.
ResourceUtil.getOrCreateResource caught the PE, but then checked for the 
existence of the resource, found it, and hence ignored the exception.
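
To make the sequence above concrete, here is a minimal sketch in plain Java. 
It does not use the Sling or JCR APIs: the in-memory store, the StoreException 
class, the "/" check on property names, and the method names are all 
hypothetical stand-ins that only mimic the create-then-swallow pattern I 
describe.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class NonAtomicCreateDemo {

    /** Hypothetical stand-in for a persistence failure (unchecked for brevity). */
    static class StoreException extends RuntimeException {
        StoreException(String message) {
            super(message);
        }
    }

    /** Toy in-memory store: node path -> node properties. */
    private final Map<String, Map<String, Object>> store = new HashMap<>();

    /** Creates the node first, then writes properties one by one (not atomic). */
    void create(String path, Map<String, Object> props) {
        Map<String, Object> node = new HashMap<>();
        store.put(path, node); // the node exists from this point on
        for (Map.Entry<String, Object> e : props.entrySet()) {
            // Assumption for this sketch: names containing '/' cannot be persisted.
            if (e.getKey().contains("/")) {
                throw new StoreException("invalid property name: " + e.getKey());
            }
            node.put(e.getKey(), e.getValue());
        }
    }

    /** Swallows the failure whenever the node exists afterwards. */
    Map<String, Object> getOrCreate(String path, Map<String, Object> props) {
        try {
            create(path, props);
        } catch (StoreException e) {
            if (!store.containsKey(path)) {
                throw e;
            }
            // Node exists, so the error is silently ignored.
        }
        return store.get(path);
    }

    public static void main(String[] args) {
        NonAtomicCreateDemo demo = new NonAtomicCreateDemo();
        Map<String, Object> props = new LinkedHashMap<>(); // keeps insertion order
        props.put("topic", "my/job/topic");
        props.put("/a/b/c/a.txt", "some value"); // invalid key
        props.put("payload", "data");

        Map<String, Object> node = demo.getOrCreate("/jobs/job1", props);
        // The caller sees no error, yet "payload" was never written.
        System.out.println("persisted properties: " + node.keySet());
    }
}
```

The caller gets a node back and no exception, so nothing tells it that every 
property after the invalid one was dropped.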

The above implementation is wrong: either JcrResourceProvider should ensure 
that the operation is atomic, or ResourceUtil.getOrCreateResource should be 
changed to revert its changes in case of an exception.

I think that JcrResourceProvider should remove the node if adding the 
properties fails.
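
A rough sketch of that idea, again in plain Java rather than against the real 
Sling or JCR APIs: the store, StoreException, the "/" check, and the method 
names are hypothetical, and the sketch assumes the path does not exist before 
create is called. The point is only that a failed property write removes the 
half-created node before the exception is rethrown.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class AtomicCreateSketch {

    /** Hypothetical stand-in for a persistence failure (unchecked for brevity). */
    static class StoreException extends RuntimeException {
        StoreException(String message) {
            super(message);
        }
    }

    /** Toy in-memory store: node path -> node properties. */
    private final Map<String, Map<String, Object>> store = new HashMap<>();

    /** Creates the node and its properties; removes the node again on failure. */
    void create(String path, Map<String, Object> props) {
        Map<String, Object> node = new HashMap<>();
        store.put(path, node);
        try {
            for (Map.Entry<String, Object> e : props.entrySet()) {
                // Assumption for this sketch: names containing '/' cannot be persisted.
                if (e.getKey().contains("/")) {
                    throw new StoreException("invalid property name: " + e.getKey());
                }
                node.put(e.getKey(), e.getValue());
            }
        } catch (StoreException e) {
            store.remove(path); // roll back the half-created node
            throw e;            // and let the caller see the failure
        }
    }

    boolean exists(String path) {
        return store.containsKey(path);
    }

    public static void main(String[] args) {
        AtomicCreateSketch sketch = new AtomicCreateSketch();
        Map<String, Object> props = new LinkedHashMap<>();
        props.put("topic", "my/job/topic");
        props.put("/a/b/c/a.txt", "some value"); // invalid key

        try {
            sketch.create("/jobs/job1", props);
        } catch (StoreException e) {
            System.out.println("create failed: " + e.getMessage());
        }
        System.out.println("node exists: " + sketch.exists("/jobs/job1"));
    }
}
```

With this behaviour an existence check after the exception no longer finds a 
partial node, so a caller like getOrCreateResource would have to propagate the 
error instead of ignoring it.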

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
