Hi Devs,

I have been debugging an issue with Sling jobs and finally found that the
current implementation of JobManagerImpl.writeJob can cause inconsistent
behaviour (the root cause is in JcrResourceProvider.create).

Issue that led to this:
I observed that sometimes not all of my event properties were written to the
job node, although the job node itself was created. Ultimately JobManager
would error out with the following message:
17.05.2013 16:41:57.578 *WARN* [Apache Sling Job Background Loader] 
org.apache.sling.event.impl.jobs.JobManagerImpl Discarding job - job topic is 
missing : 
/var/eventing/jobs/assigned/826cd21a-6a8f-48cb-b112-768b421af572/slingevent:eventadmin/2013/5/17/16/39/com.adobe.cq.collection.update.job_826cd21a-6a8f-48cb-b112-768b421af572_2
Sometimes my job handler would be called, but the event would not have all of
the properties that I sent to JobManager.
Problem:
There was an issue in my code: it added a property to the event with an
invalid key (/a/b/c/a.txt), which JcrResourceProvider cannot persist.
Hence the issue. This is fine, I can correct it.

But the main problem is that this persistence error was never reported in the
error logs, and the job got persisted even though JcrResourceProvider.create
threw a PersistenceException. The job was created with fewer properties than I
intended. As a result, my JobHandler was sometimes called without all of the
properties I had sent.

While debugging, I found that JobManagerImpl.writeJob can behave
inconsistently because of the way ResourceUtil.getOrCreateResource and
JcrResourceProvider.create interact.

In this case the following happened:
JcrResourceProvider.create threw a PersistenceException (PE) while persisting
the property, but the node had already been created by this time.
ResourceUtil.getOrCreateResource caught the PE, but then checked for the
existence of the resource and, since it existed, ignored the exception.
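
To make the sequence above concrete, here is a minimal, self-contained sketch
of the pattern as I understand it. It does not use the real Sling classes: the
in-memory store, the PersistenceException stand-in, the "contains a slash"
validity check, and the property names other than /a/b/c/a.txt are all
assumptions made for illustration only.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class PartialCreateDemo {

    /** Simplified stand-in for Sling's PersistenceException (assumption). */
    static class PersistenceException extends Exception {
        PersistenceException(String message) { super(message); }
    }

    /** In-memory stand-in for the repository: path -> node properties. */
    static final Map<String, Map<String, Object>> store = new HashMap<>();

    /** Non-atomic create: the node is added first, then properties one by one. */
    static void create(String path, Map<String, Object> props) throws PersistenceException {
        Map<String, Object> node = new HashMap<>();
        store.put(path, node); // node exists from here on, even if we fail below
        for (Map.Entry<String, Object> e : props.entrySet()) {
            // Crude validity check, standing in for the real name validation.
            if (e.getKey().contains("/")) {
                throw new PersistenceException("Invalid property name: " + e.getKey());
            }
            node.put(e.getKey(), e.getValue());
        }
    }

    /** Swallows the exception whenever the node exists after the failure. */
    static Map<String, Object> getOrCreate(String path, Map<String, Object> props)
            throws PersistenceException {
        Map<String, Object> existing = store.get(path);
        if (existing != null) {
            return existing;
        }
        try {
            create(path, props);
        } catch (PersistenceException pe) {
            if (!store.containsKey(path)) {
                throw pe;
            }
            // Node exists, so the failure is silently dropped here.
        }
        return store.get(path);
    }

    /** Runs the scenario and returns whatever ended up on the node. */
    static Map<String, Object> runDemo() {
        store.clear();
        Map<String, Object> props = new LinkedHashMap<>(); // keeps insertion order
        props.put("first", "written before the failure");
        props.put("/a/b/c/a.txt", "invalid key");
        props.put("topic", "never written");
        try {
            return getOrCreate("/jobs/demo", props);
        } catch (PersistenceException pe) {
            throw new IllegalStateException(pe);
        }
    }

    public static void main(String[] args) {
        System.out.println("Persisted properties: " + runDemo());
    }
}
```

In this sketch the caller gets a node back without any error, but only the
property written before the invalid key is present, and "topic" is missing.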

I think the above behaviour is wrong: either JcrResourceProvider should ensure
that the operation is atomic, or ResourceUtil.getOrCreateResource should be
changed to revert its changes in case of an exception.
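
As a rough illustration of the second option (reverting on failure), here is a
self-contained sketch. Again this is not the real Sling code: the in-memory
store, the PersistenceException stand-in, and the validity check are
assumptions, and an actual fix would need to revert through the resource
resolver or JCR session rather than a map. It also ignores the case where
another thread creates the same resource concurrently.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class RevertOnFailureDemo {

    /** Simplified stand-in for Sling's PersistenceException (assumption). */
    static class PersistenceException extends Exception {
        PersistenceException(String message) { super(message); }
    }

    /** In-memory stand-in for the repository: path -> node properties. */
    static final Map<String, Map<String, Object>> store = new HashMap<>();

    /** Non-atomic create, same shape as the problem described above. */
    static void create(String path, Map<String, Object> props) throws PersistenceException {
        Map<String, Object> node = new HashMap<>();
        store.put(path, node);
        for (Map.Entry<String, Object> e : props.entrySet()) {
            if (e.getKey().contains("/")) { // crude stand-in for name validation
                throw new PersistenceException("Invalid property name: " + e.getKey());
            }
            node.put(e.getKey(), e.getValue());
        }
    }

    /** On failure, removes the half-written node and reports the error. */
    static Map<String, Object> getOrCreate(String path, Map<String, Object> props)
            throws PersistenceException {
        Map<String, Object> existing = store.get(path);
        if (existing != null) {
            return existing;
        }
        try {
            create(path, props);
        } catch (PersistenceException pe) {
            store.remove(path); // revert the partial write
            throw pe;           // let the caller see and log the failure
        }
        return store.get(path);
    }

    /** Returns true if the failure is reported and no partial node is left. */
    static boolean failsCleanly() {
        store.clear();
        Map<String, Object> props = new LinkedHashMap<>();
        props.put("first", "value");
        props.put("/a/b/c/a.txt", "invalid key");
        try {
            getOrCreate("/jobs/demo", props);
            return false; // should not get here: the invalid key must fail
        } catch (PersistenceException pe) {
            return !store.containsKey("/jobs/demo");
        }
    }

    public static void main(String[] args) {
        System.out.println("Failed cleanly: " + failsCleanly());
    }
}
```

With this shape the caller either gets a fully written resource or an
exception, so a job with missing properties would never be picked up.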

WDYT?

Thanks,
Amit
