Re: JobManager failing to schedule jobs

BJ Freeman Wed, 13 Jul 2011 21:35:41 -0700

You are going to run into this from time to time, for one reason or another.
The approach I took was to spread the jobs out so they are not lumped
together.
Take a look at how the jobs are marshalled to be run.
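
As a rough illustration of spreading jobs out, here is a minimal sketch that
computes evenly spaced start times instead of giving every job the same one.
JobStagger and staggeredStartTimes are made-up names for this example, not
part of OFBiz, and the 5 minute gap is an arbitrary assumption; how the
resulting times are passed to the scheduler is left out.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not an OFBiz class): spreads job start times apart
// so that a batch of jobs does not all become due in the same poll.
final class JobStagger {

    // Returns jobCount start times: first, first + gap, first + 2 * gap, ...
    static List<Instant> staggeredStartTimes(Instant first, int jobCount, Duration gap) {
        if (jobCount < 0 || gap.isNegative()) {
            throw new IllegalArgumentException("jobCount and gap must be non-negative");
        }
        List<Instant> times = new ArrayList<>(jobCount);
        for (int i = 0; i < jobCount; i++) {
            times.add(first.plus(gap.multipliedBy(i)));
        }
        return times;
    }

    public static void main(String[] args) {
        // Assumed values for the demo: four jobs, five minutes apart.
        Instant first = Instant.parse("2011-07-14T00:00:00Z");
        for (Instant t : staggeredStartTimes(first, 4, Duration.ofMinutes(5))) {
            System.out.println(t);
        }
    }
}
```

Each computed time would then be used as the start time of one scheduled job,
so a single poll only ever sees a few of them as due.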


Josh Jacobson sent the following on 7/13/2011 8:35 PM:
> Vacuum has been run (took quite a while). Yeah, I see now that the
> JobManager actually tries to update all the JobSandbox rows in the
> transaction, so 60 seconds was pretty low.
> 
> I am trying 10 minutes now and will see how that goes.
> 
> I am using Postgres, by the way.
> 
> Thanks for the help, I really appreciate it.
> 
> --
> Josh.
> 
> On Wed, Jul 13, 2011 at 8:29 PM, Scott Gray <scott.g...@hotwaxmedia.com> 
> wrote:
>> Not sure what db you're using but it probably wouldn't hurt to run a vacuum 
>> on the table to speed up processing.
>>
>> By the way, I'm pretty sure the default timeout is 60 seconds so you might 
>> want to try something a little larger :-)
>>
>> Regards
>> Scott
>>
>> On 14/07/2011, at 2:58 PM, Josh Jacobson wrote:
>>
>>> I tried 60 seconds for timeout but that didn't work. I guess I'll
>>> double it now and keep trying.
>>>
>>> I have about 260,000 pending jobs, and nothing is getting done.
>>>
>>> I know what you mean about purgeOldJobs. That service has crashed now
>>> and I deleted old jobs from the database by hand. I was up to 2.6
>>> million rows. OFBiz was pretty much unusable.
>>>
>>> If you have any other suggestions I'd love to hear them.
>>>
>>> On Wednesday, July 13, 2011, Scott Gray <scott.g...@hotwaxmedia.com> wrote:
>>>> Ah okay, that is entirely dependent on the number of jobs and the speed 
>>>> the server can process them.  As a side note I would keep a close eye on 
>>>> the purgeOldJobs service, when it starts falling over (transaction timeout 
>>>> again) then the number of rows in the table will increase quickly which in 
>>>> turn will slow down polling.
>>>>
>>>> In general the whole persisted jobs implementation is a bit fragile, 
>>>> especially when dealing with a large number of jobs.  I've wanted to 
>>>> replace it with something like quartz for a while but haven't had the time.
>>>>
>>>> Regards
>>>> Scott
>>>>
>>>> On 14/07/2011, at 2:10 PM, Josh Jacobson wrote:
>>>>
>>>>> Thanks again. I actually meant a suggestion for the transaction
>>>>> timeout. In any case I am grateful for your explanation.
>>>>>
>>>>>
>>>>> On Wednesday, July 13, 2011, Scott Gray <scott.g...@hotwaxmedia.com> 
>>>>> wrote:
>>>>>> As best I can tell there shouldn't be any need to increase the interval 
>>>>>> between polls since the interval timer doesn't actually start until the 
>>>>>> previous poll has completed (see JobPoller.run()) so I can't see how a 
>>>>>> small interval would cause any backlog problems.
>>>>>>
>>>>>> I'm guessing if there is any lock contention then it's probably caused 
>>>>>> by the executing jobs trying to update their respective rows while the 
>>>>>> poller is holding a table lock.  So from that point of view I guess 
>>>>>> increasing the interval could reduce the amount of contention between 
>>>>>> the executing jobs and the next poll.
>>>>>>
>>>>>> Regards
>>>>>> Scott
>>>>>>
>>>>>> On 14/07/2011, at 1:02 PM, Josh Jacobson wrote:
>>>>>>
>>>>>>> Scott,
>>>>>>>
>>>>>>> Thanks! That is very precise advice. Do you have a suggestion on
>>>>>>> interval time? 60 seconds? 120?
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> On Wed, Jul 13, 2011 at 5:34 PM, Scott Gray 
>>>>>>> <scott.g...@hotwaxmedia.com> wrote:
>>>>>>>> That configuration is for the frequency of job polls.  There isn't any 
>>>>>>>> ability to specify the transaction timeout via configuration so you'll 
>>>>>>>> need to modify the code directly:
>>>>>>>> JobManager.java (line 148):
>>>>>>>> beganTransaction = TransactionUtil.begin();
>>>>>>>> needs to be changed to use TransactionUtil.begin(int)
>>>>>>>>
>>>>>>>> Regards
>>>>>>>> Scott
>>>>>>>>
>>>>>>>> HotWax Media
>>>>>>>> http://www.hotwaxmedia.com
>>>>>>>>
>>>>>>>> On 14/07/2011, at 12:23 PM, Josh Jacobson wrote:
>>>>>>>>
>>>>>>>>> Brett,
>>>>>>>>>
>>>>>>>>> Before I start trying to run the jobs manually, I want to give your
>>>>>>>>> suggestion a try. I think I know where to configure the job polling
>>>>>>>>> transaction time (I believe it's the poll-db-millis="20000" value in
>>>>>>>>> framework/service/config/serviceengine.xml).
>>>>>>>>>
>>>>>>>>> However, I still don't know what to increase it to. I understand that
>>>>>>>>> we wouldn't want to make it bigger than the default polling interval.
>>>>>>>>> Do you know what the default interval between polling is?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> On Wed, Jul 13, 2011 at 12:31 PM, Brett Palmer 
>>>>>>>>> <brettgpal...@gmail.com> wrote:
>>>>>>>>>> I meant removing finished jobs.  If you have thousands of pending jobs
>>>>>>>>>> then you will have the same problem I mentioned in my first email.  One
>>>>>>>>>> resolution will be to increase the job poller transaction time.  In the
>>>>>>>>>> ofbiz version I was using there was not a way to configure the poller
>>>>>>>>>> transaction time.  It just used the default time.  I had to create a
>>>>>>>>>> patch to allow this to happen.
>>>>>>>>>>
>>>>>>>>>> In the patch you had to be careful not to increase the transaction time
>>>>>>>>>> beyond the job poller's polling interval.  Otherwise you get into a lock
>>>>>>>>>> situation where one job poller is still running within a transaction and
>>>>>>>>>> another poller starts.  This didn't create a huge problem but the second
>>>>>>>>>> job poller would usually lock and then time out.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Brett
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Jul 13, 2011 at 1:15 PM, Josh Jacobson 
>>>>>>>>>> <josh.s.jacob...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Brett,
>>>>>>>>>>>
>>>>>>>>>>>
