I find that anything that is not time-based does not work when, as you said, the numbers get large. I added the createtime to the conditions; it is currently set in milliseconds.
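
As a rough illustration of a time-based condition combined with the batch cap Brett suggests below, here is a minimal in-memory Java sketch. It is not OFBiz code: `BoundedPoller`, `Job`, `pollBatch`, `cutoffMillis`, and `maxPollJobs` are hypothetical names, and a plain list stands in for the JobSandbox table. It only shows the idea of claiming a bounded, oldest-first slice of pending jobs per poll instead of every pending row.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: a list stands in for the JobSandbox table.
class BoundedPoller {

    // Hypothetical stand-in for one JobSandbox row; not the OFBiz entity.
    static final class Job {
        final String jobId;
        final long createTimeMillis; // assumed creation timestamp, in milliseconds
        String status = "PENDING";

        Job(String jobId, long createTimeMillis) {
            this.jobId = jobId;
            this.createTimeMillis = createTimeMillis;
        }
    }

    /**
     * Claims at most maxPollJobs pending jobs created at or before cutoffMillis,
     * oldest first, and marks them RUNNING. Jobs outside the window or beyond
     * the cap stay PENDING for a later poll.
     */
    static List<Job> pollBatch(List<Job> table, long cutoffMillis, int maxPollJobs) {
        List<Job> eligible = new ArrayList<>();
        for (Job job : table) {
            if ("PENDING".equals(job.status) && job.createTimeMillis <= cutoffMillis) {
                eligible.add(job);
            }
        }
        // Oldest jobs first, so a large backlog drains in creation order.
        eligible.sort(Comparator.comparingLong((Job j) -> j.createTimeMillis));

        int limit = Math.min(Math.max(maxPollJobs, 0), eligible.size());
        List<Job> claimed = new ArrayList<>(eligible.subList(0, limit));
        for (Job job : claimed) {
            job.status = "RUNNING";
        }
        return claimed;
    }

    public static void main(String[] args) {
        List<Job> table = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            table.add(new Job("job" + i, 1000L + i));
        }
        // Each poll touches at most 3 rows, however many are pending.
        List<Job> batch = pollBatch(table, 1005L, 3);
        System.out.println("claimed " + batch.size() + " jobs, first is " + batch.get(0).jobId);
    }
}
```

In a real poller the filter and the cap would belong in the database query itself, so that each transaction only ever updates a bounded number of rows.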
Brett Palmer sent the following on 7/14/2011 5:35 AM:
> One feature that would help to prevent this problem in the future is a
> configuration parameter in the service engine that would set the maximum
> number of jobs the poller would process at a time. Right now the poller
> reads the JobSandbox and gets every job that has a status of Pending. Then
> it tries to change the status for each of these to running (or something
> like that). If the number of pending jobs is too large the poller will time
> out before it can change the state of all the pending jobs. Changing the
> transaction timeout can help this problem but having another configuration
> like "max-poll-jobs" could limit the number of pending jobs that are
> processed in one transaction. There is a configuration called "jobs" but I
> don't think that is used by the polling process.
>
> I've tried to use the service engine as an asynchronous batch server but run
> into problems when the number of pending jobs gets around 10,000.
>
>
> Brett
>
> On Wed, Jul 13, 2011 at 10:34 PM, BJ Freeman <bjf...@free-man.net> wrote:
>
>> You're going to run into this from time to time for one reason or another.
>> The approach I took was to spread the jobs out so they are not lumped
>> together.
>> Take a look at how the jobs are marshalled to be run.
>>
>> Josh Jacobson sent the following on 7/13/2011 8:35 PM:
>>> Vacuum has been run (took quite a while). Yeah, I see now that the
>>> JobManager actually tries to update all the JobSandbox rows in the
>>> transaction, so 60 seconds was pretty low.
>>>
>>> I am trying 10 minutes now and will see how that goes.
>>>
>>> I am using Postgres, by the way.
>>>
>>> Thanks for the help, I really appreciate it.
>>>
>>> --
>>> Josh.
>>>
>>> On Wed, Jul 13, 2011 at 8:29 PM, Scott Gray <scott.g...@hotwaxmedia.com> wrote:
>>>> Not sure what db you're using but it probably wouldn't hurt to run a
>>>> vacuum on the table to speed up processing.
>>>>
>>>> By the way, I'm pretty sure the default timeout is 60 seconds so you
>>>> might want to try something a little larger :-)
>>>>
>>>> Regards
>>>> Scott
>>>>
>>>> On 14/07/2011, at 2:58 PM, Josh Jacobson wrote:
>>>>
>>>>> I tried 60 seconds for timeout but that didn't work. I guess I'll
>>>>> double it now and keep trying.
>>>>>
>>>>> I have about 260,000 pending jobs, and nothing is getting done.
>>>>>
>>>>> I know what you mean about purgeOldJobs. That service has crashed now
>>>>> and I deleted old jobs from the database by hand. I was up to 2.6
>>>>> million rows. Ofbiz was pretty much unusable.
>>>>>
>>>>> If you have any other suggestions I'd love to hear them.
>>>>>
>>>>> On Wednesday, July 13, 2011, Scott Gray <scott.g...@hotwaxmedia.com> wrote:
>>>>>> Ah okay, that is entirely dependent on the number of jobs and the
>>>>>> speed the server can process them. As a side note I would keep a close
>>>>>> eye on the purgeOldJobs service, when it starts falling over
>>>>>> (transaction timeout again) then the number of rows in the table will
>>>>>> increase quickly which in turn will slow down polling.
>>>>>>
>>>>>> In general the whole persisted jobs implementation is a bit fragile,
>>>>>> especially when dealing with a large number of jobs. I've wanted to
>>>>>> replace it with something like quartz for a while but haven't had the
>>>>>> time.
>>>>>>
>>>>>> Regards
>>>>>> Scott
>>>>>>
>>>>>> On 14/07/2011, at 2:10 PM, Josh Jacobson wrote:
>>>>>>
>>>>>>> Thanks again. I actually meant a suggestion for the transaction
>>>>>>> timeout. In any case I am grateful for your explanation.
>>>>>>>
>>>>>>>
>>>>>>> On Wednesday, July 13, 2011, Scott Gray <scott.g...@hotwaxmedia.com> wrote:
>>>>>>>> As best I can tell there shouldn't be any need to increase the
>>>>>>>> interval between polls since the interval timer doesn't actually
>>>>>>>> start until the previous poll has completed (see JobPoller.run()) so
>>>>>>>> I can't see how a small interval would cause any backlog problems.
>>>>>>>>
>>>>>>>> I'm guessing if there is any lock contention then it's probably
>>>>>>>> caused by the executing jobs trying to update their respective rows
>>>>>>>> while the poller is holding a table lock. So from that point of view
>>>>>>>> I guess increasing the interval could reduce the amount of contention
>>>>>>>> between the executing jobs and the next poll.
>>>>>>>>
>>>>>>>> Regards
>>>>>>>> Scott
>>>>>>>>
>>>>>>>> On 14/07/2011, at 1:02 PM, Josh Jacobson wrote:
>>>>>>>>
>>>>>>>>> Scott,
>>>>>>>>>
>>>>>>>>> Thanks! That is very precise advice. Do you have a suggestion on
>>>>>>>>> interval time? 60 seconds? 120?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> On Wed, Jul 13, 2011 at 5:34 PM, Scott Gray <scott.g...@hotwaxmedia.com> wrote:
>>>>>>>>>> That configuration is for the frequency of job polls. There isn't
>>>>>>>>>> any ability to specify the transaction timeout via configuration so
>>>>>>>>>> you'll need to modify the code directly:
>>>>>>>>>> JobManager.java (line 148):
>>>>>>>>>> beganTransaction = TransactionUtil.begin();
>>>>>>>>>> needs to be changed to use TransactionUtil.begin(int)
>>>>>>>>>>
>>>>>>>>>> Regards
>>>>>>>>>> Scott
>>>>>>>>>>
>>>>>>>>>> HotWax Media
>>>>>>>>>> http://www.hotwaxmedia.com
>>>>>>>>>>
>>>>>>>>>> On 14/07/2011, at 12:23 PM, Josh Jacobson wrote:
>>>>>>>>>>
>>>>>>>>>>> Brett,
>>>>>>>>>>>
>>>>>>>>>>> Before I start trying to run the jobs manually, I want to give your
>>>>>>>>>>> suggestion a try. I think I know where to configure the job polling
>>>>>>>>>>> transaction time (I believe it's the poll-db-millis="20000" value in
>>>>>>>>>>> the framework/service/config/serviceengine.xml).
>>>>>>>>>>>
>>>>>>>>>>> However, I still don't know what to increase it to. I understand that
>>>>>>>>>>> we wouldn't want to make it bigger than the default polling interval.
>>>>>>>>>>> Do you know what the default interval between polling is?
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Jul 13, 2011 at 12:31 PM, Brett Palmer <brettgpal...@gmail.com> wrote:
>>>>>>>>>>>> I meant removing finished jobs. If you have thousands of pending
>>>>>>>>>>>> jobs then you will have the same problem I mentioned in my first
>>>>>>>>>>>> email. One resolution will be to increase the job poller
>>>>>>>>>>>> transaction time. In the ofbiz version I was using there was not a
>>>>>>>>>>>> way to configure the poller transaction time. It just used the
>>>>>>>>>>>> default time. I had to create a patch to allow this to happen.
>>>>>>>>>>>>
>>>>>>>>>>>> In the patch you had to be careful to not increase the transaction
>>>>>>>>>>>> time greater than the frequency of the job poller. Otherwise you
>>>>>>>>>>>> get into a lock situation where one job poller is still running
>>>>>>>>>>>> within a transaction and another poller starts. This didn't create
>>>>>>>>>>>> a huge problem but the second job poller would usually lock and
>>>>>>>>>>>> then time out.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Brett
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Jul 13, 2011 at 1:15 PM, Josh Jacobson <josh.s.jacob...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Brett,