On 8/23/2012 4:42 PM, Brett Palmer wrote:
*Adrian,

Thanks for the information.  Please see my questions inline:*

On Thu, Aug 23, 2012 at 6:24 AM, Adrian Crum <
[email protected]> wrote:

On 8/23/2012 8:46 AM, Adrian Crum wrote:

On 8/22/2012 7:04 PM, Brett Palmer wrote:

We need this functionality for our data warehouse processing.  We try to
provide real time reports, but our database cannot handle a high number of
data warehouse updates during heavy loads.  By configuring only one server
to service a particular pool, we can limit the number of concurrent
processes running those services.


         <thread-pool send-to-pool="pool"
                      purge-job-days="4"
                      failed-retry-min="3"
                      ttl="120000"
                      jobs="100"
                      min-threads="2"
                      max-threads="5"
                      wait-millis="1000"
                      poll-enabled="true"
                      poll-db-millis="30000">
             <run-from-pool name="pool"/>
             <run-from-pool name="dwPool"/>
         </thread-pool>


That configuration will work. That server will service the two pools.

I forgot to mention: if you're running lots of jobs, then you will want to
increase the jobs (queue size) value. You mentioned in another thread that
your application will run up to 10,000 jobs - in that case you should
increase the jobs value to 1000 or more. The queue size affects memory use,
so there is a trade-off between responsiveness and memory.
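For example, keeping everything else from the snippet above and only raising the queue size might look like this (the jobs="1000" value is illustrative, not a recommendation - size it to your job volume and heap):

```xml
<!-- Illustrative only: the same <thread-pool> element as above, with the
     in-memory queue enlarged for a high job volume. Each queued job costs
     memory, so raise jobs and max-threads together with care. -->
<thread-pool send-to-pool="pool"
             purge-job-days="4"
             failed-retry-min="3"
             ttl="120000"
             jobs="1000"
             min-threads="2"
             max-threads="5"
             wait-millis="1000"
             poll-enabled="true"
             poll-db-millis="30000">
    <run-from-pool name="pool"/>
    <run-from-pool name="dwPool"/>
</thread-pool>
```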


*Thanks for the information that is very helpful.*



The potential problem with the Job Poller (before and after the overhaul)
is with asynchronous service calls (not scheduled jobs). When you run an
async service, the service engine converts the service call to a job and
places it in the queue. It is not persisted like scheduled jobs. If the Job
Poller has just filled the queue with scheduled jobs, then there is no room
for async services, and any attempt to queue an async service will fail
(throws an exception "Unable to queue job").
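The failure mode is the same as a bounded java.util.concurrent queue rejecting work when it is full. This standalone sketch (plain JDK, not OFBiz code - the class and method names are mine) reproduces it: with the only worker thread busy, a capacity-2 queue accepts exactly two more submissions and rejects the rest.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class BoundedQueueDemo {
    // Submits 'attempts' no-op tasks to a pool whose single worker is blocked,
    // so every accepted task must fit in the bounded queue. Returns how many
    // were accepted before rejection - the analogue of "Unable to queue job".
    public static int countQueued(int queueCapacity, int attempts) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 60, TimeUnit.SECONDS, new LinkedBlockingQueue<>(queueCapacity));
        CountDownLatch block = new CountDownLatch(1);
        // Occupy the single worker so later submissions must go to the queue.
        pool.execute(() -> {
            try { block.await(); } catch (InterruptedException ignored) { }
        });
        int accepted = 0;
        for (int i = 0; i < attempts; i++) {
            try {
                pool.execute(() -> { });   // stand-in for one async service call
                accepted++;
            } catch (RejectedExecutionException e) {
                // Queue full: in OFBiz this is where the exception is thrown.
            }
        }
        block.countDown();
        pool.shutdown();
        return accepted;
    }

    public static void main(String[] args) {
        System.out.println(countQueued(2, 5)); // prints 2
    }
}
```

Just as here, once scheduled jobs have filled the JobPoller's queue, an async service call has nowhere to go.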


*I assume the “queue” is an in-memory queue, not the same thing as the
JobSandbox pool stored in the database - which is why there is a limit on
the queue.  Let me know if that assumption is not correct.

That is correct. The queue size limit was put there to prevent the Job Scheduler from saturating or crashing the server.

During a polling interval, the Job Manager fills the queue with jobs that are scheduled to run. Any jobs that don't fit in the queue will be queued during the next polling interval. Queue service threads run the queued jobs. Creating too many queue service threads will slow down queue throughput because of thread maintenance overhead. So, there are some parameters for users to tweak, and they interact with each other, but the overall objective is to configure the Job Scheduler so that it has good throughput but doesn't run out of control and swamp the server.
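That fill-and-drain cycle can be sketched in a few lines (plain Java, not the actual JobPoller implementation - the names here are hypothetical):

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.Queue;

public class PollSketch {
    // One polling pass: move as many due jobs as fit into the bounded
    // in-memory queue. Anything left over stays behind (in OFBiz it stays
    // persisted in JobSandbox) and is picked up on the next pass.
    public static int pollOnce(Deque<String> dueJobs, Queue<String> queue, int capacity) {
        int moved = 0;
        while (!dueJobs.isEmpty() && queue.size() < capacity) {
            queue.add(dueJobs.poll());
            moved++;
        }
        return moved;
    }

    public static void main(String[] args) {
        Deque<String> due = new ArrayDeque<>(List.of("j1", "j2", "j3", "j4", "j5"));
        Queue<String> queue = new ArrayDeque<>();
        System.out.println(pollOnce(due, queue, 3)); // prints 3: queue filled to capacity
        queue.clear();                               // worker threads drained the queue
        System.out.println(pollOnce(due, queue, 3)); // prints 2: leftovers queued next pass
    }
}
```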


If you run an async service and set the “persist” option to true will you
still hit the Job Poller limit or will the job be persisted and run when
the Job Poller has sufficient resources?*

The async service will be persisted as a job scheduled to run now. The job will be in the pool specified in the <thread-pool> send-to-pool attribute.



I designed the new code so the service engine can check for that
possibility, but I didn't change the service engine behavior. Instead,
users should configure their <thread-pool> element(s) and applications
carefully. For example, if your application schedules lots of jobs, then
design it so that it schedules no more than (queue size - n) jobs at a
time, to leave room for async services. Another option would be to have
a server dedicated to servicing scheduled jobs - that way the potential
clash with async services is not an issue.


*I wasn’t aware that the same queue was shared between async jobs and
scheduled jobs - thanks again for the update.

We like the idea of dedicating an app server to service specific scheduled
jobs as it controls the number of concurrent processes we run in
production.

I’m still curious why the service engine dispatcher does not have an API to
run an async service in a specified “pool”.  This seems like a simple
addition, since there is an API to schedule a job to run in a specific pool.
I understand there is the potential this could fail if the queue is full
(unless my question above about the persisted job is a possible
workaround).


If persist is true, then the async service will be assigned to the pool specified in the <thread-pool> send-to-pool attribute. If persist is false, then specifying a job pool would have no effect.

We could create an "async-service-only" queue that would be unaffected by persisted jobs, but it can still be overrun. That's why I changed the code to allow the service engine to check for that possibility. I don't know what OFBiz should do by default in those scenarios, so I thought it best to leave the async service behavior the same (an exception is thrown). In other words, we could create the extra queue to give users a warm fuzzy feeling, but the same basic problem will still exist. I believe it is best to make it clear that, because of their nature, non-persisted async services are not guaranteed to run.



From the information you provided, here is how we would likely use the new
changes with the service engine and Job Poller:

Background: Our application is an online testing application with multiple
ofbiz servers and a single ofbiz data warehouse.  Tests are taken on the
dedicated app servers and when the test is done a data warehouse process
picks up the tests and processes them for the data warehouse reports.  The
reports are near real time but during heavy testing periods we want to
limit how many concurrent warehouse processes are running.  Here are the
steps in the process:

1. Configure a limited number of ofbiz servers to process scheduled data
warehouse jobs that are submitted to a specific job pool (i.e. dwPool).

2. When a person has completed a test the application creates a scheduled
job with a current timestamp for when the service should be run.  The
scheduled job would be assigned to the “dwPool”.  The servers configured in
item 1 above would then process these jobs.

That sounds like a good strategy. An improvement would be to have the job servers service all pools. In that configuration the online testing application servers would have the <thread-pool> poll-enabled attribute set to "false" - so they will not run any jobs themselves. The only bottleneck would be the data source - and that bottleneck can be fixed by putting the JobSandbox entity on a separate data source and using a "jobs only" delegator.
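Assuming that layout, the <thread-pool> on each online-testing server might look something like this (the attribute values are illustrative; the key part is poll-enabled="false", so the server can still create and persist jobs without ever running any):

```xml
<!-- Hypothetical config for an online-testing server: jobs it creates
     default to dwPool, but with polling disabled this server never pulls
     jobs from the database to run them itself. -->
<thread-pool send-to-pool="dwPool"
             purge-job-days="4"
             failed-retry-min="3"
             ttl="120000"
             jobs="100"
             min-threads="2"
             max-threads="5"
             wait-millis="1000"
             poll-enabled="false"
             poll-db-millis="30000"/>
```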


The above steps allow us to scale our solution horizontally by adding more
ofbiz servers to handle online testing as needed.  We are still able to
handle near real time reporting as we have dedicated servers assigned to
process data warehouse requests.  During light testing days the warehouse
scheduled jobs process almost immediately and during heavy testing days
they lag slightly depending on the service request rate.

Question:

If a scheduled job is set with a current timestamp for the “startTime”, but
the JobPoller is behind because of a large number of scheduled service
requests, will the JobPoller still pick up the scheduled job according to
the order of startTime?

Here is a specific example:

Current time:  Aug. 23, 10:00AM

- A scheduled job is created with a start time of Aug. 23, 10:00AM
- The JobPoller finishes processing the current queue of jobs at Aug. 23, 10:05AM
- The JobPoller queries the database for the next list of jobs to process.

Question: Will it pick up the jobs scheduled for Aug. 23, 10:00AM even
though the current time is past that time?

Yes, the Job Manager will retrieve all jobs scheduled to start prior to now.
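In other words, the selection is effectively "every job whose start time is at or before now, oldest first" - a job scheduled for 10:00 is still picked up at 10:05. A toy illustration (plain Java, not the Job Manager's actual query; names are mine):

```java
import java.time.Instant;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class DuePoll {
    // Hypothetical poll: select every job whose start time is at or before
    // 'now', oldest first. Jobs whose start time has already passed are
    // still selected - they are simply late, not skipped.
    public static List<String> due(Map<String, Instant> jobs, Instant now) {
        return jobs.entrySet().stream()
                .filter(e -> !e.getValue().isAfter(now))   // startTime <= now
                .sorted(Map.Entry.comparingByValue())      // oldest first
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2012-08-23T10:05:00Z");
        Map<String, Instant> jobs = Map.of(
                "dwJob1", Instant.parse("2012-08-23T10:00:00Z"),
                "futureJob", Instant.parse("2012-08-23T11:00:00Z"));
        System.out.println(due(jobs, now)); // prints [dwJob1]
    }
}
```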

-Adrian

