*Adrian, thanks for the information. Please see my questions inline:*
On Thu, Aug 23, 2012 at 6:24 AM, Adrian Crum <[email protected]> wrote:

> On 8/23/2012 8:46 AM, Adrian Crum wrote:
>
>> On 8/22/2012 7:04 PM, Brett Palmer wrote:
>>
>>> We need this functionality for our data warehouse processing. We try to
>>> provide real-time reports, but our database cannot handle a high number
>>> of data warehouse updates during heavy loads. By configuring only one
>>> server to service a particular pool, we can limit the number of
>>> concurrent processes running those services.
>>>
>>> <thread-pool send-to-pool="pool"
>>>              purge-job-days="4"
>>>              failed-retry-min="3"
>>>              ttl="120000"
>>>              jobs="100"
>>>              min-threads="2"
>>>              max-threads="5"
>>>              wait-millis="1000"
>>>              poll-enabled="true"
>>>              poll-db-millis="30000">
>>>   <run-from-pool name="pool"/>
>>>   <run-from-pool name="dwPool"/>
>>> </thread-pool>
>>
>> That configuration will work. That server will service the two pools.
>
> I forgot to mention: if you're running lots of jobs, then you will want to
> increase the jobs (queue size) value. You mentioned in another thread that
> your application will run up to 10,000 jobs - in that case you should
> increase the jobs value to 1000 or more. The queue size affects memory, so
> there is a trade-off between responsiveness and memory use.

*Thanks for the information - that is very helpful.*

> The potential problem with the Job Poller (before and after the overhaul)
> is with asynchronous service calls (not scheduled jobs). When you run an
> async service, the service engine converts the service call to a job and
> places it in the queue. It is not persisted like scheduled jobs are. If
> the Job Poller has just filled the queue with scheduled jobs, then there
> is no room for async services, and any attempt to queue an async service
> will fail (it throws an "Unable to queue job" exception).
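The queue-full behavior Adrian describes can be illustrated with a plain `java.util.concurrent` bounded thread pool. This is a hypothetical stand-in, not OFBiz code: `QueueFullSketch` and `tryQueue` are names invented for the sketch, and the rejection models the "Unable to queue job" failure, assuming the Job Poller's queue behaves like a bounded in-memory queue (the `jobs` attribute).

```java
import java.util.concurrent.*;

public class QueueFullSketch {

    // Hypothetical stand-in for the service engine trying to queue an async
    // job: returns false when the bounded queue rejects the submission,
    // analogous to the "Unable to queue job" exception.
    static boolean tryQueue(ThreadPoolExecutor pool, Runnable job) {
        try {
            pool.execute(job);
            return true;
        } catch (RejectedExecutionException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // One worker thread and a queue of 2, mirroring small
        // min-threads/jobs values; AbortPolicy rejects when the queue is full.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(2),
                new ThreadPoolExecutor.AbortPolicy());
        Runnable job = () -> {
            try { Thread.sleep(200); } catch (InterruptedException ignored) {}
        };

        System.out.println(tryQueue(pool, job)); // true  (runs immediately)
        System.out.println(tryQueue(pool, job)); // true  (queued 1/2)
        System.out.println(tryQueue(pool, job)); // true  (queued 2/2)
        System.out.println(tryQueue(pool, job)); // false (queue full)
        pool.shutdown();
    }
}
```

The sketch shows why increasing `jobs` (or scheduling no more than queue size minus n jobs at a time) matters: once scheduled jobs fill the queue, async submissions are rejected rather than waiting.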
*I assume the "queue" is an in-memory queue, not the same thing as the
JobSandbox pool stored in the database, which is why there is a limit on
the queue. Let me know if that assumption is not correct. If you run an
async service and set the "persist" option to true, will you still hit the
Job Poller limit, or will the job be persisted and run when the Job Poller
has sufficient resources?*

> I designed the new code so the service engine can check for that
> possibility, but I didn't change the service engine behavior. Instead,
> users should configure their <thread-pool> element(s) and applications
> carefully. For example, if your application schedules lots of jobs, then
> design it in a way that it schedules no more than (queue size - n) jobs
> at a time - to leave room for async services. Another option would be to
> have a server dedicated to servicing scheduled jobs - that way the
> potential clash with async services is not an issue.

*I wasn't aware that the same queue was shared between async jobs and
scheduled jobs - thanks again for the update. We like the idea of
dedicating an app server to service specific scheduled jobs, as it
controls the number of concurrent processes we run in production. I'm
still curious why the service engine dispatcher does not have an API to
run an async service in a specified "pool". This seems like a simple
addition, since there is already an API to schedule a job to run in a
specific pool. I understand there is a chance this could fail if the queue
is full (unless my question above about the persisted job is a possible
workaround).

From the information you provided, here is how we would likely use the new
changes with the service engine and job poller:

Background: Our application is an online testing application with multiple
OFBiz servers and a single OFBiz data warehouse.
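Adrian's "dedicated server" suggestion could be expressed with the same `<thread-pool>` element quoted earlier. The fragment below is a sketch, not a verified configuration: it assumes the data warehouse server sends its own async calls to `dwPool` and pulls only from `dwPool`, with the `jobs` value raised per Adrian's advice.

```
<!-- Hypothetical fragment for a server dedicated to data warehouse jobs:
     it services only the "dwPool" job pool. Attribute values other than
     send-to-pool, jobs, and run-from-pool are copied from the example
     configuration above. -->
<thread-pool send-to-pool="dwPool"
             purge-job-days="4"
             failed-retry-min="3"
             ttl="120000"
             jobs="1000"
             min-threads="2"
             max-threads="5"
             wait-millis="1000"
             poll-enabled="true"
             poll-db-millis="30000">
  <run-from-pool name="dwPool"/>
</thread-pool>
```

The online-testing servers would keep the two-pool configuration shown earlier (or drop `<run-from-pool name="dwPool"/>` entirely), so only the dedicated servers ever execute warehouse jobs.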
Tests are taken on the dedicated app servers, and when a test is done a
data warehouse process picks up the test and processes it for the data
warehouse reports. The reports are near real time, but during heavy
testing periods we want to limit how many concurrent warehouse processes
are running.

Here are the steps in the process:

1. Configure a limited number of OFBiz servers to process scheduled data
warehouse jobs that are submitted to a specific job pool (i.e. dwPool).

2. When a person has completed a test, the application creates a scheduled
job with the current timestamp as the time the service should run. The
scheduled job would be assigned to the "dwPool". The servers configured in
item 1 above would then process these jobs.

The above steps allow us to scale our solution horizontally by adding more
OFBiz servers to handle online testing as needed. We are still able to
handle near-real-time reporting because we have dedicated servers assigned
to process data warehouse requests. During light testing days the
warehouse scheduled jobs process almost immediately, and during heavy
testing days they lag slightly, depending on the service request rate.

Question: If a scheduled job is set with the current timestamp as its
"startTime", but the JobPoller is behind because of a large number of
scheduled service requests, will the JobPoller still pick up the scheduled
job in startTime order? Here is a specific example:

- Current time: Aug. 23, 10:00 AM - a scheduled job is created with a
start time of Aug. 23, 10:00 AM.
- The JobPoller finishes processing its current queue of jobs at Aug. 23,
10:05 AM.
- The JobPoller queries the database for the next list of jobs to process.

Question: Will it pick up the jobs scheduled for Aug. 23, 10:00 AM even
though the current time is past that time?

Thanks in advance for your response.

Brett*
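Brett's timing question can be made concrete with a small simulation. This is a sketch of my understanding (not verified against the OFBiz source): it assumes the poller selects every pending job whose start time is at or before "now", ordered by start time, so a 10:00 job is still picked up at 10:05. `PollerOrderSketch`, `Job`, and `poll` are names invented for the illustration.

```java
import java.time.Instant;
import java.util.*;
import java.util.stream.Collectors;

public class PollerOrderSketch {

    // Minimal stand-in for a persisted job record: a name and a start time.
    static class Job {
        final String name;
        final Instant startTime;
        Job(String name, Instant startTime) {
            this.name = name;
            this.startTime = startTime;
        }
    }

    // Hypothetical model of the poller's database query: select every
    // pending job whose startTime is at or before 'now' (not only jobs
    // whose startTime equals 'now'), ordered by startTime.
    static List<Job> poll(List<Job> pending, Instant now) {
        return pending.stream()
                .filter(j -> !j.startTime.isAfter(now))
                .sorted(Comparator.comparing(j -> j.startTime))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Instant tenAm = Instant.parse("2012-08-23T10:00:00Z");
        List<Job> pending = Arrays.asList(
                new Job("dwJob", tenAm),                  // scheduled for 10:00
                new Job("later", tenAm.plusSeconds(600))  // scheduled for 10:10
        );
        // The poller runs late, at 10:05: the 10:00 job is still selected
        // because its startTime is now in the past; the 10:10 job is not.
        for (Job j : poll(pending, tenAm.plusSeconds(300))) {
            System.out.println(j.name); // prints "dwJob"
        }
    }
}
```

Under this model a backlog delays execution but never skips a job: anything whose start time has already passed is picked up on the next poll, in start-time order.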
