*Adrian, thanks for the information. Please see my questions inline:*
On Thu, Aug 23, 2012 at 6:24 AM, Adrian Crum <[email protected]> wrote:

> On 8/23/2012 8:46 AM, Adrian Crum wrote:
>
>> On 8/22/2012 7:04 PM, Brett Palmer wrote:
>>
>>> We need this functionality for our data warehouse processing. We try to
>>> provide real-time reports, but our database cannot handle a high number
>>> of data warehouse updates during heavy loads. By configuring only one
>>> server to service a particular pool, we can limit the number of
>>> concurrent processes running those services.
>>>
>>> <thread-pool send-to-pool="pool"
>>>              purge-job-days="4"
>>>              failed-retry-min="3"
>>>              ttl="120000"
>>>              jobs="100"
>>>              min-threads="2"
>>>              max-threads="5"
>>>              wait-millis="1000"
>>>              poll-enabled="true"
>>>              poll-db-millis="30000">
>>>   <run-from-pool name="pool"/>
>>>   <run-from-pool name="dwPool"/>
>>> </thread-pool>
>>
>> That configuration will work. That server will service the two pools.
>
> I forgot to mention: if you're running lots of jobs, then you will want to
> increase the jobs (queue size) value. You mentioned in another thread that
> your application will run up to 10,000 jobs - in that case you should
> increase the jobs value to 1000 or more. The queue size affects memory, so
> there is a trade-off between responsiveness and memory use.

*Thanks for the information - that is very helpful.*

> The potential problem with the Job Poller (before and after the overhaul)
> is with asynchronous service calls (not scheduled jobs). When you run an
> async service, the service engine converts the service call to a job and
> places it in the queue. It is not persisted like scheduled jobs are. If
> the Job Poller has just filled the queue with scheduled jobs, then there
> is no room for async services, and any attempt to queue an async service
> will fail (it throws an "Unable to queue job" exception).
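The queue-full behavior Adrian describes can be illustrated with a plain `java.util.concurrent` bounded thread pool. This is a hypothetical stand-in, not OFBiz code: `QueueFullSketch` and `tryQueue` are names invented for the sketch, and the rejection models the "Unable to queue job" failure, assuming the Job Poller's queue behaves like a bounded in-memory queue (the `jobs` attribute).

```java
import java.util.concurrent.*;

public class QueueFullSketch {

    // Hypothetical stand-in for the service engine trying to queue an async
    // job: returns false when the bounded queue rejects the submission,
    // analogous to the "Unable to queue job" exception.
    static boolean tryQueue(ThreadPoolExecutor pool, Runnable job) {
        try {
            pool.execute(job);
            return true;
        } catch (RejectedExecutionException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // One worker thread and a queue of 2, mirroring small
        // min-threads/jobs values; AbortPolicy rejects when the queue is full.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(2),
                new ThreadPoolExecutor.AbortPolicy());
        Runnable job = () -> {
            try { Thread.sleep(200); } catch (InterruptedException ignored) {}
        };

        System.out.println(tryQueue(pool, job)); // true  (runs immediately)
        System.out.println(tryQueue(pool, job)); // true  (queued 1/2)
        System.out.println(tryQueue(pool, job)); // true  (queued 2/2)
        System.out.println(tryQueue(pool, job)); // false (queue full)
        pool.shutdown();
    }
}
```

The sketch shows why increasing `jobs` (or scheduling no more than queue size minus n jobs at a time) matters: once scheduled jobs fill the queue, async submissions are rejected rather than waiting.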
*I assume the "queue" is an in-memory queue, not the same thing as the
JobSandbox pool stored in the database, which is why there is a limit on
the queue. Let me know if that assumption is not correct. If you run an
async service and set the "persist" option to true, will you still hit the
Job Poller limit, or will the job be persisted and run when the Job Poller
has sufficient resources?*

> I designed the new code so the service engine can check for that
> possibility, but I didn't change the service engine behavior. Instead,
> users should configure their <thread-pool> element(s) and applications
> carefully. For example, if your application schedules lots of jobs, then
> design it in a way that it schedules no more than (queue size - n) jobs
> at a time - to leave room for async services. Another option would be to
> have a server dedicated to servicing scheduled jobs - that way the
> potential clash with async services is not an issue.

*I wasn't aware that the same queue was shared between async jobs and
scheduled jobs - thanks again for the update. We like the idea of
dedicating an app server to service specific scheduled jobs, as it
controls the number of concurrent processes we run in production. I'm
still curious why the service engine dispatcher does not have an API to
run an async service in a specified "pool". This seems like a simple
addition, since there is already an API to schedule a job to run in a
specific pool. I understand there is a chance this could fail if the queue
is full (unless my question above about the persisted job is a possible
workaround).

From the information you provided, here is how we would likely use the new
changes with the service engine and job poller:

Background: Our application is an online testing application with multiple
OFBiz servers and a single OFBiz data warehouse.
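Adrian's "dedicated server" suggestion could be expressed with the same `<thread-pool>` element quoted earlier. The fragment below is a sketch, not a verified configuration: it assumes the data warehouse server sends its own async calls to `dwPool` and pulls only from `dwPool`, with the `jobs` value raised per Adrian's advice.

```
<!-- Hypothetical fragment for a server dedicated to data warehouse jobs:
     it services only the "dwPool" job pool. Attribute values other than
     send-to-pool, jobs, and run-from-pool are copied from the example
     configuration above. -->
<thread-pool send-to-pool="dwPool"
             purge-job-days="4"
             failed-retry-min="3"
             ttl="120000"
             jobs="1000"
             min-threads="2"
             max-threads="5"
             wait-millis="1000"
             poll-enabled="true"
             poll-db-millis="30000">
  <run-from-pool name="dwPool"/>
</thread-pool>
```

The online-testing servers would keep the two-pool configuration shown earlier (or drop `<run-from-pool name="dwPool"/>` entirely), so only the dedicated servers ever execute warehouse jobs.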
Tests are taken on the dedicated app servers, and when a test is done a
data warehouse process picks up the test and processes it for the data
warehouse reports. The reports are near real time, but during heavy
testing periods we want to limit how many concurrent warehouse processes
are running.

Here are the steps in the process:

1. Configure a limited number of OFBiz servers to process scheduled data
warehouse jobs that are submitted to a specific job pool (i.e. dwPool).

2. When a person has completed a test, the application creates a scheduled
job with the current timestamp as the time the service should run. The
scheduled job would be assigned to the "dwPool". The servers configured in
item 1 above would then process these jobs.

The above steps allow us to scale our solution horizontally by adding more
OFBiz servers to handle online testing as needed. We are still able to
handle near-real-time reporting because we have dedicated servers assigned
to process data warehouse requests. During light testing days the
warehouse scheduled jobs process almost immediately, and during heavy
testing days they lag slightly, depending on the service request rate.

Question: If a scheduled job is set with the current timestamp as its
"startTime", but the JobPoller is behind because of a large number of
scheduled service requests, will the JobPoller still pick up the scheduled
job in startTime order? Here is a specific example:

- Current time: Aug. 23, 10:00 AM - a scheduled job is created with a
start time of Aug. 23, 10:00 AM.
- The JobPoller finishes processing its current queue of jobs at Aug. 23,
10:05 AM.
- The JobPoller queries the database for the next list of jobs to process.

Question: Will it pick up the jobs scheduled for Aug. 23, 10:00 AM even
though the current time is past that time?

Thanks in advance for your response.

Brett*
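Brett's timing question can be made concrete with a small simulation. This is a sketch of my understanding (not verified against the OFBiz source): it assumes the poller selects every pending job whose start time is at or before "now", ordered by start time, so a 10:00 job is still picked up at 10:05. `PollerOrderSketch`, `Job`, and `poll` are names invented for the illustration.

```java
import java.time.Instant;
import java.util.*;
import java.util.stream.Collectors;

public class PollerOrderSketch {

    // Minimal stand-in for a persisted job record: a name and a start time.
    static class Job {
        final String name;
        final Instant startTime;
        Job(String name, Instant startTime) {
            this.name = name;
            this.startTime = startTime;
        }
    }

    // Hypothetical model of the poller's database query: select every
    // pending job whose startTime is at or before 'now' (not only jobs
    // whose startTime equals 'now'), ordered by startTime.
    static List<Job> poll(List<Job> pending, Instant now) {
        return pending.stream()
                .filter(j -> !j.startTime.isAfter(now))
                .sorted(Comparator.comparing(j -> j.startTime))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Instant tenAm = Instant.parse("2012-08-23T10:00:00Z");
        List<Job> pending = Arrays.asList(
                new Job("dwJob", tenAm),                  // scheduled for 10:00
                new Job("later", tenAm.plusSeconds(600))  // scheduled for 10:10
        );
        // The poller runs late, at 10:05: the 10:00 job is still selected
        // because its startTime is now in the past; the 10:10 job is not.
        for (Job j : poll(pending, tenAm.plusSeconds(300))) {
            System.out.println(j.name); // prints "dwJob"
        }
    }
}
```

Under this model a backlog delays execution but never skips a job: anything whose start time has already passed is picked up on the next poll, in start-time order.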
