Hi,

I still think it would be nice to have a simple mechanism for saying "Just don't start here".

I'm skeptical about a.) and b.) because they would introduce a database binding to DeltaSpike, which I think may make it hard to use in some stacks (think of projects running on NoSQL databases). In addition, I think something similar to what you proposed in a.) can already be achieved by running Quartz in clustered mode, though I never tried that.

What I would propose is some pluggable bean (via Alternative, Specializes or the like) with two methods:

boolean isSchedulerEnabled();

boolean shouldJobBeStarted(Class<?> jobClass);

The default implementation would return true for both. Any Alternative could then return false from isSchedulerEnabled() to disable the scheduler entirely (lowering overall overhead in a scenario like mine), or do something smart in shouldJobBeStarted() to determine at runtime whether a specific job should be run on the current node (which should accommodate your use case).
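A rough sketch of what I mean, in plain Java (the names are of course made up, this is not an existing DeltaSpike API):

```java
// Hypothetical SPI for the proposal above; none of these names exist in
// DeltaSpike today. In a real CDI deployment the default implementation
// would be @ApplicationScoped and replaceable via @Alternative/@Specializes.
interface SchedulerNodePolicy {
    boolean isSchedulerEnabled();
    boolean shouldJobBeStarted(Class<?> jobClass);
}

// Default behaviour: scheduler on, every job runs on every node.
class DefaultSchedulerNodePolicy implements SchedulerNodePolicy {
    @Override
    public boolean isSchedulerEnabled() {
        return true;
    }

    @Override
    public boolean shouldJobBeStarted(Class<?> jobClass) {
        return true;
    }
}
```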

What do you think?

Cheers,

Juri

On 1/30/19 9:13 AM, Mark Struberg wrote:
Hi folks!

Yes, that solution works.

I had the same problem (multiple nodes, code should only run once).
Pinning to one cluster node is a possible solution, but if that one node is 
down then the processing stops.
I went for another solution. I wrote an Interceptor which basically guards 
executions via a central DB.

There are a few different strategies:
a.) Keep a pessimistic lock on a row in a DB. One row per class or locking key. 
(The DB must support row locking.)
  Pro: easy to implement. Most of the work is done by the database.
  Con: If the whole thread gets stuck then you're out of luck. One is also 
bound to the maximum transaction timeout.
   So if you want to have a task running for one hour (doing e.g. multiple 
transaction batches) you cannot use this strategy.
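For what it's worth, strategy a.) boils down to something like this in SQL (table and column names are made up, just a sketch):

```sql
-- One lock row per task / locking key, created up front:
CREATE TABLE TASK_LOCK (LOCK_KEY VARCHAR(64) PRIMARY KEY);
INSERT INTO TASK_LOCK (LOCK_KEY) VALUES ('ReindexTask');

-- Inside the guarding interceptor's transaction; this blocks while
-- another node holds the row, and the lock is released on COMMIT/ROLLBACK:
SELECT LOCK_KEY FROM TASK_LOCK WHERE LOCK_KEY = 'ReindexTask' FOR UPDATE;
-- ... run the job, then COMMIT.
```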

b.) Have a 'watchdog' table which remembers the node and whether it is active. 
If it has not been updated for 2 minutes, the task blew up and another node may 
take over.
  Pro: sufficiently easy to implement. A background watchdog thread 
updates the 'lastActive' timestamp in the DB.
  Con: It takes a while for another node to pick up the work - a minimum of 2 
minutes. We also need a clear 'nodeId'. That might be the IP, but if multiple 
JVMs run on the same box then you need some custom identifier. The JVM id would 
be a candidate as it is unique. On the other hand, a restart would mean a 
2-minute pause.
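The takeover rule of b.) can be sketched in plain Java (a map stands in for the watchdog table here; the real version would use an atomic compare-and-set UPDATE against the DB, and all names are made up):

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of strategy b.): one watchdog entry per task remembers which node
// owns it and when that node last reported in. A node may take a task over
// once the entry is older than the timeout.
class Watchdog {
    static final Duration TIMEOUT = Duration.ofMinutes(2);

    static final class Entry {
        final String nodeId;
        final Instant lastActive;

        Entry(String nodeId, Instant lastActive) {
            this.nodeId = nodeId;
            this.lastActive = lastActive;
        }
    }

    final Map<String, Entry> table = new ConcurrentHashMap<>();

    // Try to claim the task for this node; true means "we own it, run it".
    boolean tryClaim(String task, String nodeId, Instant now) {
        Entry cur = table.get(task);
        if (cur == null || cur.nodeId.equals(nodeId)
                || cur.lastActive.plus(TIMEOUT).isBefore(now)) {
            table.put(task, new Entry(nodeId, now)); // heartbeat or takeover
            return true;
        }
        return false; // another node is alive and owns the task
    }
}
```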

c.) No database at all, but network-based locking. Might be easier or harder to 
set up depending on your infrastructure and security measures.

Should we implement any of these in DeltaSpike?
Which one makes more sense?
(I personally went for option b for my use cases)

LieGrue,
strub


Am 28.01.2019 um 13:28 schrieb Juri Berlanda <[email protected]>:

Hi,

just for completeness' sake: I was able to achieve what I wanted using a custom 
config source. I presume this is not how it is meant to be used, but it works:

import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

import org.apache.deltaspike.core.spi.config.ConfigSource;

public class SchedulerConfigSource implements ConfigSource {
    private static final String CLASS_DEACTIVATOR_KEY =
            "org.apache.deltaspike.core.spi.activation.ClassDeactivator";
    private static final String CLASS_DEACTIVATOR_VALUE =
            "org.apache.deltaspike.core.impl.activation.DefaultClassDeactivator";
    private static final String SCHEDULER_DISABLED_KEY =
            "deactivate.org.apache.deltaspike.scheduler.impl.SchedulerExtension";

    private final int ordinal;

    SchedulerConfigSource(int ordinal) {
        this.ordinal = ordinal;
    }

    @Override
    public int getOrdinal() {
        return ordinal;
    }

    @Override
    public Map<String, String> getProperties() {
        return Stream.of(CLASS_DEACTIVATOR_KEY, SCHEDULER_DISABLED_KEY)
                .collect(Collectors.toMap(Function.identity(), this::getPropertyValue));
    }

    @Override
    public String getPropertyValue(String key) {
        if (CLASS_DEACTIVATOR_KEY.equals(key))
            return CLASS_DEACTIVATOR_VALUE;
        if (SCHEDULER_DISABLED_KEY.equals(key))
            return Boolean.toString(!isSchedulerNode());
        return null;
    }

    private boolean isSchedulerNode() {
        // Evaluate the condition here
        return true; // placeholder so the class compiles
    }

    @Override
    public String getConfigName() {
        return "SchedulerConfigSource";
    }

    @Override
    public boolean isScannable() {
        return false;
    }
}

Thought I might as well share it in case somebody else needs it. And don't forget 
to register the ConfigSource.
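(For anyone trying this: as far as I understand, registration goes through the ServiceLoader mechanism - a ConfigSourceProvider implementation that returns the source above, announced in a service file. The provider class name here is made up:)

```
# src/main/resources/META-INF/services/org.apache.deltaspike.core.spi.config.ConfigSourceProvider
com.example.SchedulerConfigSourceProvider
```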

I presume there is a better way to achieve this. If you know of one, please let 
me know.

Cheers,

Juri

On 1/25/19 3:56 PM, Juri Berlanda wrote:
I was able to achieve something similar with DeltaSpike's own Deactivatable. It 
does work, i.e. I can set:

org.apache.deltaspike.core.spi.activation.ClassDeactivator=org.apache.deltaspike.core.impl.activation.DefaultClassDeactivator
deactivate.org.apache.deltaspike.scheduler.impl.SchedulerExtension=true

in my configs, and Scheduler stays off. But - as mentioned - I need this to be 
evaluated on system startup, not at Config level. So I tried implementing my 
own SchedulerExtension like:

public class MySchedulerExtension extends SchedulerExtension {
     @Override
     protected void init(@Observes BeforeBeanDiscovery beforeBeanDiscovery) {
         if (isSchedulerNode())
             super.init(beforeBeanDiscovery);
     }
}

I can register it via SPI, and from the logs I see it is indeed initialized 
while SchedulerExtension is not. Victory, right? Not quite...

I am testing this with OpenWebBeans 2.0.6, and I face the problem that CDI now 
complains about an ambiguous bean for SchedulerExtension in SchedulerProducer 
(I can see where that comes from), but I am just not able to exclude 
SchedulerProducer - I wouldn't even need it. I tried various combinations of 
@Specializes, @Alternative and <scan><exclude .../></scan>, but none of them 
seem to work. I guess the reason for it is the CDI 1.0 beans.xml in 
scheduler-impl? Can anybody confirm? Would it be possible to move to a higher 
version - at least 1.1, to get support for <scan><exclude .../></scan>?
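(For reference, the exclude variant I tried looks roughly like this - a CDI 1.1 style beans.xml, which presumably has no effect as long as the module declares version 1.0:)

```xml
<beans xmlns="http://xmlns.jcp.org/xml/ns/javaee"
       version="1.1" bean-discovery-mode="all">
    <scan>
        <exclude name="org.apache.deltaspike.scheduler.impl.SchedulerProducer"/>
    </scan>
</beans>
```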

This leads me to the assumption that scheduler-impl's SchedulerExtension is 
just not extensible at the moment. Or did anybody succeed in such an endeavor?

Since I do not want to patch the implementation, my next idea is to implement 
a custom ConfigSource which evaluates isSchedulerNode() and sets 
deactivate.org.apache.deltaspike.scheduler.impl.SchedulerExtension accordingly. 
Does that make sense?

Kind regards,

Juri

On 1/24/19 9:04 PM, Alex Roytman wrote:
In my case I need to be able to turn it on/off on demand, and I only have a
couple of daily tasks, so for me it was good enough.
If you just need to do it on startup by node type, you could bind it to a
property:

@Scheduled(cronExpression = "{reindex.schedule}")
public class ReindexTask implements org.quartz.Job {
...

and that property could probably be a cron expression which never fires on
all of your nodes but the scheduler node.
Not nice, but the whole thing is rather static - admittedly I did not dig
very deep.


On Thu, Jan 24, 2019 at 2:44 PM Juri Berlanda <[email protected]>
wrote:

Thanks for the quick reply. I thought about that, but I don't like this
solution, since it involves too much boilerplate for my taste. In
addition, I find it cumbersome having to add these 2 lines to every
single task. I also thought about having an abstract base class for this
purpose, but I'm not happy with that solution either...

In short: I hoped for a cleaner solution.

On 1/24/19 7:03 PM, Alex Roytman wrote:
Let the scheduler run and execute your task, but inside the task itself
check if you want to execute your logic or short-circuit it to a no-op.
Since you do not run it often, it should not be much overhead, and it will
let you fail over to any node to execute it, as long as you have a
mechanism to lock on something and record the execution result to avoid
simultaneous or double execution.
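That pattern could look roughly like this (the shouldRunHere() check stands for whatever locking/recording mechanism you choose; all names here are made up):

```java
// Sketch: the job fires on every node, but the body short-circuits to a
// no-op unless this node wins the guard check. shouldRunHere() is left
// abstract - it would e.g. grab a DB lock and check the execution record.
abstract class GuardedTask {
    // True only on the node that should actually do the work.
    abstract boolean shouldRunHere();

    // The real job logic.
    abstract void doWork();

    // Returns true if the work ran on this node, false if it was a no-op.
    final boolean execute() {
        if (!shouldRunHere()) {
            return false; // no-op on all other nodes
        }
        doWork();
        return true;
    }
}
```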

On Thu, Jan 24, 2019, 12:37 PM Juri Berlanda <[email protected]>
wrote:

Hi,

I am currently trying to implement scheduled jobs using DeltaSpike's
Scheduler module, and I really like how little boilerplate I need for
getting it up and running.

Our application runs on multiple nodes, but the tasks are very
inexpensive, run only once a day, and I don't need failover - if they
fail once and succeed the day after, that's totally fine. Therefore I'd
like to avoid setting up Quartz in clustered mode. But I still want the
jobs to only run once. So my idea was to restrict the execution of the
jobs to a single scheduler node.

So my question is: Is it possible to somehow hook into the Scheduler
module to say something like:

if (isSchedulerNode())
      startScheduler();
else
      doNothing();

It would be perfect if said isSchedulerNode() could be evaluated on
system startup (e.g. acquire a distributed lock) and would not rely on
static values (e.g. config files, environment variables, etc.).

I can see how this is a bad idea in general (no load-balancing, no
failover) and I do have some ideas on how I would implement that. But
for these jobs I just don't care about any of this, so I'd like to avoid
having to set up a whole lot of infrastructure around my application
just to see this working.

Is there a way to achieve this without patching
deltaspike-scheduler-module-impl?

Kind regards,

Juri

