I have one item with this proposal that could use an opinion. I have completed the code for this and tested that it all works, but I am a little torn about where the acquireLock() and releaseLock() methods should go. There are two options ...

1. Put the code in a new Manager called TaskLockManager, as I mentioned in the proposal. The pro is that it's clean and isolated; the con is that it may be extraneous and non-cohesive.

2. Put the code in the existing ThreadManager. The pro here is that we don't need a new Manager class, and it makes a lot of sense from a cohesion point of view. The con is that the ThreadManager then becomes a persistent class, which is a little more confusing since it mixes persistent and non-persistent behavior.

I think right now I prefer #2, but either way is fine.
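For concreteness, here is a rough sketch of what the lock methods might look like under option #2, written in Python as pseudocode for the eventual Java manager. All names are hypothetical, and the in-memory lock map is just a stand-in for wherever the lock state actually lives:

```python
class ThreadManager:
    """Hypothetical sketch: option #2, with the lock methods living
    alongside the existing thread-management behavior."""

    def __init__(self):
        # task name -> lease expiry time; stand-in for persistent storage
        self._locks = {}

    def acquire_lock(self, task_name, lease_seconds, now):
        # grant the lock only if no unexpired lease exists for this task
        expires = self._locks.get(task_name)
        if expires is not None and expires > now:
            return False
        self._locks[task_name] = now + lease_seconds
        return True

    def release_lock(self, task_name):
        # releasing a lock we don't hold is treated as a no-op here
        self._locks.pop(task_name, None)
```

The same two methods would work unchanged in a standalone TaskLockManager; the sketch is only about the API surface, not the placement.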

-- Allen


Allen Gilliland wrote:
Okay, one more time, if anyone is interested ...

http://rollerweblogger.org/wiki/Wiki.jsp?page=Proposal_ClusteredTasksViaLocking

That's what I'm planning to do.

-- Allen


Allen Gilliland wrote:
I didn't hear opinions on this from anyone other than Elias, so I am going to go ahead and plan to make a table for tracking tasks/threads, and set up the new clustered tasks code to consult that table to do locking, etc.

-- Allen


Elias Torres wrote:

Allen Gilliland wrote:

Elias Torres wrote:
Not sure if we need a whole extra table. Wouldn't task.name.propName = value be enough?

Well, that's possible I suppose, but then we are stuffing a lot of information into that single property and forcing ourselves to parse it out. Capturing the various states then becomes a bit more confusing, i.e.

task.planet.refreshEntries = 987523897, 78345897238, 1800 (locked)
task.planet.refreshEntries = 872837348 (unlocked)

That's not what I meant; it was more like:

task.planetRefreshEntries.status = locked
task.planetRefreshEntries[worker1].lastRun = 987523897
...

And that assumes that we wouldn't want to extend that data at all, because if we did want to track some other kind of data then we would be screwed. I was thinking that it may be nice to capture some basic error status from the tasks in the event that they fail for some reason, in which case you may not want to keep running a task if it's failed 3 times in a row. So a task could have a 'Status' attribute that indicates the state of the task: running, paused, error, waiting, etc.
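A minimal sketch of that idea in Python pseudocode; the status values and the 3-failures-in-a-row threshold are the ones mentioned above, and everything else (class and method names) is hypothetical:

```python
# hypothetical status values, per the discussion above
RUNNING, PAUSED, ERROR, WAITING = "running", "paused", "error", "waiting"
MAX_CONSECUTIVE_FAILURES = 3

class TaskStatus:
    """Track consecutive failures and stop scheduling a task after
    3 errors in a row."""

    def __init__(self):
        self.status = WAITING
        self.failures = 0

    def record_success(self):
        # a successful run resets the failure counter
        self.failures = 0
        self.status = WAITING

    def record_failure(self):
        self.failures += 1
        # after 3 failures in a row, flip to the error state
        if self.failures >= MAX_CONSECUTIVE_FAILURES:
            self.status = ERROR
        else:
            self.status = WAITING

    def runnable(self):
        return self.status != ERROR
```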



... but a table is fine too.

-Elias

Another problem I see is that the same node always gets to perform the task, which also brings up the synchronized clock issue again.

Depends on the task, but that would probably be true. I'm not sure why that's an issue though. If we do set up a separate table for tracking the tasks and their state, then it becomes easier to write some SQL that uses the db time rather than the cluster nodes' time.
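As a tiny illustration of treating the database as the single clock for all cluster members (sqlite here is only a stand-in for whatever database Roller is actually configured with, and the function name is made up):

```python
import sqlite3

def get_db_time(conn):
    # read "now" from the database itself, so every cluster member
    # shares one clock regardless of local clock drift
    return int(conn.execute("SELECT strftime('%s', 'now')").fetchone()[0])

conn = sqlite3.connect(":memory:")
db_now = get_db_time(conn)  # unix epoch seconds, per the db's clock
```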

Doing that under the old proposal kind of defeats the purpose, because the point of the old proposal was to implement this in a way that didn't require backend work, since we were just using a simple runtime property. But if we are going to do work on the backend, then we should probably just build out a more complete solution.

-- Allen


-Elias

Allen Gilliland wrote:
I had thought about this a little more, and one thing I was planning to neglect because of time is the fact that simply using a lock doesn't prevent a task from being run too frequently. I.e., if a task is meant to run once an hour and you have 2 members in your cluster, they may have been started 30 minutes apart from each other, so the end result is that the task runs every 30 minutes.

Perhaps a better solution would be to create a simple "tasks" table which maintains one row per task and keeps track of ...

1. when the task was last run
2. if the task is currently running (locked)
3. how long the current running task is leased for
4. ?? possible others (how it was triggered, etc)

This would work much better at synchronizing the running of the tasks, so that even if you had 10 cluster members, the task itself is still only run at the correct interval.
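A hypothetical sketch of such a table and the "is it due?" check, using Python with sqlite as a stand-in for the real database; the table name, column names, and helper function are all invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE roller_tasks (
        name          TEXT PRIMARY KEY,   -- task name
        last_run      INTEGER,            -- 1. when the task was last run (epoch secs)
        is_locked     INTEGER DEFAULT 0,  -- 2. if the task is currently running
        lease_expires INTEGER,            -- 3. how long the current run is leased for
        triggered_by  TEXT                -- 4. how it was triggered, etc.
    )
""")
conn.execute("INSERT INTO roller_tasks (name) VALUES ('refreshEntries')")

def due(conn, name, interval_secs, now):
    # a task is due only if it isn't locked AND at least one full
    # interval has elapsed since the last run, cluster-wide
    last_run, locked = conn.execute(
        "SELECT last_run, is_locked FROM roller_tasks WHERE name = ?",
        (name,),
    ).fetchone()
    return not locked and (last_run is None or now - last_run >= interval_secs)
```

Because last_run lives in one shared row, a second cluster member that wakes up 30 minutes after the first sees the recent run and skips it, which is the behavior described above.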

Should I update the proposal to work this way?

-- Allen


Anil Gangolli wrote:
The backend db-specific SQL required may be worthwhile.

Synchronization can't be assumed to be perfect, so if you do use the webapp host's time, you will want to delay lease grabbing until after a grace period that is larger than the maximum assumed clock difference between the hosts (or, conversely, require the lease holder to renew the lease whenever the remaining lease time drops below that threshold).
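That grace-period rule could be sketched like this (hypothetical Python; the 30-second skew bound and function names are arbitrary examples, not anything decided here):

```python
MAX_CLOCK_SKEW = 30  # assumed max clock difference between hosts, in seconds

def safe_to_grab(lease_expires, now):
    # only grab an expired lease after a grace period larger than the
    # maximum assumed clock skew between hosts
    return now >= lease_expires + MAX_CLOCK_SKEW

def should_renew(lease_expires, now):
    # conversely, the current holder renews once its remaining lease
    # time is lower than the skew threshold
    return (lease_expires - now) < MAX_CLOCK_SKEW
```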

--a.


----- Original Message ----- From: "Allen Gilliland"
<[EMAIL PROTECTED]>
To: <[email protected]>
Sent: Thursday, September 21, 2006 10:40 AM
Subject: Re: Proposal: Clustered Tasks via Locking


I was actually thinking that a lazier approach of "that's the sysadmin's job" was preferable. Setting up a way to get the current time from the db requires backend work, and I would prefer not to do that if we can avoid it. My expectation is that anyone who has a large enough installation to need 2+ servers working in a cluster should also be able to make sure that each cluster member has synchronized time. Most other clustering software has that same expectation.

-- Allen


Elias Torres wrote:
One more thing... I think you'd need to come up with a SQL query that tests for the lock using the database server's time and not the app server's time. Basically a SELECT and INSERT/UPDATE together, using CURRENT_TIME() or whatever, to make sure we don't run into clock drift.
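Something along those lines might look like the following sketch (Python with sqlite standing in for the real database; table and column names are invented). The test-and-grab happens in a single UPDATE whose WHERE clause consults the database's own clock, and the row count tells the caller whether it won the lock:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE task_locks (
        name          TEXT PRIMARY KEY,
        client        TEXT,
        lease_expires INTEGER DEFAULT 0   -- epoch secs, per the db's clock
    )
""")
conn.execute("INSERT INTO task_locks (name) VALUES ('indexing')")

def try_acquire(conn, task, client, lease_secs):
    # atomic test-and-set: the lease expiry check and the grab happen in
    # one statement, and both use the database's clock, not the node's
    cur = conn.execute(
        """UPDATE task_locks
           SET client = ?,
               lease_expires = CAST(strftime('%s','now') AS INTEGER) + ?
           WHERE name = ?
             AND lease_expires < CAST(strftime('%s','now') AS INTEGER)""",
        (client, lease_secs, task),
    )
    return cur.rowcount == 1
```

With this shape, two nodes racing for the same task can't both win: whichever UPDATE commits first leaves an unexpired lease that makes the other's WHERE clause match zero rows.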

-Elias

Allen Gilliland wrote:
Elias Torres wrote:
I like the proposal and I think it's very important/useful.

I would suggest, though, not using a hard-coded expiration mechanism, and instead using a leasing mechanism. I propose that a task says it needs the lock for X number of minutes/hours and writes the time it started plus the lease amount. It's just a subtle tweak, but it optimizes the scheduling a bit, so a quick task like saving referrers can get a 3-min lease and not block 3 hours of thread time. Additionally, tasks could store their name, so parallel tasks can work without blocking each other and only tasks with the same service name wait on each other. Obviously, a task can extend its lease if it needs to run for more time.

For example, let's store this as the property: task.indexing value: 12:00:01,3mins
Yep, I can do it that way. I guess I consider this to be the same thing, because the lease time for a given task is not likely to ever change, so if the task knows what the lease time is for its lock then there is no reason the lease needs to be in the db. Obviously, if the lease time may vary for a given lock, then your approach makes a lot more sense.

Either way will work, but yours is slightly more flexible, so I'll do it that way. For the actual property I am going to simplify the value so that it's just long<time>, long<lease>.

So if you see the lease in the db it would be property: task.indexing value: 983472893, 1800

This way it's just easier for the application to use the values without actually having to worry about parsing date strings.
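A quick sketch of consuming that long<time>, long<lease> value format (the helper names are hypothetical; note there is no date-string parsing, just two integers):

```python
def parse_lease(value):
    # split "983472893, 1800" into (start_time, lease_seconds)
    start, lease = (int(part.strip()) for part in value.split(","))
    return start, lease

def lease_expired(value, now):
    start, lease = parse_lease(value)
    return now >= start + lease
```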

Thanks for the suggestion.

-- Allen


In other words, let's re-invent JINI.

-Elias

Allen Gilliland wrote:
This is a really short one, but I did a proposal anyway. I'd like to add a simple locking mechanism to the various background tasks that we have, so that running them in clustered environments is safe from synchronization issues and we can prevent a task from running at the same time on multiple machines in the cluster.

http://rollerweblogger.org/wiki/Wiki.jsp?page=Proposal_ClusteredTasksViaLocking

Since this is such a short proposal, I'd like to go ahead and propose a vote on it as is, since I don't expect there is a need for lots of discussion. This would go into Roller 3.1.

-- Allen
