Not sure if we need a whole extra table. Wouldn't task.name.propName =
value be enough?

Another problem I see is that the same node always gets to
perform the task, which also brings back the synchronized clock issue.

-Elias

Allen Gilliland wrote:
> I had thought about this a little more. One thing I was planning
> to neglect because of time is the fact that simply using a lock doesn't
> prevent a task from being run too frequently, i.e. if a task is meant
> to run once an hour and you have 2 members in your cluster, they may have
> been started 30 mins apart from each other, so the end result is that
> the task runs every 30 mins.
> 
> Perhaps a better solution to this would be to actually create a simple
> "tasks" table which can maintain one row per task and keep track of ...
> 
> 1. when the task was last run
> 2. if the task is currently running (locked)
> 3. how long the current running task is leased for
> 4. ?? possible others (how it was triggered, etc)
> 
> This would actually work much better at synchronizing the running of the
> tasks so that if you had 10 cluster members running the task itself is
> still only run at the correct interval.
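The "tasks" table idea above can be sketched in a few lines. This is purely illustrative (Roller itself is Java; sqlite3 just makes the idea runnable here), and the table/column names are assumptions, not the proposal's actual schema. The key point is that the interval check and the lock grab happen in one conditional UPDATE, so with any number of cluster members exactly one wins and the task still runs only once per interval:

```python
import sqlite3
import time

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE tasks (
    name        TEXT PRIMARY KEY,
    last_run    REAL,   -- epoch seconds of the last successful run
    lease_until REAL    -- epoch seconds; in the future while locked
)""")
db.execute("INSERT INTO tasks VALUES ('indexing', 0, 0)")

def try_acquire(name, interval, lease, now):
    """Grab the task only if the interval has elapsed AND nobody holds a lease.
    The WHERE clause makes the test-and-set atomic: of N members racing,
    only one UPDATE matches the row."""
    cur = db.execute(
        "UPDATE tasks SET last_run = ?, lease_until = ? "
        "WHERE name = ? AND lease_until < ? AND last_run <= ? - ?",
        (now, now + lease, name, now, now, interval))
    db.commit()
    return cur.rowcount == 1

now = time.time()
print(try_acquire("indexing", 3600, 180, now))         # True: first member wins
print(try_acquire("indexing", 3600, 180, now + 1800))  # False: only 30 min elapsed
```

A member started 30 minutes after another no longer re-runs the hourly task, which is exactly the failure mode described above.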
> 
> Should I update the proposal to work this way?
> 
> -- Allen
> 
> 
> Anil Gangolli wrote:
>>
>> The db-specific SQL required on the backend may be worthwhile.
>>
>> Synchronization can't be assumed to be perfect, so if you do use the
>> webapp host's time, you will want to delay lease grabbing until after
>> a grace period that is larger than the maximum (assumed) clock difference
>> between the hosts (or, conversely, require an extending renewal by the
>> lease holder if the remaining lease time is lower than that).
>>
>> --a.
>>
>>
>> ----- Original Message ----- From: "Allen Gilliland"
>> <[EMAIL PROTECTED]>
>> To: <[email protected]>
>> Sent: Thursday, September 21, 2006 10:40 AM
>> Subject: Re: Proposal: Clustered Tasks via Locking
>>
>>
>>> I was actually thinking that a lazier approach of "that's the sys
>>> admin's job" was preferable.  Setting up a way to get the current
>>> time from the db requires backend work and I would prefer not to do
>>> that if we can avoid it.  My expectation is that anyone who has a
>>> large enough installation to need 2+ servers working in a cluster
>>> should also be able to make sure that each cluster member has
>>> synchronized time.  Most other clustering software has that same
>>> expectation.
>>>
>>> -- Allen
>>>
>>>
>>> Elias Torres wrote:
>>>> One more thing... I think you'd need to come up with a SQL query that
>>>> tests for the lock using the database server's time and not the app
>>>> server's time. Basically a SELECT and INSERT/UPDATE together using
>>>> CURRENT_TIME()
>>>> or whatever to make sure we don't run into clock drifts.
>>>>
>>>> -Elias
>>>>
>>>> Allen Gilliland wrote:
>>>>>
>>>>> Elias Torres wrote:
>>>>>> I like the proposal and I think it's very important/useful.
>>>>>>
>>>>>> I would suggest though to not use a hard-coded expiration
>>>>>> mechanism and
>>>>>> instead use a leasing mechanism. I propose that a task says it
>>>>>> needs the
>>>>>> lock for X number of minutes/hours and writes the time it started and
>>>>>> the lease amount. It's just a subtle tweak, but it optimizes the
>>>>>> scheduling a bit, so a quick task like saving referrers can get a
>>>>>> 3-min
>>>>>> lease and not block 3 hours of thread time. Additionally, they could
>>>>>> store the name of the task, so parallel tasks can work w/o
>>>>>> blocking each
>>>>>> other and only tasks with the same service name wait on each other.
>>>>>> Obviously, a task can extend its lease if it needs to run for more
>>>>>> time.
>>>>>>
>>>>>> For example, let's store this as the
>>>>>> property: task.indexing value: 12:00:01,3mins
>>>>> Yep, I can do it that way.  I guess I consider this to be the same
>>>>> thing
>>>>> because the lease time for a given task is not likely to ever
>>>>> change, so
>>>>> if the task knows what the lease time is for its lock then there is no
>>>>> reason the lease needs to be in the db.  Obviously if the lease
>>>>> time may
>>>>> vary for a given lock then your approach makes a lot more sense.
>>>>>
>>>>> Either way will work, but yours is slightly more flexible so I'll
>>>>> do it
>>>>> that way.  For the actual property I am going to simplify the value so
>>>>> that it's just long<time>, long<lease>.
>>>>>
>>>>> So if you see the lease in the db it would be
>>>>> property: task.indexing value: 983472893, 1800
>>>>>
>>>>> This way it's just easier for the application to use the values
>>>>> without
>>>>> actually having to worry about parsing date strings.
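The `long<time>, long<lease>` value described above really is trivial to consume; a two-line sketch of how a member might read it back (illustrative, not actual Roller code):

```python
raw = "983472893, 1800"          # e.g. the value stored under property task.indexing
start, lease = (int(p) for p in raw.split(","))  # int() tolerates the space
expires = start + lease          # when the lease runs out, same units throughout
print(start, lease, expires)     # 983472893 1800 983474693
```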
>>>>>
>>>>> Thanks for the suggestion.
>>>>>
>>>>> -- Allen
>>>>>
>>>>>
>>>>>> In other words, let's re-invent JINI.
>>>>>>
>>>>>> -Elias
>>>>>>
>>>>>> Allen Gilliland wrote:
>>>>>>> This is a really short one, but I did a proposal anyways.  I'd
>>>>>>> like to
>>>>>>> add a simple locking mechanism to the various background tasks
>>>>>>> that we
>>>>>>> have so that running them in clustered environments is safe from
>>>>>>> synchronization issues and we can prevent a task from running at the
>>>>>>> same time on multiple machines in the cluster.
>>>>>>>
>>>>>>> http://rollerweblogger.org/wiki/Wiki.jsp?page=Proposal_ClusteredTasksViaLocking
>>>>>>>
>>>>>>> Since this is such a short proposal I'd like to go ahead and
>>>>>>> propose a
>>>>>>> vote on the proposal as is, since I don't expect there is a need for
>>>>>>> lots of discussion.  This would go into Roller 3.1.
>>>>>>>
>>>>>>> -- Allen
>>>>>>>
>>>
>>
> 
