Elias Torres wrote:
On 11/22/06, Allen Gilliland <[EMAIL PROTECTED]> wrote:


Elias Torres wrote:
> Folks,
>
> I have been doing some work on clustered search (more on future email)
> and I begun looking at the task lock recently implemented in 3.1. I
> believe (and have verified with Allen) that we might need to make some
> changes to both the API and implementation so it works properly on a
> cluster of size 2 or more. Currently I think we can have a couple of
> places where two nodes can obtain a lock at the same time.
>
> I'm working on some pseudo-code you all can review to make sure we are
> safe from these issues above. I also want to refactor the interfaces to
> reflect more the use of leases as opposed to locks.
>
> interface LeaseManager {
>
>   registerLease(String name, String id, long leaseTime);
>   renewLease(String name, String id);
>   unregisterLease(String name, String id);
>
> }

I don't mind any of these changes if it's going to improve the
implementation, but now that I look at it I don't see how this is all
that different from what we have now ...

acquireLock(RollerTask)
releaseLock(RollerTask)
isLocked(RollerTask)

What's the use case for isLocked? I'm not sure why would I need to
test if something is locked. If I need it, I try to acquire it.

Ummm ... well for just doing the leasing it's probably not required. I think the reason I wanted it was for better condition handling so that it's possible that we could want to do something special in the event that a task is trying to run but is currently locked. If all you do is try to acquire the lock and fail you may not know that it failed because it was locked, it could have failed because you couldn't contact the db, or because the interval time hadn't elapsed yet.

We could get by without it though.




I can see how just calling things a lease is possibly a better way to
describe what's happening, but calling registerLease() is the exact same
thing as acquiring a lock in the current code.  When you acquire a lock
now you get the lock for a given amount of time, or until you manually
release it.

I didn't say that changing the name would mean different function, but
we must be consistent in our naming. JINI doesn't call it a
LockManager, it calls it a LeaseManager. If anything, I know you are
always a proponent of naming things what they do/mean.

Fair enough. I guess I just don't see that just because JINI calls it a lease that we have to call it the same thing. But it doesn't really matter to me, calling it a lease is fine.




The current code doesn't have any way to renew a lease (or lock), so
that definitely sounds like a good addition if we plan to have tasks
which may run over their initial lease time and want to extend the
lease. I didn't really envision that happening, but it sounds reasonable.


k. cool beans.

And of course, unregistering a lease is the same thing as releasing a
lock, right?  So all in all I don't think we are really changing much.

But in any case, I have a few comments ...

1. I would prefer that these just remain in the ThreadManager where the
existing methods are now.  I don't think we need a whole new Manager for
this.

I'm fine with them staying in the ThreadManager.


2. I think we would still need an isLeased() method which is basically
just a convenience for any task which is about to try and obtain a lease
to see if it is already leased.

See my comment above.


3. I also think that we want to stick with having these methods take in
RollerTask objects rather than individualized parameters.  The
RollerTask class already forces subclasses to define a name and a lease
time, so those are already included, and it should be easy enough to
just add in an id attribute which would just need to be unique to each
node of the cluster somehow.

I'm sort of ok with using a POJO instead of parameters. Read query
below for counter argument.

One thing I forgot was that we need to make RollerTask more abstract
since your implementation assumes a Task wants to lock/unlock on every
run. I believe we need a really abstract RollerTask and have two or
three kinds of tasks: register/unregister lease in single run,
register/renew lease on every run (master indexer scenario), no-op
(e.g. independent task).

That's fine with me, but what's the no-op scenario? I would think that all tasks should go through the leasing process in some form or another.




I think the big changes which you didn't really detail in this email is
that we would be shifting more of the logic around dealing with time
from app time to database time to make sure that all nodes are properly
synced.  Then we would also want to ensure the leasing process was rigid
enough to prevent any possible race conditions, which I believe at the
moment is not the case.  Accomplishing those two things definitely sound
like a good idea to me.

Let me take a quick stab at it.

0. We need a constraint on task name so we can't have duplicate rows.

agreed.



registerLease()

1. try to update the table if the lease is expired or we already have this lease

update task_lease set name=name, id=id, duration=duration,
starttime=CURRENT_TIMESTAMP where name = name and starttime + duration
< CURRENT_TIMESTAMP or id = id

if it returns 1 row updated, we are done.
else if return 0 rows updated, we need to insert a new row, if
success, we obtained the lock, else we return false, didn't register
lease.

looks good to me.



unregisterLease() is easy, just remove the row, if name and id match.

i'm not sure about deleting the row. i think we need to keep the rows so that we can maintain the last run time for a task, even if it was unleased. the last run time is essential to ensuring that all nodes are on a synchronized clock for the execution times of the tasks.

in fact, it may be ideal if there is some kind of initLease(RollerTask) method which can be used to initialize the row in the database in an unleased state. this simplifies the logic a bit as well because then we only have to do the update statement to acquire the lock, not insert logic.



renewLease() is basically just the update query above. if returns 1
row updated success, else failure.

Maybe the registerLease can be changed to first test for existence of
row with name = name. Anyways, this is roughly what I'm thinking and
if you notice, we don't make too much use of the POJO, so passing a
POJO seems too complicated for just a few parameters in one of the
methods.

overall it sounds good to me.

-- Allen



-Elias


-- Allen


>
> If you notice, I'm introducing a new property (id) that will be used to
> identify which node in the cluster has the lease in order that we can
> allow for renewals if they desire so. Currently, we only have support
> for task name and that's enough for what I'm trying to do. I'll explain
> more in my next email and if I'm brief on this one is because I must
> worry about dinner and putting the kids to bed.
>
> -Elias

Reply via email to