Hi

> | On Jun 18, Igshaan Mesias <[EMAIL PROTECTED]> wrote:
> |
> | > Description : The halockrun program provides a simple way to
> | > implement locking in shell scripts.
> |
> | What does this offer over lockfile (procmail package) or dotlockfile
> | (liblockfile1 package)?

Aren't those used particularly for mailbox locking? The hatools package provides a wider application set for locking files.

> Or even flock(1), which is part of util-linux.

The major benefit halockrun has over these tools is that you can't have stale lock files with its implementation. It seems as though a lot of the parameters for procmail's lockfile are just different timeouts to properly handle stale lockfiles. Since the concept of hatimerun comes from the high-availability context, this aspect is very important. The idea (and also the name) of halockrun, as well as of hatimerun, comes from the SUN Cluster high-availability product, and the implementation also takes reliability very seriously.

If you look at the dotlockfile manpage, it SEEMS to implement locking in a similar way: there are not as many timeout parameters, but it still seems to rely on the existence of files, while halockrun actually locks the files using kernel functions. halockrun also works on NFS shares if lockd is running. The node hosting the halockrun instance that holds the lock also takes care that the lock does not go stale (again the kernel, not user space). Without having done any deeper checks, IMHO this is more robust than dotlockfile.

To point it out again: I think the strength of halockrun's implementation is that the lock cleanup is done by the kernel when the process ends (no matter how the process ended, it might even have dumped core) and not by a user-space procedure. This makes stale locks impossible.

It appears that both lockfile and dotlockfile are aimed at short-lived locks. The dotlockfile manpage does not mention how it handles stale locks, but the lockfile_create(3) manpage does: it says that it might consider a lockfile stale after five minutes. I did not see how dotlockfile would allow a longer timeout, nor do I know whether dotlockfile uses lockfile_touch() as described in the lockfile_create manpage.

halockrun was implemented out of the need to prevent multiple cron jobs from running concurrently. E.g. we had a cron job which runs every 5 minutes and _usually_ finishes in less than 5 minutes.

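To illustrate the kernel-lock idea described above, here is a minimal Python sketch. It is not halockrun's code: the function name run_locked is made up for this example, and the use of POSIX record locks through fcntl.lockf() is an assumption about what "kernel functions" means here. The point it shows is that nothing has to clean up afterwards, because the kernel drops the lock when the descriptor is closed or the holding process dies.

```python
import fcntl
import os
import subprocess


def run_locked(lock_path, command):
    """Run `command` while holding an exclusive kernel lock on `lock_path`.

    Hypothetical helper for illustration only. Returns the command's
    exit status, or None if another process already holds the lock.
    """
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        try:
            # Non-blocking exclusive lock on the whole file.
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            return None  # lock is held by another running process
        # The lock belongs to this wrapper process while the command runs.
        return subprocess.call(command)
    finally:
        # Closing the descriptor releases the lock. If this process is
        # killed instead, the kernel releases it just the same.
        os.close(fd)
```

The lock file itself may stay on disk forever; only the kernel lock on it matters, so a leftover file never blocks the next run. A None result would mean the previous run is still active and this run can simply be skipped.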
When a run took longer, we had two jobs running, doing the same work, each slowing the other down (more resources required). This in turn decreased the chance that the jobs would finish before cron started the next instance. This was the first application of halockrun. I know of some cases where it is even used for longer-running cron jobs. I am unsure whether lockfile or dotlockfile are suitable tools for longer-running processes that might not complete within an hour.

Furthermore, people are using modified start/stop scripts for server processes like apache or mysqld which are based on halockrun. They start the process with halockrun and can check whether it is still running with halockrun -t. If they want to send a signal to that process, they can also use halockrun -t to obtain the pid and send the signal (e.g. to stop it again). Again, this implementation is stale-aware, and the lock remains valid as long as the process is running. IMHO, the applications of halockrun are also wider than those of the other two tools mentioned.

Also, lockfile and dotlockfile do not have the functionality of hatimerun. hatimerun was initially required for the cron job problem. If the job which runs every 5 minutes sometimes takes up to 10 minutes, there is definitely something wrong if it takes an hour. For that reason hatimerun was built to kill such processes. hatimerun has the ability to send multiple signals: first to ask the process itself to quit, and later to kill it forcefully. In an environment with countless cron jobs, where sometimes some job just hangs, hatimerun can make automatic recovery possible.

The importance of hatimerun comes from the fact that halockrun's locks are not considered stale as long as the process is running. Therefore a hanging process could block everything, so a reliable timeout is a must. lockfile & dotlockfile also seem to implement the concept of timeouts, which might reduce the need for a similar mechanism.

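The escalating timeout described above (ask the process to quit first, kill it forcefully later) can be sketched in a few lines of Python. This is only an illustration of the idea, not hatimerun's implementation; the name run_with_timeout, its parameters, and the choice of SIGTERM followed by SIGKILL are assumptions made for the example.

```python
import signal
import subprocess


def run_with_timeout(command, timeout, grace):
    """Run `command` with a two-stage timeout (hypothetical helper).

    After `timeout` seconds the process is sent SIGTERM; if it is still
    alive `grace` seconds later it is sent SIGKILL. Returns the exit
    status (negative signal number if the process died from a signal).
    """
    proc = subprocess.Popen(command)
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.send_signal(signal.SIGTERM)  # ask the process to quit
    try:
        return proc.wait(timeout=grace)
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL cannot be caught or ignored
        return proc.wait()
```

Wrapping a locked job in such a timeout is what keeps a hanging process from holding its lock indefinitely: once the process is gone, its kernel lock is gone too.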
However, halockrun & hatimerun together also make sure that the process which belongs to a lock that is held for too long is killed (cleaned up), so that other resources occupied by this process are freed as well.

-- 
Regards
Igshaan Mesias