I am struggling with locking a file in AFS. Maybe there is something
basic I am doing wrong - I don't know. Below are a couple of code
fragments which show what I am doing.
The purpose of UpdateHistory is to append transaction text to a file.
There are multiple software servers running on multiple machines which
call this code to update the file. All the servers run under the same
userid, which has "all" AFS authority to the directory.
The problem is that occasionally the appended line is missing from the
file. This only seems to occur when there are multiple machines
involved. I do print out a message when lock collisions occur, and I do
see this event occur from time to time.
Not knowing any better, I suspect that the end of file is being
determined at "open" time. This would cause the closure on another
machine to write data which would subsequently be overwritten by a write
on the current machine. Because the close on the other machine could
occur in the window between the open and the lock, I would never know
about it.
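To make the suspected failure mode concrete, here is a minimal sketch that reproduces it on an ordinary local file. It is only an analogy: the function name and strings are hypothetical, both "machines" are simulated by two descriptors in one process, and the file is deliberately opened without O_APPEND (unlike the posted code) so that each writer keeps the end-of-file offset it saw before the other wrote. It does not establish that AFS behaves this way.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: two writers each remember "end of file" before
 * either one writes, then both write. Returns the final file size in
 * bytes, or -1 on error. */
long stale_offset_demo(const char *path)
{
    const char *first = "from machine A\n";   /* 15 bytes */
    const char *second = "from machine B\n";  /* 15 bytes */
    int fd_a, fd_b;
    off_t size = -1;

    fd_a = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd_a < 0)
        return -1;
    fd_b = open(path, O_WRONLY);
    if (fd_b < 0) {
        close(fd_a);
        return -1;
    }

    /* Both writers locate end of file now (offset 0: the file is empty). */
    if (lseek(fd_a, 0, SEEK_END) >= 0 && lseek(fd_b, 0, SEEK_END) >= 0 &&
        /* A writes its line at offset 0 ... */
        write(fd_a, first, strlen(first)) == (ssize_t)strlen(first) &&
        /* ... and B, still at its remembered offset 0, overwrites it. */
        write(fd_b, second, strlen(second)) == (ssize_t)strlen(second)) {
        size = lseek(fd_b, 0, SEEK_END);
    }

    close(fd_a);
    close(fd_b);
    return (long)size;
}
```

After both writes the file holds one line rather than two, which is the same symptom as the missing history entries.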
The reason I suspect the open-lock window is the problem is that I used to:
* open the file,
* wait for a successful lock,
* append to the file,
* close the file.
With this algorithm I ran into the problem more frequently.
The code below implements the following algorithm:
* while not locked
    * open the file,
    * attempt lock,
    * if lock failure then close file
* append to the file,
* close the file.
This seemed to resolve the problem, but on stressing it further the
problem still occurs.
If you could take the time to look at my code and educate me, I would
appreciate it. Questions that come to mind:
1) what is the proper way to perform file locking? I would have thought
that opening and locking a file needed to be an atomic operation. Why
is an atomic operation not needed?
2) Is there something unique I need to do in AFS to get a file lock?
3) When do the AIX functions I am compiling into my code get intercepted
so that AFS can handle them in their unique way? Am I using the right
functions? I have looked for the "flock" function, but it does not seem
to be in any of the AIX headers.
4) Maybe my locking algorithm is wrong - is there something I need to do
differently?
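As a point of comparison for question 3, POSIX also defines advisory record locks through fcntl, which needs no header beyond fcntl.h. The sketch below is a minimal illustration of a non-blocking whole-file write lock; the function name is hypothetical, and nothing here shows whether AFS honors such a lock across machines.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: try to take a write lock on the whole file
 * without blocking. Returns 0 if the lock is now held, -1 otherwise
 * (errno is EACCES or EAGAIN when another process holds the lock). */
int try_write_lock(int fd)
{
    struct flock fl;

    memset(&fl, 0, sizeof fl);
    fl.l_type = F_WRLCK;     /* exclusive (write) lock */
    fl.l_whence = SEEK_SET;  /* l_start is relative to start of file */
    fl.l_start = 0;
    fl.l_len = 0;            /* 0 = through end of file, even as it grows */

    return (fcntl(fd, F_SETLK, &fl) == -1) ? -1 : 0;
}
```

Like lockf with F_TLOCK, this returns immediately instead of waiting, so it would slot into the same retry loop.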
Thanks for your help
Kris Davis ([EMAIL PROTECTED], krisd@rchland, krisd@rchvmw2)
Microcode Development Environment Dept213
1-507-253-3707 IBM Rochester, MN
-------------------------------------------------------------------------------
#include <stdio.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/mode.h>
#include <string.h>
#include <sys/file.h>
#include <sys/lockf.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include <unistd.h>
#include "servrpc.h"
#include "local.h"
#include "utils.h"
#include "errmsgs.h"
#define LOCK_MAX_RETRIES 5
int UpdateHistory(char *histfid,       /* history file path */
                  char *transaction,   /* operation comment */
                  char *partname,      /* part being operated on */
                  char *user,          /* client user id */
                  char *reason,        /* reason number */
                  char *datetimeyear,  /* date of operation */
                  char *status)        /* idss status */
{
    char hist_strg[MAX_INPUT_LINE];
    int fnum;
    int rc = SUCCESSFUL;
    int close_rc;

    fprintf(stderr, "*** Update History ***\n");
    /*----------------------------------------------------------------*/
    /* Put a lock on the file so no one else can update it until we   */
    /* are finished with this update. NOTE: Don't use stdio routines  */
    /* on locked files - buffered files give unpredictable results.   */
    /*----------------------------------------------------------------*/
    fnum = WriteLock(histfid, O_WRONLY | O_CREAT | O_APPEND,
                     S_IREAD | S_IWRITE | S_IRGRP | S_IWGRP | S_IWOTH |
                     S_IROTH, LOCK_MAX_RETRIES);
    if (fnum < 0)
    {
        rc = HX_UPDATE_PROB;
    }
    else
    {
        sprintf(hist_strg,
                "%-13.12s %-20.19s %-15.14s %-13.12s %-26.25s %s\n",
                transaction, partname, user, reason, datetimeyear, status);
        if (AutoWrite(fnum, hist_strg, strlen(hist_strg)) < 0)
        {
            perror("ERROR(UpdateHistory)");
            fprintf(stderr, "ERROR(UpdateHistory): "
                    "Unable to write to %s!\n", histfid);
            rc = HX_UPDATE_PROB;
        }
        close_rc = close(fnum);
        if (close_rc != SUCCESSFUL)
        {
            perror("ERROR(UpdateHistory)");
            fprintf(stderr, "ERROR(UpdateHistory): "
                    "Unable to close %s!\n", histfid);
            rc = HX_UPDATE_PROB;
        }
    }
    return (rc);
}
-------------------------------------------------------------------------------
#include <stdio.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/mode.h>
#include <string.h>
#include <sys/file.h>
#include <sys/lockf.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include <unistd.h>
#include "local.h"
#include "utils.h"
#define LOCK_PAUSE_TIME 1
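/* write with recovery from interrupt errors */
/* returns total chars written, or -1 on error (errno set by write) */
int AutoWrite (int fnum, char *buffer, int chars_left)
{
    int total_chars_written = 0;
    int chars_written;

    /* write string out, looping over partial writes and interrupts */
    while (chars_left > 0)
    {
        chars_written = write(fnum, buffer + total_chars_written,
                              chars_left);
        if (chars_written < 0)
        {
            if (errno == EINTR)
            {
                continue;        /* interrupted before writing: retry */
            }
            return (-1);
        }
        if (chars_written == 0)
        {
            break;               /* no progress: stop rather than spin */
        }
        chars_left = chars_left - chars_written;
        total_chars_written = total_chars_written + chars_written;
    }
    return (total_chars_written);
}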
/* establish write lock */
int WriteLock(char *filepath,   /* file path */
              int oflags,
              int mode,
              int retries)      /* 0 = infinite, > 0 retries */
                                /* returns filenum or -1 */
{
    int fnum;
    int retry_count;
    int rc = SUCCESSFUL;

    /*----------------------------------------------------------------*/
    /* Put a lock on the file so no one else can update it until we   */
    /* are finished with this update. F_TLOCK returns immediately     */
    /* with failure if unable to establish the lock. If initially     */
    /* unable to establish a file lock, pause then retry a few times. */
    /*----------------------------------------------------------------*/
    retry_count = retries;
    while ((retry_count > 0) || retries == 0)
    {
        /*------------------------------------------------------------*/
        /* Open the specified file. NOTE: Don't use stdio routines on */
        /* locked files - buffered files give unpredictable results.  */
        /*------------------------------------------------------------*/
        fnum = open(filepath, oflags, mode);
        if (fnum < 0)
        {
            perror("ERROR");
            fprintf(stderr, "ERROR: Failed to open file: \"%s\".\n",
                    filepath);
            return (FAILURE);
        }
        rc = lockf(fnum, F_TLOCK, 0);
        if (rc == SUCCESSFUL)
        {
            break;
        }
        else
        {
            /* lockf reports an already-held lock as EACCES or EAGAIN */
            if (errno == EACCES || errno == EAGAIN)
            {
                fprintf(stderr, "%s locked...Retrying...\n", filepath);
            }
            else
            {
                perror("WARNING");
                fprintf(stderr, "WARNING: %s lock fail...Retrying...\n",
                        filepath);
            }
            close(fnum);
            retry_count--;
            sleep(LOCK_PAUSE_TIME);
        }
    }
    if ((retry_count == 0) && (retries > 0))
    {
        /* fnum was already closed inside the loop, so it must not be */
        /* closed again here; errno is stale by now, so no perror.    */
        fprintf(stderr, "ERROR(WriteLock): "
                "Unable to lock file %s!\n", filepath);
        return(FAILURE);
    }
    return(fnum);
}