On Fri, Jan 15, 2010 at 12:46:44AM -0600, Shawn Walker wrote:
> On 01/14/10 05:37 PM, Shawn Walker wrote:
> >On 01/13/10 08:40 PM, Shawn Walker wrote:
> >>On 01/13/10 06:42 PM, Shawn Walker wrote:
> >>>On 01/13/10 02:07 PM, Shawn Walker wrote:
> >>>>Greetings,
> >>>>
> >>>>The following webrev contains fixes for the following issue:
> >>>>
> >>>>1668 image locking needed to prevent race conditions
> >>>
> >>>1st webrev:
> >>>http://cr.opensolaris.org/~swalker/pkg-lock/
> >>>
> >>>new webrev (diff against 1st):
> >>>http://cr.opensolaris.org/~swalker/pkg-lock-2/
> >>
> >>new webrev (diff against 2nd):
> >>http://cr.opensolaris.org/~swalker/pkg-lock-3/
> >
> >new webrev (last i hope?) (diff against 3rd):
> >http://cr.opensolaris.org/~swalker/pkg-lock-4/
>
> new webrev (diff against 4th):
> http://cr.opensolaris.org/~swalker/pkg-lock-5/
>
> full webrev (complete set of changes):
> http://cr.opensolaris.org/~swalker/pkg-lock-5-full/
>
> Change Summary:
> * changed unlock to truncate lock file instead of removing since the
> lock file itself is only used so that it has something to use
> lockf() on and to store information about the caller who acquired
> the lock
>
> * added iso-8601 utc timestamp to lock file to aid debugging in
> trying to determine how long a lock has been held
>
> * updated unit tests to account for truncation/size checks
>

src/modules/client/api_errors.py

- i'm not a python expert, so perhaps this is ok, but all the __init__
  parameters to ImageLockedError default to None, and when returning
  an error string you only check if pid and pid_name are None.
  shouldn't you also check if hostname is None?

src/modules/client/image.py

- you're privous comments regarding Image.__lock indicate that it is a
  debug mechanism used to prevent recursive lock acquisition within a
  process.  if my understanding is correct, the following code is
  incorrect:

        # First, attempt to obtain a thread lock.
        if not self.__lock.acquire(blocking=blocking):
                raise api_errors.ImageLockedError()

  with this code you will:
        - fail if a thread tries to lock an image twice.  good.
        - if another thread has already locked the image and
          blocking is true, you will hang.  bad.
        - if another thread has already locked the image and
          blocking is false, you will return ImageLockedError.
          also bad.

  perhaps the following would be better:

        # sanity check, detect multiple image lock attempts
        if not self.__lock.acquire(blocking=False):
                raise AssertionError "oops..."

- could you file a bug on the lack of read operation locking?

  i think that the concerns that danek raised could be addressed by
  having a revokable read lock.

  for example, a client wanting to aquire a read operation could do:
        - a non blocking open of the fifo: /var/pkg/lock_read
        - aquire a read lock on the fifo
        - create a thread that does a read from the fifo and blocks
        - if this blocked thread ever wakes up, it should kill the
          current process with an error indicating that the lock was
          lost.

  if someone wants to aquire a write lock, they aquire the write lock as
  you've written it today.  then they do:
        - create a new read lock fifo /var/pkg/lock_read.$$
        - open it and get a write lock on it.
        - open the old read lock fifo.
        - move the new read lock fifo over the old one.
        - write a byte of data to the old fifo fd and close it.

  this is a seat of the pants design idea, it may have problems.
  but you get the general idea.
_______________________________________________
pkg-discuss mailing list
[email protected]
http://mail.opensolaris.org/mailman/listinfo/pkg-discuss

Reply via email to