Re: [PATCH 5/5 linux-next V2 (resend)] cifs: Combine set_file_size by handle and path into one function

2013-12-09 Thread Jeff Layton
On Sun,  8 Dec 2013 14:13:37 -0700
Tim Gardner t...@tpi.com wrote:

 As suggested by Jeff Layton:
 
 
 Now that I look though, it's clear to me that cifs_set_file_size is
 just wrong. Currently it calls ops-set_path_size and if that fails
 with certain error codes it tries to do a SMBLegacyOpen, etc. That's
 going to fall on its face if this is a SMB2/3 connection.
 
 Here's what should be done instead, I think:
 
 - get rid of both set_path_size and set_file_size operations. The
   version specific abstraction was implemented at the wrong level, IMO.
 
 - implement a new protocol level operation that basically takes the
   same args as cifs_set_file_size. cifs_setattr_unix/_nounix should
   call this operation.
 
 - the set_path_size operation for SMB1 should basically be the function
   that is cifs_set_file_size today (though that should probably be
   renamed). That function should be restructured to do the following:
 
   + look for an open filehandle
   + if there isn't one, open the file
   + call CIFSSMBSetFileSize
   + fall back to zero-length write if that fails
   + close the file if we opened it
 
 
 This patch also moves all of the SMBv1 legacy hacks from cifs_set_file_size() 
 into
 the SMBv1 specific handler smb_set_file_size() more or less intact. SMBv2 and 
 higher are
 more regular and orthogonal, so the v2 handler smb2_set_file_size() no longer 
 contains
 the legacy hacks.
 
 Cc: Steve French sfre...@samba.org
 Cc: Jeff Layton jlay...@redhat.com
 Cc: Dean Gehnert de...@tpi.com
 Signed-off-by: Tim Gardner t...@tpi.com
 ---
 
 V2 - this is a new patch in the V2 series.
 
 I know this is kind of a giant patch, but there really doesn't seem to be any
 intermediate steps. Its pretty much all or nothing.
 
 I've only tested against a Linux CIFS server running a 3.2 kernel since I no 
 longer
 have easy access to smbv2 servers (iOS 10.8 or Windows 7).
 
 resent with corrected Cc's
 
 rtg
 
  fs/cifs/cifsglob.h |9 ++---
  fs/cifs/inode.c|  108 +--
  fs/cifs/smb1ops.c  |  109 
 +++-
  fs/cifs/smb2ops.c  |   57 ---
  4 files changed, 170 insertions(+), 113 deletions(-)
 
 diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h
 index 8fd8eb2..6113ce8 100644
 --- a/fs/cifs/cifsglob.h
 +++ b/fs/cifs/cifsglob.h
 @@ -269,12 +269,9 @@ struct smb_version_operations {
   int (*get_srv_inum)(const unsigned int, struct cifs_tcon *,
   struct cifs_sb_info *, const char *,
   u64 *uniqueid, FILE_ALL_INFO *);
 - /* set size by path */
 - int (*set_file_size_by_path)(const unsigned int, struct cifs_tcon *,
 -  const char *, __u64, struct cifs_sb_info *, bool);
 - /* set size by file handle */
 - int (*set_file_size_by_handle)(const unsigned int, struct cifs_tcon *,
 -  struct cifsFileInfo *, __u64, bool);
 + /* set file size */
 + int (*set_file_size)(struct inode *inode, struct iattr *attrs,
 +  unsigned int xid, char *full_path);
   /* set attributes */
   int (*set_file_info)(struct inode *, const char *, FILE_BASIC_INFO *,
const unsigned int);
 diff --git a/fs/cifs/inode.c b/fs/cifs/inode.c
 index e332038..3edeafd 100644
 --- a/fs/cifs/inode.c
 +++ b/fs/cifs/inode.c
 @@ -1899,118 +1899,28 @@ static void cifs_setsize(struct inode *inode, loff_t 
 offset)
   truncate_pagecache(inode, offset);
  }
  
 -/*
 - * Legacy hack to set file length.
 - */
 -static int
 -cifs_legacy_set_file_size(unsigned int xid, __u16 netfid, __u32 pid,
 -   unsigned int length, struct cifs_tcon *tcon,
 -   char *full_path)
 -{
 - int rc;
 - unsigned int bytes_written;
 - struct cifs_io_parms io_parms;
 -
 - io_parms.netfid = netfid;
 - io_parms.pid = pid;
 - io_parms.tcon = tcon;
 - io_parms.offset = 0;
 - io_parms.length = length;
 - rc = CIFSSMBWrite(xid, io_parms, bytes_written,
 -   NULL, NULL, 1);
 - cifs_dbg(FYI, %s len %d rc %d on %s\n, __func__, length, rc,
 -  full_path);
 - return rc;
 -}
 -
  static int
  cifs_set_file_size(struct inode *inode, struct iattr *attrs,
  unsigned int xid, char *full_path)
  {
 - int rc;
 - struct cifsFileInfo *open_file;
 + int rc = -ENOSYS;
   struct cifsInodeInfo *cifsInode = CIFS_I(inode);
   struct cifs_sb_info *cifs_sb = CIFS_SB(inode-i_sb);
   struct tcon_link *tlink = NULL;
   struct cifs_tcon *tcon = NULL;
   struct TCP_Server_Info *server;
  
 - /*
 -  * To avoid spurious oplock breaks from server, in the case of
 -  * inodes that we already have open, avoid doing path based
 -  * setting of file size if we can do it by handle.
 -  * This keeps our caching token (oplock

Re: [Nfs-ganesha-devel] [RFC PATCH v2 3/5] locks: add new private lock type that is owned by the filp

2013-12-09 Thread Jeff Layton
On Wed, 20 Nov 2013 14:50:32 -0500
J. Bruce Fields bfie...@fieldses.org wrote:

 On Wed, Nov 20, 2013 at 11:45:04AM -0500, Jeff Layton wrote:
  Due to some unfortunate history, POSIX locks have very strange and
  unhelpful semantics. The thing that usually catches people by surprise
  is that they are dropped whenever the process closes any file descriptor
  associated with the inode.
  
  This is extremely problematic for people developing file servers that
  need to implement byte-range locks. Developers often need a lock
  management facility to ensure that file descriptors are not closed
  until all of the locks associated with the inode are finished.
  
  This patchset adds a new type of lock that attempts to address this
  issue. These locks work just like normal POSIX read/write locks, but
  have semantics that are more like BSD locks with respect to inheritance
  and behavior on close.
  
  This is implemented primarily by changing how fl_owner field is set for
  these locks. Instead of having them owned by the files_struct of the
  process, they are instead owned by the filp on which they were acquired.
  Thus, they are inherited across fork() and are only released when the
  last reference to a filp is put.
  
  These new semantics prevent them from being merged with classic POSIX
  locks, even if they are acquired by the same process. These locks will
  also conflict with classic POSIX locks even if they are acquired by
  the same process or on the same file descriptor.
  
  Signed-off-by: Jeff Layton jlay...@redhat.com
  ---
   fs/locks.c   | 22 +-
   include/uapi/asm-generic/fcntl.h | 15 +++
   2 files changed, 36 insertions(+), 1 deletion(-)
  
  diff --git a/fs/locks.c b/fs/locks.c
  index 86cafc3..3b278a6 100644
  --- a/fs/locks.c
  +++ b/fs/locks.c
  @@ -348,6 +348,26 @@ static int posix_assign_type(struct file_lock *fl, 
  long type)
   {
  int err;
   
  +   /*
  +* FL_FILP_PRIVATE locks are owned by the filp upon which they were
  +* acquired, regardless of what task is dealing with them. Set the
  +* fl_owner appropriately.
  +*/
  +   switch (type) {
  +   case F_RDLCKP:
  +   type = F_RDLCK;
  +   fl-fl_owner = (fl_owner_t)fl-fl_file;
  +   break;
  +   case F_WRLCKP:
  +   type = F_WRLCK;
  +   fl-fl_owner = (fl_owner_t)fl-fl_file;
  +   break;
  +   case F_UNLCKP:
  +   type = F_UNLCK;
  +   fl-fl_owner = (fl_owner_t)fl-fl_file;
  +   break;
  +   }
  +
 
 After this fl_owner gets set to current-files in
 flock{64}_to_posix_lock and then reset here.  That seems like a trap for
 the unwary reader.
 
 Could you do something like rename this flock_to_posix_lock_common and
 move all the 32/64-bit-independent initialization here?
 
 Looks like there's way more duplication than necessary between those two
 cases.
 

Ok, finally got a chance to start looking at this again...

I agree that at first glance, it looks like there is a lot of
duplication between flock_to_posix_lock and flock64_to_posix_lock, but
the problem we have is that we need to initialize the file_locks with
info from two different types of structs that were passed in from
userland (flock and flock64).

I don't see a clean way to consolidate the two given that...

 (Also, why do we have an fl_owner_t instead of using a void?)
 
 --b.
 
  err = assign_type(fl, type);
  if (err)
  return err;
  @@ -2225,7 +2245,7 @@ void locks_remove_filp(struct file *filp)
   
  while ((fl = *before) != NULL) {
  if (fl-fl_file == filp) {
  -   if (IS_FLOCK(fl)) {
  +   if (IS_FLOCK(fl) || IS_POSIX(fl)) {
  locks_delete_lock(before);
  continue;
  }
  diff --git a/include/uapi/asm-generic/fcntl.h 
  b/include/uapi/asm-generic/fcntl.h
  index 95e46c8..6b7b68a 100644
  --- a/include/uapi/asm-generic/fcntl.h
  +++ b/include/uapi/asm-generic/fcntl.h
  @@ -151,6 +151,21 @@ struct f_owner_ex {
   #define F_UNLCK2
   #endif
   
  +/*
  + * fd private POSIX locks.
  + *
  + * Usually POSIX locks held by a process are released on *any* close and 
  are
  + * not inherited across a fork().
  + *
  + * These lock types will conflict with normal POSIX locks, but are owned
  + * by the fd, not the process. This means that they are inherited across
  + * fork() like BSD (flock) locks, and they are only closed when the last
  + * reference to the the filp against which were acquired is closed.
  + */
  +#define F_RDLCKP   5
  +#define F_WRLCKP   6
  +#define F_UNLCKP   7
  +
   /* for old implementation of bsd flock () */
   #ifndef F_EXLCK
   #define F_EXLCK4   /* or 3 */
  -- 
  1.8.3.1
  
  --
  To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
  the body of a message to majord...@vger.kernel.org
  More majordomo

Re: [PATCH 4/4 linux-next V3] cifs: Combine set_file_size by handle and path into one function

2013-12-09 Thread Jeff Layton
On Mon,  9 Dec 2013 12:41:56 -0700
Tim Gardner t...@tpi.com wrote:

 As suggested by Jeff Layton:
 
 
 Now that I look though, it's clear to me that cifs_set_file_size is
 just wrong. Currently it calls ops-set_path_size and if that fails
 with certain error codes it tries to do a SMBLegacyOpen, etc. That's
 going to fall on its face if this is a SMB2/3 connection.
 
 Here's what should be done instead, I think:
 
 - get rid of both set_path_size and set_file_size operations. The
   version specific abstraction was implemented at the wrong level, IMO.
 
 - implement a new protocol level operation that basically takes the
   same args as cifs_set_file_size. cifs_setattr_unix/_nounix should
   call this operation.
 
 - the set_path_size operation for SMB1 should basically be the function
   that is cifs_set_file_size today (though that should probably be
   renamed). That function should be restructured to do the following:
 
   + look for an open filehandle
   + if there isn't one, open the file
   + call CIFSSMBSetFileSize
   + fall back to zero-length write if that fails
   + close the file if we opened it
 
 
 This patch also moves all of the SMBv1 legacy hacks from cifs_set_file_size() 
 into
 the SMBv1 specific handler smb_set_file_size() more or less intact. SMBv2 and 
 higher are
 more regular and orthogonal, so the v2 handler smb2_set_file_size() no longer 
 contains
 the legacy hacks.
 
 Move cifsFileInfo_put(open_file) below the first call to
 cifs_legacy_set_file_size() so that the reference on open_file stays valid
 throughout.
 
 Cc: Steve French sfre...@samba.org
 Cc: Jeff Layton jlay...@redhat.com
 Cc: Dean Gehnert de...@tpi.com
 Signed-off-by: Tim Gardner t...@tpi.com
 ---
 
 V3 - dropped 'cifs: fix incorrect reference count check' which impacted this 
 patch.
 Also moved cifsFileInfo_put(open_file) according to V2 review comments from 
 Jeff Layton.
 
 V2 - this is a new patch in the V2 series.
 
 I know this is kind of a giant patch, but there really doesn't seem to be any
 intermediate steps. Its pretty much all or nothing.
 
 I've only tested against a Linux CIFS server running a 3.2 kernel since I no 
 longer
 have easy access to smbv2 servers (iOS 10.8 or Windows 7).
 
  fs/cifs/cifsglob.h |9 ++---
  fs/cifs/inode.c|  108 +--
  fs/cifs/smb1ops.c  |  109 
 +++-
  fs/cifs/smb2ops.c  |   57 ---
  4 files changed, 170 insertions(+), 113 deletions(-)
 
 diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h
 index 8fd8eb2..6113ce8 100644
 --- a/fs/cifs/cifsglob.h
 +++ b/fs/cifs/cifsglob.h
 @@ -269,12 +269,9 @@ struct smb_version_operations {
   int (*get_srv_inum)(const unsigned int, struct cifs_tcon *,
   struct cifs_sb_info *, const char *,
   u64 *uniqueid, FILE_ALL_INFO *);
 - /* set size by path */
 - int (*set_file_size_by_path)(const unsigned int, struct cifs_tcon *,
 -  const char *, __u64, struct cifs_sb_info *, bool);
 - /* set size by file handle */
 - int (*set_file_size_by_handle)(const unsigned int, struct cifs_tcon *,
 -  struct cifsFileInfo *, __u64, bool);
 + /* set file size */
 + int (*set_file_size)(struct inode *inode, struct iattr *attrs,
 +  unsigned int xid, char *full_path);
   /* set attributes */
   int (*set_file_info)(struct inode *, const char *, FILE_BASIC_INFO *,
const unsigned int);
 diff --git a/fs/cifs/inode.c b/fs/cifs/inode.c
 index 3f710c6..3edeafd 100644
 --- a/fs/cifs/inode.c
 +++ b/fs/cifs/inode.c
 @@ -1899,118 +1899,28 @@ static void cifs_setsize(struct inode *inode, loff_t 
 offset)
   truncate_pagecache(inode, offset);
  }
  
 -/*
 - * Legacy hack to set file length.
 - */
 -static int
 -cifs_legacy_set_file_size(unsigned int xid, __u16 netfid, __u32 pid,
 -   unsigned int length, struct cifs_tcon *tcon,
 -   char *full_path)
 -{
 - int rc;
 - unsigned int bytes_written;
 - struct cifs_io_parms io_parms;
 -
 - io_parms.netfid = netfid;
 - io_parms.pid = pid;
 - io_parms.tcon = tcon;
 - io_parms.offset = 0;
 - io_parms.length = length;
 - rc = CIFSSMBWrite(xid, io_parms, bytes_written,
 -   NULL, NULL, 1);
 - cifs_dbg(FYI, %s len %d rc %d on %s\n, __func__, length, rc,
 -  full_path);
 - return rc;
 -}
 -
  static int
  cifs_set_file_size(struct inode *inode, struct iattr *attrs,
  unsigned int xid, char *full_path)
  {
 - int rc;
 - struct cifsFileInfo *open_file;
 + int rc = -ENOSYS;
   struct cifsInodeInfo *cifsInode = CIFS_I(inode);
   struct cifs_sb_info *cifs_sb = CIFS_SB(inode-i_sb);
   struct tcon_link *tlink = NULL;
   struct cifs_tcon *tcon = NULL

Re: [PATCH 1/9] vfs: check submounts and drop atomically

2013-08-12 Thread Jeff Layton
 to has_unlinked_ancestor().
 + * Returns 0 if successfully unhashed @parent.  If there were submounts then
 + * return -EBUSY.
 + *
 + * @dentry: dentry to prune and drop
 + */
 +int check_submounts_and_drop(struct dentry *dentry)
 +{
 + int ret = 0;
 +
 + /* Negative dentries can be dropped without further checks */
 + if (!dentry-d_inode) {
 + d_drop(dentry);
 + goto out;
 + }
 +
 + spin_lock(dentry-d_lock);
 + if (d_unhashed(dentry))
 + goto out_unlock;
 + if (list_empty(dentry-d_subdirs)) {
 + if (d_mountpoint(dentry)) {
 + ret = -EBUSY;
 + goto out_unlock;
 + }
 + __d_drop(dentry);
 + goto out_unlock;
 + }
 + spin_unlock(dentry-d_lock);
 +
 + for (;;) {
 + LIST_HEAD(dispose);
 + ret = __check_submounts_and_drop(dentry, dispose);
 + if (!list_empty(dispose))
 + shrink_dentry_list(dispose);
 +
 + if (ret = 0)
 + break;
 +
 + cond_resched();
 + }
 +
 +out:
 + return ret;
 +
 +out_unlock:
 + spin_unlock(dentry-d_lock);
 + goto out;
 +}
 +EXPORT_SYMBOL(check_submounts_and_drop);
 +
  /**
   * __d_alloc -   allocate a dcache entry
   * @sb: filesystem it will belong to
 diff --git a/include/linux/dcache.h b/include/linux/dcache.h
 index b90337c..41b21ca 100644
 --- a/include/linux/dcache.h
 +++ b/include/linux/dcache.h
 @@ -251,6 +251,7 @@ extern void d_prune_aliases(struct inode *);
  
  /* test whether we have any submounts in a subdir tree */
  extern int have_submounts(struct dentry *);
 +extern int check_submounts_and_drop(struct dentry *);
  
  /*
   * This adds the entry to the hash queues.

Nice work...

Looks reasonable overall. The only (extremely minor) nit I have is that
it might be nice to replace some of the direct struct list_head
accesses with helper macros.

Acked-by: Jeff Layton jlay...@redhat.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/9] vfs: check unlinked ancestors before mount

2013-08-12 Thread Jeff Layton
On Thu,  8 Aug 2013 17:24:43 +0200
Miklos Szeredi mik...@szeredi.hu wrote:

 From: Miklos Szeredi mszer...@suse.cz
 
 We check submounts before doing d_drop() on a non-empty directory dentry in
 NFS (have_submounts()), but we do not exclude a racing mount.  Nor do we
 prevent mounts to be added to the disconnected subtree using relative paths
 after the d_drop().
 
 This patch fixes these issues by checking for unlinked (unhashed, non-root)
 ancestors before proceeding with the mount.  This is done after setting
 DCACHE_MOUNTED on the soon-to-be mountpoint and with the rename seqlock
 taken for write.  This ensures that the only one of
 check_submounts_and_drop() or has_unlinked_ancestor() can succeed.
 
 Signed-off-by: Miklos Szeredi mszer...@suse.cz
 CC: David Howells dhowe...@redhat.com
 CC: Steven Whitehouse swhit...@redhat.com
 CC: Trond Myklebust trond.mykleb...@netapp.com
 CC: Greg Kroah-Hartman gre...@linuxfoundation.org
 ---
  fs/dcache.c| 35 +++
  fs/internal.h  |  1 +
  fs/namespace.c |  9 +
  3 files changed, 45 insertions(+)
 
 diff --git a/fs/dcache.c b/fs/dcache.c
 index 020004d..1333445 100644
 --- a/fs/dcache.c
 +++ b/fs/dcache.c
 @@ -1103,6 +1103,41 @@ rename_retry:
  }
  EXPORT_SYMBOL(have_submounts);
  
 +static bool __has_unlinked_ancestor(struct dentry *dentry)
 +{
 + struct dentry *this;
 +
 + for (this = dentry; !IS_ROOT(this); this = this-d_parent) {
 + int is_unhashed;
 +
 + /* Need exclusion wrt. check_submounts_and_drop() */
 + spin_lock(this-d_lock);
 + is_unhashed = d_unhashed(this);
 + spin_unlock(this-d_lock);
 +
 + if (is_unhashed)
 + return true;
 + }
 + return false;
 +}
 +
 +/*
 + * Called by mount code to check if the mountpoint is reachable (e.g. NFS can
 + * unhash a directory dentry and then the complete subtree can become
 + * unreachable).
 + */
 +bool has_unlinked_ancestor(struct dentry *dentry)
 +{
 + bool found;
 +
 + /* Need exclusion wrt. check_submounts_and_drop() */
 + write_seqlock(rename_lock);
 + found = __has_unlinked_ancestor(dentry);
 + write_sequnlock(rename_lock);
 +
 + return found;
 +}
 +
  /*
   * Search the dentry child list of the specified parent,
   * and move any unused dentries to the end of the unused
 diff --git a/fs/internal.h b/fs/internal.h
 index 7c5f01c..d232355 100644
 --- a/fs/internal.h
 +++ b/fs/internal.h
 @@ -126,6 +126,7 @@ extern int invalidate_inodes(struct super_block *, bool);
   * dcache.c
   */
  extern struct dentry *__d_alloc(struct super_block *, const struct qstr *);
 +extern bool has_unlinked_ancestor(struct dentry *dentry);
  
  /*
   * read_write.c
 diff --git a/fs/namespace.c b/fs/namespace.c
 index 7b1ca9b..bb92a9c 100644
 --- a/fs/namespace.c
 +++ b/fs/namespace.c
 @@ -634,6 +634,15 @@ static struct mountpoint *new_mountpoint(struct dentry 
 *dentry)
   }
   dentry-d_flags |= DCACHE_MOUNTED;
   spin_unlock(dentry-d_lock);
 +
 + if (has_unlinked_ancestor(dentry)) {
 + spin_lock(dentry-d_lock);
 + dentry-d_flags = ~DCACHE_MOUNTED;
 + spin_unlock(dentry-d_lock);
 + kfree(mp);
 + return ERR_PTR(-ENOENT);
 + }
 +
   mp-m_dentry = dentry;
   mp-m_count = 1;
   list_add(mp-m_hash, chain);


Looks reasonable. For posterity, it might be worth mentioning the
potential regression that you described earlier (racing with rename
from another host). That way if someone hits it in the future they
might be able to zero in on this change as the culprit more easily.

Acked-by: Jeff Layton jlay...@redhat.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 5/9] nfs: use check_submounts_and_drop()

2013-08-12 Thread Jeff Layton
On Thu,  8 Aug 2013 17:24:46 +0200
Miklos Szeredi mik...@szeredi.hu wrote:

 From: Miklos Szeredi mszer...@suse.cz
 
 Do have_submounts(), shrink_dcache_parent() and d_drop() atomically.
 
 check_submounts_and_drop() can deal with negative dentries and
 non-directories as well.
 
 Non-directories can also be mounted on.  And just like directories we don't
 want these to disappear with invalidation.
 
 Signed-off-by: Miklos Szeredi mszer...@suse.cz
 CC: Trond Myklebust trond.mykleb...@netapp.com
 ---
  fs/nfs/dir.c | 9 -
  1 file changed, 4 insertions(+), 5 deletions(-)
 
 diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
 index e474ca2b..7468735 100644
 --- a/fs/nfs/dir.c
 +++ b/fs/nfs/dir.c
 @@ -1135,14 +1135,13 @@ out_zap_parent:
   if (inode  S_ISDIR(inode-i_mode)) {
   /* Purge readdir caches. */
   nfs_zap_caches(inode);
 - /* If we have submounts, don't unhash ! */
 - if (have_submounts(dentry))
 - goto out_valid;
   if (dentry-d_flags  DCACHE_DISCONNECTED)
   goto out_valid;
 - shrink_dcache_parent(dentry);
   }
 - d_drop(dentry);
 + /* If we have submounts, don't unhash ! */
 + if (check_submounts_and_drop(dentry) != 0)
 + goto out_valid;
 +
   dput(parent);
   dfprintk(LOOKUPCACHE, NFS: %s(%s/%s) is invalid\n,
   __func__, dentry-d_parent-d_name.name,

Acked-by: Jeff Layton jlay...@redhat.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Thoughts on credential switching

2014-03-26 Thread Jeff Layton
On Wed, 26 Mar 2014 17:23:24 -0700
Andy Lutomirski l...@amacapital.net wrote:

 Hi various people who care about user-space NFS servers and/or
 security-relevant APIs.
 
 I propose the following set of new syscalls:
 
 int credfd_create(unsigned int flags): returns a new credfd that
 corresponds to current's creds.
 
 int credfd_activate(int fd, unsigned int flags): Change current's
 creds to match the creds stored in fd.  To be clear, this changes both
 the subjective and objective (aka real_cred and cred) because
 there aren't any real semantics for what happens when userspace code
 runs with real_cred != cred.
 
 Rules:
 
  - credfd_activate fails (-EINVAL) if fd is not a credfd.
  - credfd_activate fails (-EPERM) if the fd's userns doesn't match
 current's userns.  credfd_activate is not intended to be a substitute
 for setns.
  - credfd_activate will fail (-EPERM) if LSM does not allow the
 switch.  This probably needs to be a new selinux action --
 dyntransition is too restrictive.
 
 
 Optional:
  - credfd_create always sets cloexec, because the alternative is
 silly.
  - credfd_activate fails (-EINVAL) if dumpable.  This is because we
 don't want a privileged daemon to be ptraced while impersonating
 someone else.
  - optional: both credfd_create and credfd_activate fail if
 !ns_capable(CAP_SYS_ADMIN) or perhaps !capable(CAP_SETUID).
 
 The first question: does this solve Ganesha's problem?
 
 The second question: is this safe?  I can see two major concerns.  The
 bigger concern is that having these syscalls available will allow
 users to exploit things that were previously secure.  For example,
 maybe some configuration assumes that a task running as uid==1 can't
 switch to uid==2, even with uid 2's consent.  Similar issues happen
 with capabilities.  If CAP_SYS_ADMIN is not required, then this is no
 longer really true.
 
 Alternatively, something running as uid == 0 with heavy capability
 restrictions in a mount namespace (but not a uid namespace) could pass
 a credfd out of the namespace.  This could break things like Docker
 pretty badly.  CAP_SYS_ADMIN guards against this to some extent.  But
 I think that Docker is already totally screwed if a Docker root task
 can receive an O_DIRECTORY or O_PATH fd out of the container, so it's
 not entirely clear that the situation is any worse, even without
 requiring CAP_SYS_ADMIN.
 
 The second concern is that it may be difficult to use this correctly.
 There's a reason that real_cred and cred exist, but it's not really
 well set up for being used.
 
 As a simple way to stay safe, Ganesha could only use credfds that have
 real_uid == 0.
 
 --Andy


I still don't quite grok why having this special credfd_create call
buys you anything over simply doing what Al had originally suggested --
switch creds using all of the different syscalls and then simply caching
that in a normal fd:

fd = open(/dev/null, O_PATH...);

...it seems to me that the credfd_activate call will still need to do
the same permission checking that all of the individual set*id() calls
require (and all of the other stuff like changing selinux contexts,
etc).

IOW, this fd is just a handle for passing around a struct cred, but I
don't see why having access to that handle would allow you to do
something you couldn't already do anyway.

Am I missing something obvious here?

-- 
Jeff Layton jlay...@redhat.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Thoughts on credential switching

2014-03-26 Thread Jeff Layton
On Wed, 26 Mar 2014 20:05:16 -0700
Andy Lutomirski l...@amacapital.net wrote:

 On Wed, Mar 26, 2014 at 7:48 PM, Jeff Layton jlay...@redhat.com
 wrote:
  On Wed, 26 Mar 2014 17:23:24 -0700
  Andy Lutomirski l...@amacapital.net wrote:
 
  Hi various people who care about user-space NFS servers and/or
  security-relevant APIs.
 
  I propose the following set of new syscalls:
 
  int credfd_create(unsigned int flags): returns a new credfd that
  corresponds to current's creds.
 
  int credfd_activate(int fd, unsigned int flags): Change current's
  creds to match the creds stored in fd.  To be clear, this changes
  both the subjective and objective (aka real_cred and cred)
  because there aren't any real semantics for what happens when
  userspace code runs with real_cred != cred.
 
  Rules:
 
   - credfd_activate fails (-EINVAL) if fd is not a credfd.
   - credfd_activate fails (-EPERM) if the fd's userns doesn't match
  current's userns.  credfd_activate is not intended to be a
  substitute for setns.
   - credfd_activate will fail (-EPERM) if LSM does not allow the
  switch.  This probably needs to be a new selinux action --
  dyntransition is too restrictive.
 
 
  Optional:
   - credfd_create always sets cloexec, because the alternative is
  silly.
   - credfd_activate fails (-EINVAL) if dumpable.  This is because we
  don't want a privileged daemon to be ptraced while impersonating
  someone else.
   - optional: both credfd_create and credfd_activate fail if
  !ns_capable(CAP_SYS_ADMIN) or perhaps !capable(CAP_SETUID).
 
  The first question: does this solve Ganesha's problem?
 
  The second question: is this safe?  I can see two major concerns.
  The bigger concern is that having these syscalls available will
  allow users to exploit things that were previously secure.  For
  example, maybe some configuration assumes that a task running as
  uid==1 can't switch to uid==2, even with uid 2's consent.  Similar
  issues happen with capabilities.  If CAP_SYS_ADMIN is not
  required, then this is no longer really true.
 
  Alternatively, something running as uid == 0 with heavy capability
  restrictions in a mount namespace (but not a uid namespace) could
  pass a credfd out of the namespace.  This could break things like
  Docker pretty badly.  CAP_SYS_ADMIN guards against this to some
  extent.  But I think that Docker is already totally screwed if a
  Docker root task can receive an O_DIRECTORY or O_PATH fd out of
  the container, so it's not entirely clear that the situation is
  any worse, even without requiring CAP_SYS_ADMIN.
 
  The second concern is that it may be difficult to use this
  correctly. There's a reason that real_cred and cred exist, but
  it's not really well set up for being used.
 
  As a simple way to stay safe, Ganesha could only use credfds that
  have real_uid == 0.
 
  --Andy
 
 
  I still don't quite grok why having this special credfd_create call
  buys you anything over simply doing what Al had originally
  suggested -- switch creds using all of the different syscalls and
  then simply caching that in a normal fd:
 
  fd = open(/dev/null, O_PATH...);
 
  ...it seems to me that the credfd_activate call will still need to
  do the same permission checking that all of the individual set*id()
  calls require (and all of the other stuff like changing selinux
  contexts, etc).
 
  IOW, this fd is just a handle for passing around a struct cred,
  but I don't see why having access to that handle would allow you to
  do something you couldn't already do anyway.
 
  Am I missing something obvious here?
 
 Not really.  I think I didn't adequately explain a piece of this.
 
 I think that what you're suggesting is for an fd to encode a set of
 credentials but not to grant permission to use those credentials.  So
 switch_creds(fd) is more or less the same thing as switch_creds(ruid,
 euid, suid, rgid, egid, sgid, groups, mac label, ...).  switch_creds
 needs to verify that the caller can dyntransition to the label, set
 all the ids, etc., but it avoids allocating anything and running RCU
 callbacks.
 
 The trouble with this is that the verification needed is complicated
 and expensive.  And I think that my proposal is potentially more
 useful.
 

Is it really though? My understanding of the problem was that it was
the syscall (context switching) overhead + having to do a bunch of RCU
critical stuff that was the problem. If we can do all of this in the
context of a single RCU critical section, isn't that still a win?

As to the complicated part...maybe but it doesn't seem like it would
have to be. We could simply return -EINVAL or something if the old
struct cred doesn't have fields that match the ones we're replacing and
that we don't expect to see changed.

 A credfd is like a struct cred, but possession of a credfd carries the
 permission to use those credentials.  So, for example, credfd_activate
 to switch to a given uid might work even if setresuid to that uid
 would be disallowed

Re: Thoughts on credential switching

2014-03-27 Thread Jeff Layton
On Thu, 27 Mar 2014 13:46:06 +0100
Florian Weimer fwei...@redhat.com wrote:

 On 03/27/2014 01:23 AM, Andy Lutomirski wrote:
 
  I propose the following set of new syscalls:
 
  int credfd_create(unsigned int flags): returns a new credfd that
  corresponds to current's creds.
 
  int credfd_activate(int fd, unsigned int flags): Change current's
  creds to match the creds stored in fd.  To be clear, this changes
  both the subjective and objective (aka real_cred and cred)
  because there aren't any real semantics for what happens when
  userspace code runs with real_cred != cred.
 
 This interface does not address the long-term lack of POSIX
 compliance in setuid and friends, which are required to be
 process-global and not thread-specific (as they are on the kernel
 side).
 
 glibc works around this by reserving a signal and running set*id on 
 every thread in a special signal handler.  This is just crass, and it
 is likely impossible to restore the original process state in case of 
 partial failure.  We really need kernel support to perform the 
 process-wide switch in an all-or-nothing manner.
 

I disagree. We're treading new ground here with this syscall. It's
not defined by POSIX so we're under no obligation to follow its silly
directives in this regard. Per-process cred switching doesn't really
make much sense in this day and age, IMO. Wasn't part of the spec was
written before threading existed


The per-process credential switching is pretty universally a pain in
the ass for anyone who wants to write something like a threaded file
server. Jeremy Allison had to jump through some rather major hoops to
work around it for Samba [1]. I think we want to strive to make this a
per-task credential switch and ignore that part of the POSIX spec.

That said, I think we will need to understand and document what we
expect to occur if someone does this new switch_creds(fd) call and then
subsequently calls something like setuid(), if only to ensure that we
don't get blindsided by it.

[1]: http://sourceware.org/ml/libc-help/2012-07/msg4.html
-- 
Jeff Layton jlay...@redhat.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Thoughts on credential switching

2014-03-27 Thread Jeff Layton
On Thu, 27 Mar 2014 14:06:32 +0100
Florian Weimer fwei...@redhat.com wrote:

 On 03/27/2014 02:02 PM, Jeff Layton wrote:
 
  This interface does not address the long-term lack of POSIX
  compliance in setuid and friends, which are required to be
  process-global and not thread-specific (as they are on the kernel
  side).
 
  glibc works around this by reserving a signal and running set*id on
  every thread in a special signal handler.  This is just crass, and
  it is likely impossible to restore the original process state in
  case of partial failure.  We really need kernel support to perform
  the process-wide switch in an all-or-nothing manner.
 
 
  I disagree. We're treading new ground here with this syscall. It's
  not defined by POSIX so we're under no obligation to follow its
  silly directives in this regard. Per-process cred switching doesn't
  really make much sense in this day and age, IMO. Wasn't part of the
  spec was written before threading existed
 
 Okay, then we need to add a separate set of system calls.
 
 I really, really want to get rid of that signal handler mess in
 glibc, with its partial failures.
 

I agree, it's a hack, but hardly anyone these days really wants to
switch creds on a per-process basis. It's just that we're saddled with
a spec for those calls that was written before threads really existed.

The kernel syscalls already do the right thing as far as I'm concerned.
What would be nice however is a blessed glibc interface to them
that didn't involve all of the signal handling stuff. Then samba et. al.
wouldn't need to call syscall() directly to get at them.

  The per-process credential switching is pretty universally a pain in
  the ass for anyone who wants to write something like a threaded file
  server. Jeremy Allison had to jump through some rather major hoops
  to work around it for Samba [1]. I think we want to strive to make
  this a per-task credential switch and ignore that part of the POSIX
  spec.
 
 Yeah, I get that, setfsuid/setfsgid already behaves this way.
 
 (Current directory and umask are equally problematic, but it's
 possible to avoid most issues.)
 
  That said, I think we will need to understand and document what we
  expect to occur if someone does this new switch_creds(fd) call and
  then subsequently calls something like setuid(), if only to ensure
  that we don't get blindsided by it.
 
 Currently, from the kernel perspective, this is not really a problem 
 because the credentials are always per-task.  It's just that a 
 conforming user space needs the process-wide credentials.
 

-- 
Jeff Layton jlay...@redhat.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Thoughts on credential switching

2014-03-27 Thread Jeff Layton
On Wed, 26 Mar 2014 20:25:35 -0700
Jeff Layton jlay...@redhat.com wrote:

 On Wed, 26 Mar 2014 20:05:16 -0700
 Andy Lutomirski l...@amacapital.net wrote:
 
  On Wed, Mar 26, 2014 at 7:48 PM, Jeff Layton jlay...@redhat.com
  wrote:
   On Wed, 26 Mar 2014 17:23:24 -0700
   Andy Lutomirski l...@amacapital.net wrote:
  
   Hi various people who care about user-space NFS servers and/or
   security-relevant APIs.
  
   I propose the following set of new syscalls:
  
   int credfd_create(unsigned int flags): returns a new credfd that
   corresponds to current's creds.
  
   int credfd_activate(int fd, unsigned int flags): Change current's
   creds to match the creds stored in fd.  To be clear, this changes
   both the subjective and objective (aka real_cred and cred)
   because there aren't any real semantics for what happens when
   userspace code runs with real_cred != cred.
  
   Rules:
  
- credfd_activate fails (-EINVAL) if fd is not a credfd.
- credfd_activate fails (-EPERM) if the fd's userns doesn't
   match current's userns.  credfd_activate is not intended to be a
   substitute for setns.
- credfd_activate will fail (-EPERM) if LSM does not allow the
   switch.  This probably needs to be a new selinux action --
   dyntransition is too restrictive.
  
  
   Optional:
- credfd_create always sets cloexec, because the alternative is
   silly.
- credfd_activate fails (-EINVAL) if dumpable.  This is because
   we don't want a privileged daemon to be ptraced while
   impersonating someone else.
- optional: both credfd_create and credfd_activate fail if
   !ns_capable(CAP_SYS_ADMIN) or perhaps !capable(CAP_SETUID).
  
   The first question: does this solve Ganesha's problem?
  
   The second question: is this safe?  I can see two major concerns.
   The bigger concern is that having these syscalls available will
   allow users to exploit things that were previously secure.  For
   example, maybe some configuration assumes that a task running as
   uid==1 can't switch to uid==2, even with uid 2's consent.
   Similar issues happen with capabilities.  If CAP_SYS_ADMIN is not
   required, then this is no longer really true.
  
   Alternatively, something running as uid == 0 with heavy
   capability restrictions in a mount namespace (but not a uid
   namespace) could pass a credfd out of the namespace.  This could
   break things like Docker pretty badly.  CAP_SYS_ADMIN guards
   against this to some extent.  But I think that Docker is already
   totally screwed if a Docker root task can receive an O_DIRECTORY
   or O_PATH fd out of the container, so it's not entirely clear
   that the situation is any worse, even without requiring
   CAP_SYS_ADMIN.
  
   The second concern is that it may be difficult to use this
   correctly. There's a reason that real_cred and cred exist, but
   it's not really well set up for being used.
  
   As a simple way to stay safe, Ganesha could only use credfds that
   have real_uid == 0.
  
   --Andy
  
  
   I still don't quite grok why having this special credfd_create
   call buys you anything over simply doing what Al had originally
   suggested -- switch creds using all of the different syscalls and
   then simply caching that in a normal fd:
  
   fd = open(/dev/null, O_PATH...);
  
   ...it seems to me that the credfd_activate call will still need to
   do the same permission checking that all of the individual
   set*id() calls require (and all of the other stuff like changing
   selinux contexts, etc).
  
   IOW, this fd is just a handle for passing around a struct cred,
   but I don't see why having access to that handle would allow you
   to do something you couldn't already do anyway.
  
   Am I missing something obvious here?
  
  Not really.  I think I didn't adequately explain a piece of this.
  
  I think that what you're suggesting is for an fd to encode a set of
  credentials but not to grant permission to use those credentials.
  So switch_creds(fd) is more or less the same thing as
  switch_creds(ruid, euid, suid, rgid, egid, sgid, groups, mac
  label, ...).  switch_creds needs to verify that the caller can
  dyntransition to the label, set all the ids, etc., but it avoids
  allocating anything and running RCU callbacks.
  
  The trouble with this is that the verification needed is complicated
  and expensive.  And I think that my proposal is potentially more
  useful.
  
 
 Is it really though? My understanding of the problem was that it was
 the syscall (context switching) overhead + having to do a bunch of RCU
 critical stuff that was the problem. If we can do all of this in the
 context of a single RCU critical section, isn't that still a win?
 
 As to the complicated part...maybe but it doesn't seem like it would
 have to be. We could simply return -EINVAL or something if the old
 struct cred doesn't have fields that match the ones we're replacing
 and that we don't expect to see changed.
 
  A credfd is like a struct cred

Re: [PATCH] locks: rename file-private locks to file-description locks

2014-04-28 Thread Jeff Layton
On Sun, 27 Apr 2014 14:51:25 +1000
NeilBrown ne...@suse.de wrote:

 On Tue, 22 Apr 2014 06:54:36 +0200 Michael Kerrisk (man-pages)
 mtk.manpa...@gmail.com wrote:
 
  On 04/21/2014 11:15 PM, Stefan (metze) Metzmacher wrote:
   Am 21.04.2014 21:55, schrieb Jeff Layton:
   On Mon, 21 Apr 2014 21:39:12 +0200
   Michael Kerrisk (man-pages) mtk.manpa...@gmail.com wrote:
  
   On 04/21/2014 08:46 PM, Rich Felker wrote:
   On Mon, Apr 21, 2014 at 08:32:44PM +0200, Michael Kerrisk (man-pages) 
   wrote:
   On 04/21/2014 06:10 PM, Rich Felker wrote:
   I'm well aware of that. The problem is that the proposed API is using
   the two-letter abbreviation FD, which ALWAYS means file descriptor 
   and
   NEVER means file description (in existing usage) to mean file
   description. That's what's wrong.
  
   So, can you *please* answer this question: what do you call (i.e., 
   what  everyday technical language term do use for) the thing
   that sits between a file descriptor and an i-node? 
  
   (Please don't say 'struct file' -- that is not is an implementation 
   detail, and does not qualify as the kind of term that I could use 
   when documenting this feature in man pages.)
  
   Open file description.
  
   Oh! I didn't realize we agreed :-).
  
   POSIX uses (or invented, I am not sure which) the term file 
   description
   for a good reason: it is unambiguous, and therefore precise. I do 
   agree
   that there's a risk of confusion between 'open file descriptor and 
   'and file description'--it's the same kind of risk as between English 
   terms such as 'arbitrator' and 'arbitration' (and any number of other
   examples), and as language speakers we deal with this every day.
  
   There's not a problem when the full word is used. On the other hand,
   if you use arb as an abbreviation for arbitration in a context
   where it was already universally understood as meaning arbitrator,
   that would be a big problem.
  
   Likewise the problem here isn't that open file description is a bad
   term. It's that using FD to mean [open] file description is
   utterly confusing, even moreso than just making up a new completely
   random word.
  
   Ohh -- I had thought you a problem not just with FD but also
   (open) file description.
  
   2) The new API constants (F_SETLKP, F_SETLKPW, F_GETLKP) have names
  that are visually very close to the traditional POSIX lock names 
  (F_SETLK, F_SETLKW, F_GETLK). That's an accident waiting to 
   happen
  when someone mistypes in code and/or misses such a misttyping
  when reading code. That really must be fixed.
  
   I agree, but I don't think making it worse is a solution.
  
   I don't agree that it's making it worse. The real problem here is 
   that people use no good unambiguous term for the thing between a file
   descriptor and an inode. POSIX provides us with a solution that may
   not seem perfect, but it is unambiguous, and I think it might feel
   more comfortable if we used it often enough.
  
   I would like to see it used more too, and in particular, I think it
   belongs in the documentation for these new locking interfaces. But
   that still doesn't answer the question of what to call them (the
   macros) unless you want:
  
   F_OPEN_FILE_DESCRIPTION_GETLK
   F_OPEN_FILE_DESCRIPTION_SETLK
   F_OPEN_FILE_DESCRIPTION_SETLKW
  
   Or just 'F_OFD_*'?
  
   Perhaps OP (for open-private, i.e. private to the particular open)
   would be a sensible choice; OTOH people are likely to misread it as
   OPeration. The general principle I have in mind though is that it
   might be nice to highlight the word open in open file description
  
   (Fair enough.)
  
   since it (1) contrasts with file descriptor, despite file descriptors
   also dealing with open files, and (2) contrasts well with legacy fcntl
   locks, which are (this is the whole bug) associated with the
   underlying file and not the open file description.
  
   Makes sense to me. (We are in more agreement that I realized.)
  
   Cheers,
  
   Michael
  
  
  
  
   So the motion is to call them open file description locks and change
   the macros to read *_OFD_*. Does anyone object?
   
   Works fine for me...
  
  And works for me.
  
  
 
 I think the word open is important here.
 I find that description is not a word I would have every thought was
 relevant here - it is obviously too long since I have read the man pages.
 
 I would prefer
   per-open locks
 which are contrasted with
   per-process locks
 
 An alternative might be flock-like as locks created with flock have
 exactly the property we are trying to describe.  Reading the man page for
 flock then suggests open file table entry locks which is even more of a
 mouthful.  oftel is pronounceable though.  Then we could talk about oftel
 locks in the same sentence as pin numbers and RAM memory.
 
 But maybe I came too late to this party, and the boat has sailed?
 
 Note to Michael: The text
flock() does not lock files over

Re: [PATCH -V1 05/22] vfs: Add new file and directory create permission flags

2014-04-28 Thread Jeff Layton
On Sun, 27 Apr 2014 21:44:36 +0530
Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com wrote:

 From: Andreas Gruenbacher agr...@kernel.org
 
 Some permission models distinguish between the permission to create a
 non-directory and a directory.  Pass this information down to
 inode_permission() as mask flags
 
 Signed-off-by: Andreas Gruenbacher agr...@kernel.org
 Signed-off-by: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com
 ---
  fs/namei.c | 21 -
  include/linux/fs.h |  2 ++
  2 files changed, 14 insertions(+), 9 deletions(-)
 
 diff --git a/fs/namei.c b/fs/namei.c
 index 7d93d195f0e5..028bc8bcf77c 100644
 --- a/fs/namei.c
 +++ b/fs/namei.c
 @@ -445,7 +445,8 @@ static int sb_permission(struct super_block *sb, struct 
 inode *inode, int mask)
   * this, letting us set arbitrary permissions for filesystem access without
   * changing the normal UIDs which are used for other things.
   *
 - * When checking for MAY_APPEND, MAY_WRITE must also be set in @mask.
 + * When checking for MAY_APPEND, MAY_CREATE_FILE, MAY_CREATE_DIR,
 + * MAY_WRITE must also be set in @mask.
   */
  int inode_permission(struct inode *inode, int mask)
  {
 @@ -2444,14 +2445,16 @@ static int may_delete(struct inode *dir, struct 
 dentry *victim, bool isdir)
   *  3. We should have write and exec permissions on dir
   *  4. We can't do it if dir is immutable (done in permission())
   */
 -static inline int may_create(struct inode *dir, struct dentry *child)
 +static inline int may_create(struct inode *dir, struct dentry *child, bool 
 isdir)
  {
 + int mask = isdir ? MAY_CREATE_DIR : MAY_CREATE_FILE;
 +
   audit_inode_child(dir, child, AUDIT_TYPE_CHILD_CREATE);
   if (child-d_inode)
   return -EEXIST;
   if (IS_DEADDIR(dir))
   return -ENOENT;
 - return inode_permission(dir, MAY_WRITE | MAY_EXEC);
 + return inode_permission(dir, MAY_WRITE | MAY_EXEC | mask);
  }
  
  /*
 @@ -2501,7 +2504,7 @@ EXPORT_SYMBOL(unlock_rename);
  int vfs_create(struct inode *dir, struct dentry *dentry, umode_t mode,
   bool want_excl)
  {
 - int error = may_create(dir, dentry);
 + int error = may_create(dir, dentry, 0);

nit: since the third argument here is a bool, this should be false
and not 0. Ditto for all of the other calls of may_create.

   if (error)
   return error;
  
 @@ -3363,7 +3366,7 @@ EXPORT_SYMBOL(user_path_create);
  
  int vfs_mknod(struct inode *dir, struct dentry *dentry, umode_t mode, dev_t 
 dev)
  {
 - int error = may_create(dir, dentry);
 + int error = may_create(dir, dentry, 0);
  
   if (error)
   return error;
 @@ -3455,7 +3458,7 @@ SYSCALL_DEFINE3(mknod, const char __user *, filename, 
 umode_t, mode, unsigned, d
  
  int vfs_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode)
  {
 - int error = may_create(dir, dentry);
 + int error = may_create(dir, dentry, 1);
   unsigned max_links = dir-i_sb-s_max_links;
  
   if (error)
 @@ -3785,7 +3788,7 @@ SYSCALL_DEFINE1(unlink, const char __user *, pathname)
  
  int vfs_symlink(struct inode *dir, struct dentry *dentry, const char 
 *oldname)
  {
 - int error = may_create(dir, dentry);
 + int error = may_create(dir, dentry, 0);
  
   if (error)
   return error;
 @@ -3871,7 +3874,7 @@ int vfs_link(struct dentry *old_dentry, struct inode 
 *dir, struct dentry *new_de
   if (S_ISDIR(inode-i_mode))
   return -EPERM;
  
 - error = may_create(dir, new_dentry);
 + error = may_create(dir, new_dentry, 0);
   if (error)
   return error;
  
 @@ -4062,7 +4065,7 @@ int vfs_rename(struct inode *old_dir, struct dentry 
 *old_dentry,
   return error;
  
   if (!target) {
 - error = may_create(new_dir, new_dentry);
 + error = may_create(new_dir, new_dentry, is_dir);
   } else {
   new_is_dir = d_is_dir(new_dentry);
  
 diff --git a/include/linux/fs.h b/include/linux/fs.h
 index 1f524b4b652d..da5521de04ab 100644
 --- a/include/linux/fs.h
 +++ b/include/linux/fs.h
 @@ -77,6 +77,8 @@ typedef void (dio_iodone_t)(struct kiocb *iocb, loff_t 
 offset,
  #define MAY_CHDIR0x0040
  /* called from RCU mode, don't block */
  #define MAY_NOT_BLOCK0x0080
 +#define MAY_CREATE_FILE  0x0100
 +#define MAY_CREATE_DIR   0x0200
  
  /*
   * flags in file.f_mode.  Note that FMODE_READ and FMODE_WRITE must 
 correspond


-- 
Jeff Layton jlay...@poochiereds.net
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] locks: ensure that fl_owner is always initialized properly in flock and lease codepaths

2014-04-28 Thread Jeff Layton
Currently, the fl_owner isn't set for flock locks. Some filesystems use
byte-range locks to simulate flock locks and there is a common idiom in
those that does:

fl-fl_owner = (fl_owner_t)filp;
fl-fl_start = 0;
fl-fl_end = OFFSET_MAX;

Since flock locks are generally owned by the open file description,
move this into the common flock lock setup code. The fl_start and fl_end
fields are already set appropriately, so remove the unneeded setting of
that in flock ops in those filesystems as well.

Finally, the lease code also sets the fl_owner as if they were owned by
the process and not the open file description. This is incorrect as
leases have the same ownership semantics as flock locks. Set them the
same way. The lease code doesn't actually use the fl_owner value for
anything, so this is more for consistency's sake than a bugfix.

Reported-by: Trond Myklebust trond.mykleb...@primarydata.com
Signed-off-by: Jeff Layton jlay...@poochiereds.net
---
 drivers/staging/lustre/lustre/llite/file.c | 17 ++---
 fs/9p/vfs_file.c   |  3 ---
 fs/afs/flock.c |  4 
 fs/ceph/locks.c| 10 ++
 fs/fuse/file.c |  1 -
 fs/locks.c |  4 +++-
 fs/nfs/file.c  |  4 
 7 files changed, 11 insertions(+), 32 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/file.c 
b/drivers/staging/lustre/lustre/llite/file.c
index 8e844a6371e0..760ccd83f1f7 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -2691,20 +2691,15 @@ int ll_file_flock(struct file *file, int cmd, struct 
file_lock *file_lock)
 
ll_stats_ops_tally(ll_i2sbi(inode), LPROC_LL_FLOCK, 1);
 
-   if (file_lock-fl_flags  FL_FLOCK) {
+   if (file_lock-fl_flags  FL_FLOCK)
LASSERT((cmd == F_SETLKW) || (cmd == F_SETLK));
-   /* flocks are whole-file locks */
-   flock.l_flock.end = OFFSET_MAX;
-   /* For flocks owner is determined by the local file descriptor*/
-   flock.l_flock.owner = (unsigned long)file_lock-fl_file;
-   } else if (file_lock-fl_flags  FL_POSIX) {
-   flock.l_flock.owner = (unsigned long)file_lock-fl_owner;
-   flock.l_flock.start = file_lock-fl_start;
-   flock.l_flock.end = file_lock-fl_end;
-   } else {
+   else if (!(file_lock-fl_flags  FL_POSIX))
return -EINVAL;
-   }
+
+   flock.l_flock.owner = (unsigned long)file_lock-fl_owner;
flock.l_flock.pid = file_lock-fl_pid;
+   flock.l_flock.start = file_lock-fl_start;
+   flock.l_flock.end = file_lock-fl_end;
 
/* Somewhat ugly workaround for svc lockd.
 * lockd installs custom fl_lmops-lm_compare_owner that checks
diff --git a/fs/9p/vfs_file.c b/fs/9p/vfs_file.c
index d8223209d4b1..59e3fe3d56c0 100644
--- a/fs/9p/vfs_file.c
+++ b/fs/9p/vfs_file.c
@@ -352,9 +352,6 @@ static int v9fs_file_flock_dotl(struct file *filp, int cmd,
invalidate_mapping_pages(inode-i_data, 0, -1);
}
/* Convert flock to posix lock */
-   fl-fl_owner = (fl_owner_t)filp;
-   fl-fl_start = 0;
-   fl-fl_end = OFFSET_MAX;
fl-fl_flags |= FL_POSIX;
fl-fl_flags ^= FL_FLOCK;
 
diff --git a/fs/afs/flock.c b/fs/afs/flock.c
index a8cf2cff836c..4baf1d2b39e4 100644
--- a/fs/afs/flock.c
+++ b/fs/afs/flock.c
@@ -555,10 +555,6 @@ int afs_flock(struct file *file, int cmd, struct file_lock 
*fl)
return -ENOLCK;
 
/* we're simulating flock() locks using posix locks on the server */
-   fl-fl_owner = (fl_owner_t) file;
-   fl-fl_start = 0;
-   fl-fl_end = OFFSET_MAX;
-
if (fl-fl_type == F_UNLCK)
return afs_do_unlk(file, fl);
return afs_do_setlk(file, fl);
diff --git a/fs/ceph/locks.c b/fs/ceph/locks.c
index d94ba0df9f4d..db8c1ca15d72 100644
--- a/fs/ceph/locks.c
+++ b/fs/ceph/locks.c
@@ -52,10 +52,7 @@ static int ceph_lock_message(u8 lock_type, u16 operation, 
struct file *file,
else
length = fl-fl_end - fl-fl_start + 1;
 
-   if (lock_type == CEPH_LOCK_FCNTL)
-   owner = secure_addr(fl-fl_owner);
-   else
-   owner = secure_addr(fl-fl_file);
+   owner = secure_addr(fl-fl_owner);
 
dout(ceph_lock_message: rule: %d, op: %d, owner: %llx, pid: %llu, 
 start: %llu, length: %llu, wait: %d, type: %d, (int)lock_type,
@@ -313,10 +310,7 @@ int lock_to_ceph_filelock(struct file_lock *lock,
cephlock-length = cpu_to_le64(lock-fl_end - lock-fl_start + 1);
cephlock-client = cpu_to_le64(0);
cephlock-pid = cpu_to_le64((u64)lock-fl_pid);
-   if (lock-fl_flags  FL_POSIX)
-   cephlock-owner = cpu_to_le64(secure_addr(lock-fl_owner));
-   else
-   cephlock-owner = cpu_to_le64

Re: OFD (file private) locks and NFS

2014-04-29 Thread Jeff Layton
On Tue, 29 Apr 2014 10:47:07 +0200
Michael Kerrisk (man-pages) mtk.manpa...@gmail.com wrote:

 [CC+= linux-nfs@]
 
 On 04/29/2014 10:38 AM, Michael Kerrisk (man-pages) wrote:
  Hi Jeff,
  
  I've been looking a bit at the fcntl() documentation of traditional 
  (F_SETLK) record locking, and a question just jumped out at me. Is 
  it worth considering some future-proofing in the design of OFD locks
  (open file description locks, formerly known as file-private locks)?
  
  What I am thinking of here is that on some systems, the traditional
  'struct flock' has a nonstandard field, l_sysid, that is used on F_GETLK 
  to identify the remote system on which a lock is held. Should the design
  of OFD locks allow for such a field (now, or in the future), which might 
  be useful in the context of locking on network file systems such as NFS.
  
  Put more simply, should the new OFD locking system be using a new
  structure for describing locks, rather than the traditional 'struct
  flock'? Defining a new structure, might be useful to allow for
  future extensions to the API.
 
 Just add one further detail here. What I'm thinking is, maybe instead there
 should be something like:
 
 struct flockx {
 int flags;
 /* Other fields like 'struct flock' */
 char reserved[32];/* Or some suitable value */
 }
 
 That flags field might always be zero for now, but in the future it 
 could be used on the setlk and getlk operations to indicate the presence
 of additional fields in the structure.
 
 Cheers,
 
 Michael
 

I considered that early on when I did this, but I don't think it's
really helpful. I'm just not a fan of padding out structs when it's not
clear what would eventually go in there.

The problem is that once actually go to try to convert those from
reserved to something usable, it becomes a nightmare for userland to
figure that out. How do I know whether my kernel supports the stuff I
put in those fields or will just ignore them?

If we really find later that we need to do something like this, I think
we'd be better off adding a new set of cmd values along with the
extended struct, or possibly a new syscall. Some of the samba folks
were interested in an async locking mechanism too, so something like
that could be added in conjunction with such an interface.

-- 
Jeff Layton jlay...@poochiereds.net
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: flock() and NFS [Was: Re: [PATCH] locks: rename file-private locks to file-description locks]

2014-04-29 Thread Jeff Layton
is. There is a relationship between the two that's illustrated in the
comments above write_gracetime:

/**
 * write_gracetime - Set or report current NFSv4 grace period time
 *
 * As above, but sets the time of the NFSv4 grace period.
 *
 * Note this should never be set to less than the *previous*
 * lease-period time, but we don't try to enforce this.  (In the common
 * case (a new boot), we don't know what the previous lease time was
 * anyway.)
 */

The value you're interested in here is the nfsv4leasetime. If the
client doesn't renew its lease within that period, then it's subject to
the server giving up on it and dropping any state that it holds on that
clients' behalf.

Note that this is not a firm timeout. The server runs a job
periodically to clean out expired stateful objects, and it's likely
that there is some time (maybe even up to another whole lease period)
between when the timeout expires and the job actually runs. If the
client gets a RENEW in there within that window, its lease will be
renewed and its state preserved.

Also note that all of the above just applies to the Linux knfsd. There
are many other servers in the field and they have different rules for
dropping state held by clients that have gone AWOL.

-- 
Jeff Layton jlay...@poochiereds.net
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: OFD (file private) locks and NFS

2014-04-29 Thread Jeff Layton
On Tue, 29 Apr 2014 07:40:08 -0400 (EDT)
Matt W. Benjamin m...@linuxbox.com wrote:

 Hi Jeff,
 
 Something which came up on the last Ganesha conn call is that we have
 a pretty strong need for some ability to wait on a set of locks, and perhaps
 receive events.  Frank Filz believed that you had made a proposal which
 would cover this.  Can you elaborate on that?
 
 Thanks,
 
 Matt
 

No, there's no mechanism to wait on a set of locks from within the
context of a single thread of execution or to receive events. Again,
that would be a new API beyond what I've been proposing over the last
several months.
-- 
Jeff Layton jlay...@poochiereds.net
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH man-pages v1] fcntl.2: update manpage with verbiage about open file description locks

2014-04-29 Thread Jeff Layton
Signed-off-by: Jeff Layton jlay...@poochiereds.net
---
 man2/fcntl.2 | 112 +--
 1 file changed, 109 insertions(+), 3 deletions(-)

diff --git a/man2/fcntl.2 b/man2/fcntl.2
index d0154a6d9f42..8d119dfec24c 100644
--- a/man2/fcntl.2
+++ b/man2/fcntl.2
@@ -191,6 +191,9 @@ and
 .BR O_SYNC
 flags; see BUGS, below.
 .SS Advisory locking
+This section describes traditional POSIX record locks. Also see the section on
+open file description locks below.
+.PP
 .BR F_SETLK ,
 .BR F_SETLKW ,
 and
@@ -213,7 +216,8 @@ struct flock {
 off_t l_start;   /* Starting offset for lock */
 off_t l_len; /* Number of bytes to lock */
 pid_t l_pid; /* PID of process blocking our lock
-(F_GETLK only) */
+(returned for F_GETLK and F_OFD_GETLK only. Set
+ to 0 for open file description locks) */
 ...
 };
 .fi
@@ -349,9 +353,13 @@ returns details about one of these locks in the
 .IR l_type ,  l_whence ,  l_start , and  l_len
 fields of
 .I lock
-and sets
+.
+If the conflicting lock is a traditional POSIX lock, then the
+.I l_pid
+to be the PID of the process holding that lock. If the
+conflicting lock is an open file description lock, then the
 .I l_pid
-to be the PID of the process holding that lock.
+will be set to \-1.
 Note that the information returned by
 .BR F_GETLK
 may already be out of date by the time the caller inspects it.
@@ -394,6 +402,104 @@ should be avoided; use
 and
 .BR write (2)
 instead.
+.SS Open file description locks (non-POSIX)
+.BR F_OFD_GETLK ,  F_OFD_SETLK  and   F_OFD_SETLKW
+are used to acquire, release and test open file description record locks.
+These are byte-range locks that work identically to the traditional advisory
+record locks described above, but are associated with the open file description
+on which they were acquired rather than the process, much like locks acquired
+with
+.BR flock (2)
+.
+.PP
+Unlike traditional advisory record locks, these locks are inherited
+across
+.BR fork (2)
+and
+.BR clone (2)
+with
+.BR CLONE_FILES
+and are only released on the last close of the open file description instead
+of being released on any close of the file.
+.PP
+Open file description locks always conflict with traditional record locks,
+even when they are acquired by the same process on the same file descriptor.
+They only conflict with each other when they are acquired on different
+open file descriptions.
+.PP
+Note that in contrast to traditional record locks, the
+.I flock
+structure passed in as an argument to the open file description lock commands
+must have the
+.I l_pid
+value set to 0.
+.TP
+.BR F_OFD_SETLK  (\fIstruct flock *\fP)
+Acquire an open file description lock (when
+.I l_type
+is
+.B F_RDLCK
+or
+.BR F_WRLCK )
+or release an open file description lock (when
+.I l_type
+is
+.BR F_UNLCK )
+on the bytes specified by the
+.IR l_whence ,  l_start , and  l_len
+fields of
+.IR lock .
+If a conflicting lock is held by another process,
+this call returns \-1 and sets
+.I errno
+to
+.B EACCES
+or
+.BR EAGAIN .
+.TP
+.BR F_OFD_SETLKW  (\fIstruct flock *\fP)
+As for
+.BR F_OFD_SETLK ,
+but if a conflicting lock is held on the file, then wait for that lock to be
+released. If a signal is caught while waiting, then the call is interrupted
+and (after the signal handler has returned) returns immediately (with return
+value \-1 and
+.I errno
+set to
+.BR EINTR ;
+see
+.BR signal (7)).
+.TP
+.BR F_OFD_GETLK  (\fIstruct flock *\fP)
+On input to this call,
+.I lock
+describes an open file description lock we would like to place on the file.
+If the lock could be placed,
+.BR fcntl ()
+does not actually place it, but returns
+.B F_UNLCK
+in the
+.I l_type
+field of
+.I lock
+and leaves the other fields of the structure unchanged.
+If one or more incompatible locks would prevent
+this lock being placed, then
+.BR fcntl ()
+returns details about one of these locks in the
+.IR l_type ,  l_whence ,  l_start , and  l_len
+fields of
+.I lock
+.
+If the conflicting lock is a process-associated record lock, then the
+.I l_pid
+will be set to the PID of the process holding that lock. If the
+conflicting lock is an open file description lock, then the
+.I l_pid
+will be set to -1 to indicate that it is not associated with a process.
+Note that the information returned by
+.BR F_OFD_GETLK
+may already be out of date by the time the caller inspects it.
 .SS Mandatory locking
 (Non-POSIX.)
 The above record locks may be either advisory or mandatory,
-- 
1.9.0

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH man-pages v1] fcntl.2: update manpage with verbiage about open file description locks

2014-04-30 Thread Jeff Layton
On Wed, 30 Apr 2014 12:50:23 +0200
Michael Kerrisk (man-pages) mtk.manpa...@gmail.com wrote:

 [CC += linux-man]
 
 Jeff,
 
 Thanks very much for writing this patch!
 
 I've taken your patch into a branch and add a number of details. I have 
 one or two questions below.
 
 On 04/29/2014 08:51 PM, Jeff Layton wrote:
  Signed-off-by: Jeff Layton jlay...@poochiereds.net
  ---
   man2/fcntl.2 | 112 
  +--
   1 file changed, 109 insertions(+), 3 deletions(-)
  
  diff --git a/man2/fcntl.2 b/man2/fcntl.2
  index d0154a6d9f42..8d119dfec24c 100644
  --- a/man2/fcntl.2
  +++ b/man2/fcntl.2
  @@ -191,6 +191,9 @@ and
   .BR O_SYNC
   flags; see BUGS, below.
   .SS Advisory locking
  +This section describes traditional POSIX record locks. Also see the 
  section on
  +open file description locks below.
  +.PP
   .BR F_SETLK ,
   .BR F_SETLKW ,
   and
  @@ -213,7 +216,8 @@ struct flock {
   off_t l_start;   /* Starting offset for lock */
   off_t l_len; /* Number of bytes to lock */
   pid_t l_pid; /* PID of process blocking our lock
  -(F_GETLK only) */
  +(returned for F_GETLK and F_OFD_GETLK only. Set
  + to 0 for open file description locks) */
   ...
   };
   .fi
  @@ -349,9 +353,13 @@ returns details about one of these locks in the
   .IR l_type ,  l_whence ,  l_start , and  l_len
   fields of
   .I lock
  -and sets
  +.
  +If the conflicting lock is a traditional POSIX lock, then the
  +.I l_pid
  +to be the PID of the process holding that lock. If the
  +conflicting lock is an open file description lock, then the
   .I l_pid
  -to be the PID of the process holding that lock.
  +will be set to \-1.
   Note that the information returned by
   .BR F_GETLK
   may already be out of date by the time the caller inspects it.
  @@ -394,6 +402,104 @@ should be avoided; use
   and
   .BR write (2)
   instead.
  +.SS Open file description locks (non-POSIX)
  +.BR F_OFD_GETLK ,  F_OFD_SETLK  and   F_OFD_SETLKW
  +are used to acquire, release and test open file description record locks.
  +These are byte-range locks that work identically to the traditional 
  advisory
  +record locks described above, but are associated with the open file 
  description
  +on which they were acquired rather than the process, much like locks 
  acquired
  +with
  +.BR flock (2)
  +.
  +.PP
  +Unlike traditional advisory record locks, these locks are inherited
  +across
  +.BR fork (2)
  +and
  +.BR clone (2)
  +with
  +.BR CLONE_FILES
  +and are only released on the last close of the open file description 
  instead
  +of being released on any close of the file.
  +.PP
  +Open file description locks always conflict with traditional record locks,
  +even when they are acquired by the same process on the same file 
  descriptor.
  +They only conflict with each other when they are acquired on different
  +open file descriptions.
  +.PP
  +Note that in contrast to traditional record locks, the
  +.I flock
  +structure passed in as an argument to the open file description lock 
  commands
  +must have the
  +.I l_pid
  +value set to 0.
 
 In ERRORS, I added EINVAL for this case.
 
  +.TP
  +.BR F_OFD_SETLK  (\fIstruct flock *\fP)
  +Acquire an open file description lock (when
  +.I l_type
  +is
  +.B F_RDLCK
  +or
  +.BR F_WRLCK )
  +or release an open file description lock (when
  +.I l_type
  +is
  +.BR F_UNLCK )
  +on the bytes specified by the
  +.IR l_whence ,  l_start , and  l_len
  +fields of
  +.IR lock .
  +If a conflicting lock is held by another process,
  +this call returns \-1 and sets
  +.I errno
  +to
  +.B EACCES
  +or
  +.BR EAGAIN .
 
 The EACCES or EAGAIN thing comes from POSIX, because different 
 implementations of tradition record locks returned one of these errors. 
 So, portable applications using traditional locks must handle either 
 possibility. However, that argument doesn't apply for these new locks. 
 Surely, we just want to say set errno to EAGAIN for this case?
 
  +.TP
  +.BR F_OFD_SETLKW  (\fIstruct flock *\fP)
  +As for
  +.BR F_OFD_SETLK ,
  +but if a conflicting lock is held on the file, then wait for that lock to 
  be
  +released. If a signal is caught while waiting, then the call is interrupted
  +and (after the signal handler has returned) returns immediately (with 
  return
  +value \-1 and
  +.I errno
  +set to
  +.BR EINTR ;
  +see
  +.BR signal (7)).
  +.TP
  +.BR F_OFD_GETLK  (\fIstruct flock *\fP)
  +On input to this call,
  +.I lock
  +describes an open file description lock we would like to place on the file.
  +If the lock could be placed,
  +.BR fcntl ()
  +does not actually place it, but returns
  +.B F_UNLCK
  +in the
  +.I l_type
  +field of
  +.I lock
  +and leaves the other fields of the structure unchanged.
  +If one or more incompatible locks would prevent
  +this lock being placed, then
  +.BR fcntl ()
  +returns details about one

Re: [PATCH 1/2] cifs: Use min_t() when comparing size_t and unsigned long

2014-05-02 Thread Jeff Layton
On Sun, 13 Apr 2014 20:46:21 +0200
Geert Uytterhoeven ge...@linux-m68k.org wrote:

 On 32 bit, size_t is unsigned int, not unsigned long, causing the
 following warning when comparing with PAGE_SIZE, which is always unsigned
 long:
 
 fs/cifs/file.c: In function ‘cifs_readdata_to_iov’:
 fs/cifs/file.c:2757: warning: comparison of distinct pointer types lacks a 
 cast
 
 Introduced by commit 7f25bba819a38ab7310024a9350655f374707e20
 (cifs_iovec_read: keep iov_iter between the calls of
 cifs_readdata_to_iov()), which changed the signedness of remaining
 and the code from min_t() to min().
 
 Signed-off-by: Geert Uytterhoeven ge...@linux-m68k.org
 ---
 PAGE_SIZE should really be size_t, but that would require lots of changes
 all over the place.
 
  fs/cifs/file.c |2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
 
 diff --git a/fs/cifs/file.c b/fs/cifs/file.c
 index 8807442c94dd..8add25538a3b 100644
 --- a/fs/cifs/file.c
 +++ b/fs/cifs/file.c
 @@ -2754,7 +2754,7 @@ cifs_readdata_to_iov(struct cifs_readdata *rdata, 
 struct iov_iter *iter)
  
   for (i = 0; i  rdata-nr_pages; i++) {
   struct page *page = rdata-pages[i];
 - size_t copy = min(remaining, PAGE_SIZE);
 + size_t copy = min_t(size_t, remaining, PAGE_SIZE);
   size_t written = copy_page_to_iter(page, 0, copy, iter);
   remaining -= written;
   if (written  copy  iov_iter_count(iter)  0)

Reviewed-by: Jeff Layton jlay...@poochiereds.net
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[glibc PATCH v4 resend] fcntl-linux.h: add new definitions and manual updates for open file description locks

2014-05-03 Thread Jeff Layton
Open file description locks have been merged into the Linux kernel for
v3.15.  Add the appropriate command-value definitions and an update to
the manual that describes their usage.

ChangeLog:

2014-04-24  Jeff Layton  jlay...@poochiereds.net

[BZ#16839]
* manual/llio.texi: add section about open file description locks

* sysdeps/unix/sysv/linux/bits/fcntl-linux.h:
  (F_OFD_GETLK, F_OFD_SETLK, F_OFD_SETLKW): New macros.
---
 manual/examples/ofdlocks.c |  77 +
 manual/llio.texi   | 241 -
 sysdeps/unix/sysv/linux/bits/fcntl-linux.h |  17 ++
 3 files changed, 332 insertions(+), 3 deletions(-)
 create mode 100644 manual/examples/ofdlocks.c

diff --git a/manual/examples/ofdlocks.c b/manual/examples/ofdlocks.c
new file mode 100644
index ..85e193cdabe6
--- /dev/null
+++ b/manual/examples/ofdlocks.c
@@ -0,0 +1,77 @@
+/* Open File Description Locks Usage Example
+   Copyright (C) 1991-2014 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or
+   modify it under the terms of the GNU General Public License
+   as published by the Free Software Foundation; either version 2
+   of the License, or (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program; if not, see http://www.gnu.org/licenses/.
+*/
+
+#define _GNU_SOURCE
+#include stdio.h
+#include sys/types.h
+#include sys/stat.h
+#include unistd.h
+#include fcntl.h
+#include pthread.h
+
+#define FILENAME   /tmp/foo
+#define NUM_THREADS3
+#define ITERATIONS 5
+
+void *
+thread_start (void *arg)
+{
+  int i, fd, len;
+  long tid = (long) arg;
+  char buf[256];
+  struct flock lck = {
+.l_whence = SEEK_SET,
+.l_start = 0,
+.l_len = 1,
+  };
+
+  fd = open (/tmp/foo, O_RDWR | O_CREAT, 0666);
+
+  for (i = 0; i  ITERATIONS; i++)
+{
+  lck.l_type = F_WRLCK;
+  fcntl (fd, F_OFD_SETLKW, lck);
+
+  len = sprintf (buf, %d: tid=%ld fd=%d\n, i, tid, fd);
+
+  lseek (fd, 0, SEEK_END);
+  write (fd, buf, len);
+  fsync (fd);
+
+  lck.l_type = F_UNLCK;
+  fcntl (fd, F_OFD_SETLK, lck);
+
+  /* sleep to ensure lock is yielded to another thread */
+  usleep (1);
+}
+  pthread_exit (NULL);
+}
+
+int
+main (int argc, char **argv)
+{
+  long i;
+  pthread_t threads[NUM_THREADS];
+
+  truncate (FILENAME, 0);
+
+  for (i = 0; i  NUM_THREADS; i++)
+pthread_create (threads[i], NULL, thread_start, (void *) i);
+
+  pthread_exit (NULL);
+  return 0;
+}
diff --git a/manual/llio.texi b/manual/llio.texi
index 6f8adfc607d7..864060dc7140 100644
--- a/manual/llio.texi
+++ b/manual/llio.texi
@@ -57,6 +57,10 @@ directly.)
  flags associated with open files.
 * File Locks::  Fcntl commands for implementing
  file locking.
+* Open File Description Locks:: Fcntl commands for implementing
+ open file description locking.
+* Open File Description Locks Example:: An example of open file description 
lock
+ usage
 * Interrupt Input:: Getting an asynchronous signal when
  input arrives.
 * IOCTLs::  Generic I/O Control operations.
@@ -2890,7 +2894,7 @@ Get flags associated with the open file.  @xref{File 
Status Flags}.
 Set flags associated with the open file.  @xref{File Status Flags}.
 
 @item F_GETLK
-Get a file lock.  @xref{File Locks}.
+Test a file lock.  @xref{File Locks}.
 
 @item F_SETLK
 Set or clear a file lock.  @xref{File Locks}.
@@ -2898,6 +2902,18 @@ Set or clear a file lock.  @xref{File Locks}.
 @item F_SETLKW
 Like @code{F_SETLK}, but wait for completion.  @xref{File Locks}.
 
+@item F_OFD_GETLK
+Test an open file description lock.  @xref{Open File Description Locks}.
+Specific to Linux.
+
+@item F_OFD_SETLK
+Set or clear an open file description lock.  @xref{Open File Description 
Locks}.
+Specific to Linux.
+
+@item F_OFD_SETLKW
+Like @code{F_OFD_SETLK}, but block until lock is acquired.
+@xref{Open File Description Locks}.  Specific to Linux.
+
 @item F_GETOWN
 Get process or process group ID to receive @code{SIGIO} signals.
 @xref{Interrupt Input}.
@@ -3576,6 +3592,10 @@ set_nonblock_flag (int desc, int value)
 
 @cindex file locks
 @cindex record locking
+This section describes record locks that are associated with the process.
+There is also a different type of record lock that is associated with the
+open file description instead of the process

Re: [PATCH 1/1] fs/locks.c: replace seq_printf by seq_puts

2014-05-03 Thread Jeff Layton
On Sat, 3 May 2014 22:28:43 +0200
Fabian Frederick f...@skynet.be wrote:

 Replace seq_printf where possible
 
 Cc: Jeff Layton jlay...@redhat.com
 Cc: Alexander Viro v...@zeniv.linux.org.uk
 Cc: Andrew Morton a...@linux-foundation.org
 Signed-off-by: Fabian Frederick f...@skynet.be
 ---
  fs/locks.c | 22 +++---
  1 file changed, 11 insertions(+), 11 deletions(-)
 
 diff --git a/fs/locks.c b/fs/locks.c
 index e663aea..78ac209 100644
 --- a/fs/locks.c
 +++ b/fs/locks.c
 @@ -2411,31 +2411,31 @@ static void lock_get_status(struct seq_file *f, 
 struct file_lock *fl,
   seq_printf(f, %lld:%s , id, pfx);
   if (IS_POSIX(fl)) {
   if (fl-fl_flags  FL_ACCESS)
 - seq_printf(f, ACCESS);
 + seq_puts(f, ACCESS);
   else if (IS_OFDLCK(fl))
 - seq_printf(f, OFDLCK);
 + seq_puts(f, OFDLCK);
   else
 - seq_printf(f, POSIX );
 + seq_puts(f, POSIX );
  
   seq_printf(f,  %s ,
(inode == NULL) ? *NOINODE* :
mandatory_lock(inode) ? MANDATORY : ADVISORY );
   } else if (IS_FLOCK(fl)) {
   if (fl-fl_type  LOCK_MAND) {
 - seq_printf(f, FLOCK  MSNFS );
 + seq_puts(f, FLOCK  MSNFS );
   } else {
 - seq_printf(f, FLOCK  ADVISORY  );
 + seq_puts(f, FLOCK  ADVISORY  );
   }
   } else if (IS_LEASE(fl)) {
 - seq_printf(f, LEASE  );
 + seq_puts(f, LEASE  );
   if (lease_breaking(fl))
 - seq_printf(f, BREAKING  );
 + seq_puts(f, BREAKING  );
   else if (fl-fl_file)
 - seq_printf(f, ACTIVE);
 + seq_puts(f, ACTIVE);
   else
 - seq_printf(f, BREAKER   );
 + seq_puts(f, BREAKER   );
   } else {
 - seq_printf(f, UNKNOWN UNKNOWN  );
 + seq_puts(f, UNKNOWN UNKNOWN  );
   }
   if (fl-fl_type  LOCK_MAND) {
   seq_printf(f, %s ,
 @@ -2467,7 +2467,7 @@ static void lock_get_status(struct seq_file *f, struct 
 file_lock *fl,
   else
   seq_printf(f, %Ld %Ld\n, fl-fl_start, fl-fl_end);
   } else {
 - seq_printf(f, 0 EOF\n);
 + seq_puts(f, 0 EOF\n);
   }
  }
  

Looks good to me. I'll plan to merge it for v3.16.

Thanks!
-- 
Jeff Layton jlay...@poochiereds.net
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[GIT PULL] MAINTAINERS email address change

2014-05-04 Thread Jeff Layton
The following changes since commit 8aa9e85adac609588eeec356e5a85059b3b819ba:

  ARC: !PREEMPT: Ensure Return to kernel mode is IRQ safe (2014-04-30 08:21:43 
-0700)

are available in the git repository at:

  git://git.samba.org/jlayton/linux.git tags/locks-v3.15-3

for you to fetch changes up to 8c836fa85bebfea26ceff024b5d1c518bf63f3d8:

  MAINTAINERS: email address change for Jeff Layton (2014-04-30 13:28:17 -0400)


File locking related changes for v3.15 (pile #3)

- only an email address change to the MAINTAINERS file


Jeff Layton (1):
  MAINTAINERS: email address change for Jeff Layton

 MAINTAINERS | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


-- 
Jeff Layton jlay...@poochiereds.net


signature.asc
Description: PGP signature


Re: [PATCH 1/1] FS/CIFS: remove obsolete __constant

2014-05-04 Thread Jeff Layton
On Sat, May 3, 2014 at 4:15 PM, Fabian Frederick f...@skynet.be wrote:
 Replacing all __constant_foo to foo()
 except in smb2status.h (1700 lines to update).

 Cc: linux-c...@vger.kernel.org
 Cc: Steve French sfre...@samba.org
 Cc: Andrew Morton a...@linux-foundation.org
 Signed-off-by: Fabian Frederick f...@skynet.be
 ---
  fs/cifs/cifsacl.c  |  2 +-
  fs/cifs/cifssmb.c  | 20 ++--
  fs/cifs/sess.c |  2 +-
  fs/cifs/smb2misc.c | 38 +++---
  fs/cifs/smb2ops.c  |  2 +-
  fs/cifs/smb2pdu.c  |  2 +-
  fs/cifs/smb2pdu.h  | 28 ++--
  7 files changed, 47 insertions(+), 47 deletions(-)

 diff --git a/fs/cifs/cifsacl.c b/fs/cifs/cifsacl.c
 index 7ff866d..54ac0e8 100644
 --- a/fs/cifs/cifsacl.c
 +++ b/fs/cifs/cifsacl.c
 @@ -38,7 +38,7 @@ static const struct cifs_sid sid_everyone = {
 1, 1, {0, 0, 0, 0, 0, 1}, {0} };
  /* security id for Authenticated Users system group */
  static const struct cifs_sid sid_authusers = {
 -   1, 1, {0, 0, 0, 0, 0, 5}, {__constant_cpu_to_le32(11)} };
 +   1, 1, {0, 0, 0, 0, 0, 5}, {cpu_to_le32(11)} };


Does the build still work on BE arches with this? I know at one point
the above wouldn't compile on those
arches. See commit 536abdb0802f, for an explanation.


  /* group users */
  static const struct cifs_sid sid_user = {1, 2 , {0, 0, 0, 0, 0, 5}, {} };

 diff --git a/fs/cifs/cifssmb.c b/fs/cifs/cifssmb.c
 index 6ce4e09..c3dc52e 100644
 --- a/fs/cifs/cifssmb.c
 +++ b/fs/cifs/cifssmb.c
 @@ -2430,14 +2430,14 @@ CIFSSMBPosixLock(const unsigned int xid, struct 
 cifs_tcon *tcon,
 }
 parm_data = (struct cifs_posix_lock *)
 ((char *)pSMBr-hdr.Protocol + data_offset);
 -   if (parm_data-lock_type == 
 __constant_cpu_to_le16(CIFS_UNLCK))
 +   if (parm_data-lock_type == cpu_to_le16(CIFS_UNLCK))
 pLockData-fl_type = F_UNLCK;
 else {
 if (parm_data-lock_type ==
 -   __constant_cpu_to_le16(CIFS_RDLCK))
 +   cpu_to_le16(CIFS_RDLCK))
 pLockData-fl_type = F_RDLCK;
 else if (parm_data-lock_type ==
 -   __constant_cpu_to_le16(CIFS_WRLCK))
 +   cpu_to_le16(CIFS_WRLCK))
 pLockData-fl_type = F_WRLCK;

 pLockData-fl_start = le64_to_cpu(parm_data-start);
 @@ -3232,25 +3232,25 @@ CIFSSMB_set_compression(const unsigned int xid, 
 struct cifs_tcon *tcon,
 pSMB-compression_state = cpu_to_le16(COMPRESSION_FORMAT_DEFAULT);

 pSMB-TotalParameterCount = 0;
 -   pSMB-TotalDataCount = __constant_cpu_to_le32(2);
 +   pSMB-TotalDataCount = cpu_to_le32(2);
 pSMB-MaxParameterCount = 0;
 pSMB-MaxDataCount = 0;
 pSMB-MaxSetupCount = 4;
 pSMB-Reserved = 0;
 pSMB-ParameterOffset = 0;
 -   pSMB-DataCount = __constant_cpu_to_le32(2);
 +   pSMB-DataCount = cpu_to_le32(2);
 pSMB-DataOffset =
 cpu_to_le32(offsetof(struct 
 smb_com_transaction_compr_ioctl_req,
 compression_state) - 4);  /* 84 */
 pSMB-SetupCount = 4;
 -   pSMB-SubCommand = __constant_cpu_to_le16(NT_TRANSACT_IOCTL);
 +   pSMB-SubCommand = cpu_to_le16(NT_TRANSACT_IOCTL);
 pSMB-ParameterCount = 0;
 -   pSMB-FunctionCode = __constant_cpu_to_le32(FSCTL_SET_COMPRESSION);
 +   pSMB-FunctionCode = cpu_to_le32(FSCTL_SET_COMPRESSION);
 pSMB-IsFsctl = 1; /* FSCTL */
 pSMB-IsRootFlag = 0;
 pSMB-Fid = fid; /* file handle always le */
 /* 3 byte pad, followed by 2 byte compress state */
 -   pSMB-ByteCount = __constant_cpu_to_le16(5);
 +   pSMB-ByteCount = cpu_to_le16(5);
 inc_rfc1001_len(pSMB, 5);

 rc = SendReceive(xid, tcon-ses, (struct smb_hdr *) pSMB,
 @@ -3386,10 +3386,10 @@ static __u16 ACL_to_cifs_posix(char *parm_data, const 
 char *pACL,
 cifs_acl-version = cpu_to_le16(1);
 if (acl_type == ACL_TYPE_ACCESS) {
 cifs_acl-access_entry_count = cpu_to_le16(count);
 -   cifs_acl-default_entry_count = 
 __constant_cpu_to_le16(0x);
 +   cifs_acl-default_entry_count = cpu_to_le16(0x);
 } else if (acl_type == ACL_TYPE_DEFAULT) {
 cifs_acl-default_entry_count = cpu_to_le16(count);
 -   cifs_acl-access_entry_count = __constant_cpu_to_le16(0x);
 +   cifs_acl-access_entry_count = cpu_to_le16(0x);
 } else {
 cifs_dbg(FYI, unknown ACL type %d\n, acl_type);
 return 0;
 diff --git a/fs/cifs/sess.c b/fs/cifs/sess.c
 index e87387d..27e6175 100644
 --- a/fs/cifs/sess.c
 +++ b/fs/cifs/sess.c
 @@ -46,7 +46,7 @@ static __u32 

Re: [PATCH 1/1] FS/CIFS: remove obsolete __constant

2014-05-05 Thread Jeff Layton
On Mon, 5 May 2014 06:53:01 +0200
Fabian Frederick f...@skynet.be wrote:

 On Sun, 4 May 2014 18:52:43 -0400
 Jeff Layton jlay...@poochiereds.net wrote:
 
  On Sat, May 3, 2014 at 4:15 PM, Fabian Frederick f...@skynet.be wrote:
   Replacing all __constant_foo to foo()
   except in smb2status.h (1700 lines to update).
  
   Cc: linux-c...@vger.kernel.org
   Cc: Steve French sfre...@samba.org
   Cc: Andrew Morton a...@linux-foundation.org
   Signed-off-by: Fabian Frederick f...@skynet.be
   ---
fs/cifs/cifsacl.c  |  2 +-
fs/cifs/cifssmb.c  | 20 ++--
fs/cifs/sess.c |  2 +-
fs/cifs/smb2misc.c | 38 +++---
fs/cifs/smb2ops.c  |  2 +-
fs/cifs/smb2pdu.c  |  2 +-
fs/cifs/smb2pdu.h  | 28 ++--
7 files changed, 47 insertions(+), 47 deletions(-)
  
   diff --git a/fs/cifs/cifsacl.c b/fs/cifs/cifsacl.c
   index 7ff866d..54ac0e8 100644
   --- a/fs/cifs/cifsacl.c
   +++ b/fs/cifs/cifsacl.c
   @@ -38,7 +38,7 @@ static const struct cifs_sid sid_everyone = {
   1, 1, {0, 0, 0, 0, 0, 1}, {0} };
/* security id for Authenticated Users system group */
static const struct cifs_sid sid_authusers = {
   -   1, 1, {0, 0, 0, 0, 0, 5}, {__constant_cpu_to_le32(11)} };
   +   1, 1, {0, 0, 0, 0, 0, 5}, {cpu_to_le32(11)} };
  
  
  Does the build still work on BE arches with this? I know at one point
  the above wouldn't compile on those
  arches. See commit 536abdb0802f, for an explanation.
 
 Untested on BE but it would mean checkpatch 
 don't use __constantfoo functions outside of include/uapi/ stuff is wrong 
 ?

I'm not sure. Maybe the macros have been fixed so that it's no longer a
problem? I think it would be good to test build this on a BE arch
before it's merged to be certain though.

IIRC, I originally saw the problem on s390 builds.

-- 
Jeff Layton jlay...@poochiereds.net
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: nfs oom on 3.15

2014-05-05 Thread Jeff Layton
On Mon, 5 May 2014 02:42:35 -0700 (PDT)
David Rientjes rient...@google.com wrote:

 On Mon, 5 May 2014, Marc Dietrich wrote:
 
  Hi,
  
  this is spaming my log quite a bit.
  
  Any idea?
  
 
 I think rpc_malloc() really wants to be __GFP_NOWARN so that the warnings 
 are suppressed, doing order-1 allocations with GFP_NOWAIT is never 
 guaranteed to succeed and will have difficulty if memory is fragmented.  


Yes. We made a similar fix to xprt_alloc_slot a while back. The RPC
engine has a mechanism to handle retrying these allocations, so this
warning isn't particularly helpful.

-- 
Jeff Layton jlay...@poochiereds.net
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] locks: add some tracepoints in the lease handling code

2014-05-05 Thread Jeff Layton
v2: add a __break_lease tracepoint for non-blocking case

Recently, I needed these to help track down a softlockup when recalling a
delegation, but they might be helpful in other situations as well.

Cc: J. Bruce Fields bfie...@fieldses.org
Signed-off-by: Jeff Layton jlay...@poochiereds.net
---
 fs/locks.c  | 10 +
 include/trace/events/filelock.h | 93 +
 2 files changed, 103 insertions(+)
 create mode 100644 include/trace/events/filelock.h

diff --git a/fs/locks.c b/fs/locks.c
index 6b01d5a1de4a..2367f3118662 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -130,6 +130,9 @@
 #include linux/percpu.h
 #include linux/lglock.h
 
+#define CREATE_TRACE_POINTS
+#include trace/events/filelock.h
+
 #include asm/uaccess.h
 
 #define IS_POSIX(fl)   (fl-fl_flags  FL_POSIX)
@@ -1386,6 +1389,7 @@ int __break_lease(struct inode *inode, unsigned int mode, 
unsigned int type)
}
 
if (i_have_this_lease || (mode  O_NONBLOCK)) {
+   trace_break_lease_noblock(inode, new_fl);
error = -EWOULDBLOCK;
goto out;
}
@@ -1397,10 +1401,12 @@ restart:
if (break_time == 0)
break_time++;
locks_insert_block(flock, new_fl);
+   trace_break_lease_block(inode, new_fl);
spin_unlock(inode-i_lock);
error = wait_event_interruptible_timeout(new_fl-fl_wait,
!new_fl-fl_next, break_time);
spin_lock(inode-i_lock);
+   trace_break_lease_unblock(inode, new_fl);
locks_delete_block(new_fl);
if (error = 0) {
if (error == 0)
@@ -1522,6 +1528,8 @@ static int generic_add_lease(struct file *filp, long arg, 
struct file_lock **flp
int error;
 
lease = *flp;
+   trace_generic_add_lease(inode, lease);
+
/*
 * In the delegation case we need mutual exclusion with
 * a number of operations that take the i_mutex.  We trylock
@@ -1611,6 +1619,8 @@ static int generic_delete_lease(struct file *filp, struct 
file_lock **flp)
struct dentry *dentry = filp-f_path.dentry;
struct inode *inode = dentry-d_inode;
 
+   trace_generic_delete_lease(inode, *flp);
+
for (before = inode-i_flock;
((fl = *before) != NULL)  IS_LEASE(fl);
before = fl-fl_next) {
diff --git a/include/trace/events/filelock.h b/include/trace/events/filelock.h
new file mode 100644
index ..d1e9d3eaa8ce
--- /dev/null
+++ b/include/trace/events/filelock.h
@@ -0,0 +1,93 @@
+/*
+ * Events for filesystem locks
+ *
+ * Copyright 2013 Jeff Layton jlay...@poochiereds.net
+ */
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM filelock
+
+#if !defined(_TRACE_FILELOCK_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_FILELOCK_H
+
+#include linux/tracepoint.h
+#include linux/fs.h
+#include linux/device.h
+#include linux/kdev_t.h
+
+#define show_fl_flags(val) \
+   __print_flags(val, |, \
+   { FL_POSIX, FL_POSIX },   \
+   { FL_FLOCK, FL_FLOCK },   \
+   { FL_DELEG, FL_DELEG },   \
+   { FL_ACCESS,FL_ACCESS },  \
+   { FL_EXISTS,FL_EXISTS },  \
+   { FL_LEASE, FL_LEASE },   \
+   { FL_CLOSE, FL_CLOSE },   \
+   { FL_SLEEP, FL_SLEEP },   \
+   { FL_DOWNGRADE_PENDING, FL_DOWNGRADE_PENDING },   \
+   { FL_UNLOCK_PENDING,FL_UNLOCK_PENDING },  \
+   { FL_OFDLCK,FL_OFDLCK })
+
+#define show_fl_type(val)  \
+   __print_symbolic(val,   \
+   { F_RDLCK, F_RDLCK }, \
+   { F_WRLCK, F_WRLCK }, \
+   { F_UNLCK, F_UNLCK })
+
+DECLARE_EVENT_CLASS(filelock_lease,
+
+   TP_PROTO(struct inode *inode, struct file_lock *fl),
+
+   TP_ARGS(inode, fl),
+
+   TP_STRUCT__entry(
+   __field(struct file_lock *, fl)
+   __field(unsigned long, i_ino)
+   __field(dev_t, s_dev)
+   __field(struct file_lock *, fl_next)
+   __field(fl_owner_t, fl_owner)
+   __field(unsigned int, fl_flags)
+   __field(unsigned char, fl_type)
+   __field(unsigned long, fl_break_time)
+   __field(unsigned long, fl_downgrade_time)
+   ),
+
+   TP_fast_assign(
+   __entry-fl = fl;
+   __entry-s_dev = inode-i_sb-s_dev;
+   __entry-i_ino = inode-i_ino;
+   __entry-fl_next = fl-fl_next;
+   __entry-fl_owner

Re: linux-3.14 nfsd regression

2014-04-03 Thread Jeff Layton
On Thu, 03 Apr 2014 13:51:06 -0400
Mark Lord ml...@pobox.com wrote:

 On 14-04-03 01:16 PM, J. Bruce Fields wrote:
  On Thu, Apr 03, 2014 at 12:33:55PM -0400, Mark Lord wrote:
  This commit from linux-3.14 breaks our NFS-root clients here:
 
  https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=6e14b46b91fee8a049b0940333ce13a820beaaa5
 
 
  - *p++ = htonl((u32) stat-mode);
  + *p++ = htonl((u32) (stat-mode  S_IALLUGO));
 
 
  Reverting the one-liner above (on the server) fixes it for us,
  as does reverting back to linux-3.13.8 on the server.
 
  The NFS-root clients are on PowerPC (big-endian) architecture,
  running linux-3.12.16. The NFS server is on an Intel PC running linux-3.14.
 
  ACL is completely disabled on server and client,
  and we're using NFSv2/v3.  No support for v4.
 
  I instrumented the function to see what other bits were being cleared
  by the (stat-mode  S_IALLUGO) masking.  The results are attached.
  
  Hm, it sounds like a bug in the client if it's depending on those high
  bits.
 
 But only for mounting / starting up from the nfsroot, it seems.
 I wonder if there's an unusual code path for that in there?
 The regular stuff looks mostly fine:
 
 p = xdr_decode_ftype3(p, fmode);
 fattr-mode = (be32_to_cpup(p++)  ~S_IFMT) | fmode;
 
 Except perhaps that second line ought to use the same mask
 as the server side is using, just in case there are some other
 stray high (higher than S_IFMT) bits in there now/someday.
 
  The original behavior was in practice harmless and changing it broke
  something, so I think we should definitely just revert this patch.
 
 Yup.  Who?
 
  But the client may need fixing too.
 
 Probably a good thing in the longer term, for better compatibility
 with non-Linux servers.  But we'll still have to keep the revert
 on the server (nfsd) code for backward compatibility, I think.
 
 Cheers
 

It would be good to understand where this is broken in the client.

It's incorrect for the client to interpret those bits, as I think that
there's no guarantee that other OS's implement the type bits in the
same way that Linux does. So if you end up mounting a different OS,
it's possible that the client will get that wrong...

-- 
Jeff Layton jlay...@redhat.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: linux-3.14 nfsd regression

2014-04-03 Thread Jeff Layton
On Thu, 3 Apr 2014 16:16:27 -0400
J. Bruce Fields bfie...@fieldses.org wrote:

 On Thu, Apr 03, 2014 at 02:55:04PM -0400, Jeff Layton wrote:
  On Thu, 03 Apr 2014 13:51:06 -0400
  Mark Lord ml...@pobox.com wrote:
  
   On 14-04-03 01:16 PM, J. Bruce Fields wrote:
On Thu, Apr 03, 2014 at 12:33:55PM -0400, Mark Lord wrote:
This commit from linux-3.14 breaks our NFS-root clients here:
   
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=6e14b46b91fee8a049b0940333ce13a820beaaa5
   
   
- *p++ = htonl((u32) stat-mode);
+ *p++ = htonl((u32) (stat-mode  S_IALLUGO));
   
   
Reverting the one-liner above (on the server) fixes it for us,
as does reverting back to linux-3.13.8 on the server.
   
The NFS-root clients are on PowerPC (big-endian) architecture,
running linux-3.12.16. The NFS server is on an Intel PC running 
linux-3.14.
   
ACL is completely disabled on server and client,
and we're using NFSv2/v3.  No support for v4.
   
I instrumented the function to see what other bits were being cleared
by the (stat-mode  S_IALLUGO) masking.  The results are attached.

Hm, it sounds like a bug in the client if it's depending on those high
bits.
   
   But only for mounting / starting up from the nfsroot, it seems.
   I wonder if there's an unusual code path for that in there?
   The regular stuff looks mostly fine:
   
   p = xdr_decode_ftype3(p, fmode);
   fattr-mode = (be32_to_cpup(p++)  ~S_IFMT) | fmode;
   
   Except perhaps that second line ought to use the same mask
   as the server side is using, just in case there are some other
   stray high (higher than S_IFMT) bits in there now/someday.
   
The original behavior was in practice harmless and changing it broke
something, so I think we should definitely just revert this patch.
   
   Yup.  Who?
   
But the client may need fixing too.
   
   Probably a good thing in the longer term, for better compatibility
   with non-Linux servers.  But we'll still have to keep the revert
   on the server (nfsd) code for backward compatibility, I think.
   
   Cheers
   
  
  It would be good to understand where this is broken in the client.
  
  It's incorrect for the client to interpret those bits, as I think that
  there's no guarantee that other OS's implement the type bits in the
  same way that Linux does. So if you end up mounting a different OS,
  it's possible that the client will get that wrong...
 
 It turns out these bits actually are defined in rfc 1094, so this is
 just an odd NFSv2-specific wart, and the nfsd change was just flat-out
 wrong.
 
 --b.

Ahh right -- I remember seeing that long ago.

So according to the RFC you have to encode both the mode bits and the
ftype for v2. The type bits seem to be removed from the mode in NFSv3
though, so perhaps we should only be doing that masking in versions
above v2?

With a quick check, it looks like the v3 code doesn't rely on those bits
and I imagine v4 doesn't either.

It might also be nice to have the client v2 decode_fattr function to
throw a warning if the server sends us mismatched type bits and ftype
values. That would have helped us catch this sooner...

-- 
Jeff Layton jlay...@redhat.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: linux-3.14 nfsd regression

2014-04-04 Thread Jeff Layton
On Fri, 4 Apr 2014 09:58:43 -0400
J. Bruce Fields bfie...@fieldses.org wrote:

 On Thu, Apr 03, 2014 at 07:21:46PM -0400, Jeff Layton wrote:
  So according to the RFC you have to encode both the mode bits and the
  ftype for v2. The type bits seem to be removed from the mode in NFSv3
  though, so perhaps we should only be doing that masking in versions
  above v2?
 
 Right, the problematic patch applied the same mask in both v2 and v3
 cases, so I'm reverting just the v2 part (see below).
 
  With a quick check, it looks like the v3 code doesn't rely on those bits
  and I imagine v4 doesn't either.
  
  It might also be nice to have the client v2 decode_fattr function to
  throw a warning if the server sends us mismatched type bits and ftype
  values. That would have helped us catch this sooner...
 
 Yes, that might be a reasonable thing to do, though I don't know if it's
 worth it.
 
 --b.
 
 commit 35a8dff14e76c00e5b52140290cfb498dc2454a0
 Author: J. Bruce Fields bfie...@redhat.com
 Date:   Thu Apr 3 15:10:35 2014 -0400
 
 nfsd: revert v2 half of nfsd: don't return high mode bits
 
 This reverts the part of commit 6e14b46b91fee8a049b0940333ce13a820beaaa5
 that changes NFSv2 behavior.
 
 Mark Lord found that it broke nfs-root for Linux clients, because it
 broke NFSv2.
 
 In fact, from RFC 1094:
 
   Notice that the file type is specified both in the mode bits
   and in the file type.  This is really a bug in the protocol and
   will be fixed in future versions.
 
 So NFSv2 clients really are expected to depend on the high bits of the
 mode.
 
 Cc: sta...@kernel.org
 Reported-by: Mark Lord ml...@pobox.com
 Signed-off-by: J. Bruce Fields bfie...@redhat.com
 
 diff --git a/fs/nfsd/nfsxdr.c b/fs/nfsd/nfsxdr.c
 index b17d932..9c769a4 100644
 --- a/fs/nfsd/nfsxdr.c
 +++ b/fs/nfsd/nfsxdr.c
 @@ -152,7 +152,7 @@ encode_fattr(struct svc_rqst *rqstp, __be32 *p, struct 
 svc_fh *fhp,
   type = (stat-mode  S_IFMT);
  
   *p++ = htonl(nfs_ftypes[type  12]);
 - *p++ = htonl((u32) (stat-mode  S_IALLUGO));
 + *p++ = htonl((u32) stat-mode);
   *p++ = htonl((u32) stat-nlink);
   *p++ = htonl((u32) from_kuid(init_user_ns, stat-uid));
   *p++ = htonl((u32) from_kgid(init_user_ns, stat-gid));

Looks right...

Reviewed-by: Jeff Layton jlay...@redhat.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [GIT PULL] please pull file locking changes for 3.15

2014-04-04 Thread Jeff Layton
On Fri, 4 Apr 2014 14:28:16 -0700
Linus Torvalds torva...@linux-foundation.org wrote:

 On Mon, Mar 31, 2014 at 6:31 AM, Jeff Layton jlay...@redhat.com wrote:
  The following changes since commit 29723adee11804b548903ddb1db666cf4a60f60e:
 
locks: make locks_mandatory_area check for file-private locks (2014-03-31 
  08:24:43 -0400)
 
  are available in the git repository at:
 
git://git.samba.org/jlayton/linux.git locks-3.15
 
 Please please please use the proper scripts (or improve on whatever
 local script you use) that show the diffstat (git diff -M --stat
 --summary) and shortlog for the pull request too.
 
 I want to know roughly what I'm pulling *before* I pull it, and while
 I can do a two-stage thing where I first fetch-and-inspect it, that's
 annoying. Your human-readable Highlights thing is good, and I want
 that too (I put commentary like that into the merge commit itself),
 but I do want that diffstat/shortlog in addition to the human-readable
 summary.
 
 Because having the diffstat in the pull request email not only gets me
 a heasd-up abotu what to expect, it also confirms that I'm on the same
 page as you are. It acts as a sanity-check for me when I can compare
 the diffstat you *claim* I should get with the diffstat I actually get
 after merging. So I really want to get that diffstat and shortlog as a
 sanity check.
 
 I also would prefer signed tags. It's not a must since it's not like
 git.samba.org is some random public site (I do _require_ them for pull
 requests from github etc), but it's definitely a good thing to have.
 
 Anyway, I'm going through my pile of filesystem pull requests, and
 I've pulled this into my tree. I'm just going through allmodconfig
 builds etc before pushing my merge out, so assuming that all works
 fine you don't need to resend this one.  But for future pull requests,
 please do try to fix the above up.
 
  Linus

Thanks for the feedback and for pulling this in anyway. I'll make sure
to do all of that on subsequent pull requests.

Cheers!
-- 
Jeff Layton jlay...@redhat.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: NFSv2 directory cannot longer be exported in 3.14, 3.13 works fine

2014-04-05 Thread Jeff Layton
On Sat, 05 Apr 2014 14:17:27 +0200
Toralf Förster toralf.foers...@gmx.de wrote:

 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA256
 
 On 04/04/2014 11:57 PM, Toralf Förster wrote:
  At a user mode linux image (32 bit Gentoo Linux) the recent kernel gives 
  this error while trying to mount a NFSv2 directory:
  mount.nfs: mount point /mnt/nfsv2 is not a directory
  
 
 A bisect points to :
 
 6e14b46b91fee8a049b0940333ce13a820beaaa5 is the first bad commit
 commit 6e14b46b91fee8a049b0940333ce13a820beaaa5
 Author: Albert Fluegel a...@muc.de
 Date:   Mon Nov 18 12:18:01 2013 -0500
 
 nfsd: don't return high mode bits
 
 The Linux NFS server replies among other things to a Check access 
 permission
 the following:
 
 NFS:File type = 2 (Directory)
 NFS:Mode = 040755
 
 A netapp server replies here:
 NFS:File type = 2 (Directory)
 NFS:Mode = 0755
 
 The RFC 1813 i read:
fattr3
 
   struct fattr3 {
  ftype3 type;
  mode3  mode;
  uint32 nlink;
 ...
 For the mode bits only the lowest 9 are defined in the RFC
 
 As far as I can tell, knfsd has always done this, so apparently it's 
 harmless.
 Nevertheless, it appears to be wrong.
 
 Note this is already correct in the NFSv4 case, only v2 and v3 need
 fixing.
 
 Signed-off-by: J. Bruce Fields bfie...@redhat.com
 
 :04 04 a60e55f5a9749db3977e12e8baa08d45fadba225 
 b982c02ec1069c5d969ccb71197e5569fe93b075 M fs
 
 - -- 
 MfG/Sincerely
 Toralf Förster
 pgp finger print:1A37 6F99 4A9D 026F 13E2 4DCF C4EA CDDE 0076 E94E
 -BEGIN PGP SIGNATURE-
 Version: GnuPG v2.0.22 (GNU/Linux)
 Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/
 
 iF4EAREIAAYFAlM/9FcACgkQxOrN3gB26U5F1AD+JAP4cDM+rT9nk9Q8q681RywT
 Ejxq7d0tnVUfWyDFDTgA/1sIUtpv9GSK8SeAJFu7oAhMt4e6eeIdbpMT2qUEQv6q
 =Sles
 -END PGP SIGNATURE-
 --
 To unsubscribe from this list: send the line unsubscribe linux-nfs in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html

This was already tracked down earlier this week. Bruce has a patch
queued to fix it in his nfsd-next branch. See this thread from earlier
this week:

Subject: Re: linux-3.14 nfsd regression

-- 
Jeff Layton jlay...@redhat.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [GIT PULL] please pull file locking changes for 3.15

2014-04-07 Thread Jeff Layton
On Mon, 7 Apr 2014 11:37:06 +1000
Stephen Rothwell s...@canb.auug.org.au wrote:

 Hi Jeff,
 
 On Fri, 4 Apr 2014 20:56:24 -0400 Jeff Layton jlay...@redhat.com wrote:
 
  Thanks for the feedback and for pulling this in anyway. I'll make sure
  to do all of that on subsequent pull requests.
 
 Also, please don't rebase what you have in linux-next before sending it to
 Linus (without good reason).  Especially if you then leave the linux-next
 included branch as it was - since that may cause conflicts in my tree
 (and so I notice the rebase).
 

I don't think I did that though, did I? Both the branches I had in this
case were based on 3.13-rc1.

Now that Linus has pulled in the changes, am I OK to rebase the
branches (or do a pull)?

-- 
Jeff Layton jlay...@redhat.com


signature.asc
Description: PGP signature


Re: [GIT PULL] please pull file locking changes for 3.15

2014-04-07 Thread Jeff Layton
On Mon, 7 Apr 2014 22:27:02 +1000
Stephen Rothwell s...@canb.auug.org.au wrote:

 Hi Jeff,
 
 On Mon, 7 Apr 2014 07:11:30 -0400 Jeff Layton jlay...@redhat.com wrote:
 
  On Mon, 7 Apr 2014 11:37:06 +1000
  Stephen Rothwell s...@canb.auug.org.au wrote:
  
   On Fri, 4 Apr 2014 20:56:24 -0400 Jeff Layton jlay...@redhat.com wrote:
   
Thanks for the feedback and for pulling this in anyway. I'll make sure
to do all of that on subsequent pull requests.
   
   Also, please don't rebase what you have in linux-next before sending it to
   Linus (without good reason).  Especially if you then leave the linux-next
   included branch as it was - since that may cause conflicts in my tree
   (and so I notice the rebase).
  
  I don't think I did that though, did I? Both the branches I had in this
  case were based on 3.13-rc1.
 
 OK, so the base stayed the same, but you recommitted all the same patches
 (I didn't check to see if the commit messages changed) which due to some
 other change caused conflicts in linux-next today :-( Not really a
 biggie, but generally, you really shouldn't rewrite your commits just
 before sending them to Linus.
 
  Now that Linus has pulled in the changes, am I OK to rebase the
  branches (or do a pull)?
 
 At this point, you should just reset your next included branch to be in
 Linus' tree beyond where Linus merged your tree, then it will be
 effectively empty (unless you then add some more patches on top - but at
 this point those should only be fixes).
 

Ahh ok, I guess I *did* alter a commit message in my locks-3.15 branch
and didn't propagate that to my linux-next branch. That probably threw
off the commit IDs. The patches themselves should have been identical
though. Sorry about that!

In any case, I've gone ahead and reset my branch as you recommend and
linux-next branch and will be more careful about that in the future.

-- 
Jeff Layton jlay...@redhat.com


signature.asc
Description: PGP signature


Re: [PATCH] NFS/RDMA: Document separate Kconfig symbols

2014-04-09 Thread Jeff Layton
On Wed, 09 Apr 2014 16:05:53 +0200
Paul Bolle pebo...@tiscali.nl wrote:

 The NFS/RDMA Kconfig symbol was split into separate options for client
 and server in commit 2e8c12e1b765 (xprtrdma: add separate Kconfig
 options for NFSoRDMA client and server support). Update the
 documentation to reflect this split.
 
 Signed-off-by: Paul Bolle pebo...@tiscali.nl
 ---
 0) Should Documentation/ describe the current release, or the current
 and previous releases? For these paragraphs I choose only the current
 release.
 

I think they're only relevant for the tree they're in. The docs are shipped
with the kernel sources, after all, so presumably someone who is
looking at these has docs that match their kernel (my omission
notwithstanding, of course)

 1) Another approach could be to not document the Kconfig setup at all,
 because the Kconfig system should, in theory, provide all help needed to
 correctly configure the build for (in this case) NFS/RDMA. But is that
 true?
 

Yeah, it does seem somewhat redundant.

 2) By the way: what's the purpose of INFINIBAND_ADDR_TRANS?
 
  Documentation/filesystems/nfs/nfs-rdma.txt | 16 +---
  1 file changed, 9 insertions(+), 7 deletions(-)
 
 diff --git a/Documentation/filesystems/nfs/nfs-rdma.txt 
 b/Documentation/filesystems/nfs/nfs-rdma.txt
 index e386f7e4bcee..724043858b08 100644
 --- a/Documentation/filesystems/nfs/nfs-rdma.txt
 +++ b/Documentation/filesystems/nfs/nfs-rdma.txt
 @@ -138,9 +138,9 @@ Installation
- Build, install, reboot
  
  The NFS/RDMA code will be enabled automatically if NFS and RDMA
 -are turned on. The NFS/RDMA client and server are configured via the 
 hidden
 -SUNRPC_XPRT_RDMA config option that depends on SUNRPC and INFINIBAND. The
 -value of SUNRPC_XPRT_RDMA will be:
 +are turned on. The NFS/RDMA client and server are configured via the
 +SUNRPC_XPRT_RDMA_CLIENT and SUNRPC_XPRT_RDMA_SERVER config options that 
 both
 +depend on SUNRPC and INFINIBAND. The default value of both options will 
 be:
  
   - N if either SUNRPC or INFINIBAND are N, in this case the NFS/RDMA 
 client
 and server will not be built
 @@ -235,8 +235,9 @@ NFS/RDMA Setup
  
- Start the NFS server
  
 -If the NFS/RDMA server was built as a module (CONFIG_SUNRPC_XPRT_RDMA=m 
 in
 -kernel config), load the RDMA transport module:
 +If the NFS/RDMA server was built as a module
 +(CONFIG_SUNRPC_XPRT_RDMA_SERVER=m in kernel config), load the RDMA
 +transport module:
  
  $ modprobe svcrdma
  
 @@ -255,8 +256,9 @@ NFS/RDMA Setup
  
- On the client system
  
 -If the NFS/RDMA client was built as a module (CONFIG_SUNRPC_XPRT_RDMA=m 
 in
 -kernel config), load the RDMA client module:
 +If the NFS/RDMA client was built as a module
 +(CONFIG_SUNRPC_XPRT_RDMA_CLIENT=m in kernel config), load the RDMA client
 +module:
  
  $ modprobe xprtrdma.ko
  

Thanks, nice catch, I had forgotten about the docs...

Reviewed-by: Jeff Layton jlay...@redhat.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC][glibc PATCH] fcntl-linux.h: add new definitions and manual updates for open file description locks

2014-04-23 Thread Jeff Layton
On Wed, 23 Apr 2014 15:00:06 -0400
a...@gnu.org (Alfred M. Szmidt) wrote:

 Likewise.  You infact write that it does get the lock information
 later in the document wrt. F_OFD_GETLK.
 
Sorry, I disagree here...GETLK is really a misnomer, IMO. TESTLK
would have been a better name.
 
GETLK are used is to get the first lock.
 
It's a way to test whether a particular lock can be applied, and to
return information about a conflicting lock if it can't. If, for
instance there is no conflicting lock, then you don't get any
lock information back (l_type just gets reset to F_UNLCK).
 
 While I kinda see your point, it isn't what GETLK does; it really does
 get you information about the first lock -- you're not testing
 anything.  It is also the terminology used in the POSIX standard.


You *are* testing a lock.

For instance, a process has locked bytes 0-5 for read in the file. I
then submit a F_GETLK request from another process and set:

l_type = F_WRLCK
l_start = 7
l_len = 1

...this range does not overlap with the original range, and so no lock
will be returned even though one is being held on the file. In order to
determine whether it should return information about a lock it has to
first _test_ whether it conflicts with the information in the struct
flock that was passed down.

Similarly, if the struct flock I submit to the F_GETLK request has this:

   l_type = F_RDLCK
   l_start = 0
   l_len = 1

...then I also will not get any information about a lock back. The
information in the lock request does not conflict with the one being
held on the file (because they are both read locks).

If F_GETLK were just getting a lock, then there would be no test
involved, but that's not how this works. F_GETLK has to test and see
whether there is a conflicting lock before it can return anything.

If all you're objecting to is the change in verbiage on those two
pieces, then I'll back that part out in the interest of wrapping this
up.

I still think I'm correct though ;)

-- 
Jeff Layton jlay...@redhat.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[GIT PULL] please pull file locking changes for v3.15 (pile #2)

2014-04-24 Thread Jeff Layton

The following changes since commit 55101e2d6ce1c780f6ee8fee5f37306971aac6cd:

  Merge git://git.kernel.org/pub/scm/virt/kvm/kvm (2014-04-14 16:21:28 -0700)

are available in the git repository at:


  git://git.samba.org/jlayton/linux.git tags/locks-v3.15-2

for you to fetch changes up to cff2fce58b2b0f59089e7edcdc38803d65057b9f:

  locks: rename FL_FILE_PVT and IS_FILE_PVT to use *_OFDLCK instead 
(2014-04-23 16:17:03 -0400)


File locking related bugfixes for v3.15 (pile #2)

- fix for a long-standing bug in __break_lease that can cause soft lockups
- renaming of file-private locks to open file description locks, and the
  command macros to more visually distinct names.

The fix for __break_lease is also in the pile of patches for which Bruce
sent a pull request, but I assume that your merge procedure will handle
that correctly.

For the other patches, I don't like the fact that we need to rename this
stuff at this late stage, but it should be settled now (hopefully).


Jeff Layton (3):
  locks: allow __break_lease to sleep even when break_time is 0
  locks: rename file-private locks to open file description locks
  locks: rename FL_FILE_PVT and IS_FILE_PVT to use *_OFDLCK instead

 arch/arm/kernel/sys_oabi-compat.c |  6 +++---
 fs/compat.c   | 14 +++---
 fs/fcntl.c| 12 ++--
 fs/locks.c| 55 
+++
 include/linux/fs.h|  2 +-
 include/uapi/asm-generic/fcntl.h  | 20 ++--
 security/selinux/hooks.c  |  6 +++---
 7 files changed, 57 insertions(+), 58 deletions(-)

-- 
Jeff Layton jlay...@redhat.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[glibc PATCH v2] fcntl-linux.h: add new definitions and manual updates for open file description locks

2014-04-24 Thread Jeff Layton
Open file description locks have been merged into the Linux kernel for
v3.15.  Add the appropriate command-value definitions and an update to
the manual that describes their usage.

ChangeLog:

2014-04-24  Jeff Layton  jlay...@redhat.com

[BZ#16839]
* manual/llio.texi: add section about open file description locks

* sysdeps/unix/sysv/linux/bits/fcntl-linux.h:
  (F_OFD_GETLK, F_OFD_SETLK, F_OFD_SETLKW): New macros.
---
 manual/examples/ofdlocks.c |  77 +
 manual/llio.texi   | 241 -
 sysdeps/unix/sysv/linux/bits/fcntl-linux.h |  17 ++
 3 files changed, 332 insertions(+), 3 deletions(-)
 create mode 100644 manual/examples/ofdlocks.c

diff --git a/manual/examples/ofdlocks.c b/manual/examples/ofdlocks.c
new file mode 100644
index ..85e193cdabe6
--- /dev/null
+++ b/manual/examples/ofdlocks.c
@@ -0,0 +1,77 @@
+/* Open File Description Locks Usage Example
+   Copyright (C) 1991-2014 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or
+   modify it under the terms of the GNU General Public License
+   as published by the Free Software Foundation; either version 2
+   of the License, or (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program; if not, see http://www.gnu.org/licenses/.
+*/
+
+#define _GNU_SOURCE
+#include stdio.h
+#include sys/types.h
+#include sys/stat.h
+#include unistd.h
+#include fcntl.h
+#include pthread.h
+
+#define FILENAME   /tmp/foo
+#define NUM_THREADS3
+#define ITERATIONS 5
+
+void *
+thread_start (void *arg)
+{
+  int i, fd, len;
+  long tid = (long) arg;
+  char buf[256];
+  struct flock lck = {
+.l_whence = SEEK_SET,
+.l_start = 0,
+.l_len = 1,
+  };
+
+  fd = open (/tmp/foo, O_RDWR | O_CREAT, 0666);
+
+  for (i = 0; i  ITERATIONS; i++)
+{
+  lck.l_type = F_WRLCK;
+  fcntl (fd, F_OFD_SETLKW, lck);
+
+  len = sprintf (buf, %d: tid=%ld fd=%d\n, i, tid, fd);
+
+  lseek (fd, 0, SEEK_END);
+  write (fd, buf, len);
+  fsync (fd);
+
+  lck.l_type = F_UNLCK;
+  fcntl (fd, F_OFD_SETLK, lck);
+
+  /* sleep to ensure lock is yielded to another thread */
+  usleep (1);
+}
+  pthread_exit (NULL);
+}
+
+int
+main (int argc, char **argv)
+{
+  long i;
+  pthread_t threads[NUM_THREADS];
+
+  truncate (FILENAME, 0);
+
+  for (i = 0; i  NUM_THREADS; i++)
+pthread_create (threads[i], NULL, thread_start, (void *) i);
+
+  pthread_exit (NULL);
+  return 0;
+}
diff --git a/manual/llio.texi b/manual/llio.texi
index 6f8adfc607d7..764f67d469cc 100644
--- a/manual/llio.texi
+++ b/manual/llio.texi
@@ -57,6 +57,10 @@ directly.)
  flags associated with open files.
 * File Locks::  Fcntl commands for implementing
  file locking.
+* Open File Description Locks:: Fcntl commands for implementing
+ open file description locking.
+* Open File Description Locks Example:: An example of open file description 
lock
+ usage
 * Interrupt Input:: Getting an asynchronous signal when
  input arrives.
 * IOCTLs::  Generic I/O Control operations.
@@ -2890,7 +2894,7 @@ Get flags associated with the open file.  @xref{File 
Status Flags}.
 Set flags associated with the open file.  @xref{File Status Flags}.
 
 @item F_GETLK
-Get a file lock.  @xref{File Locks}.
+Test a file lock.  @xref{File Locks}.
 
 @item F_SETLK
 Set or clear a file lock.  @xref{File Locks}.
@@ -2898,6 +2902,18 @@ Set or clear a file lock.  @xref{File Locks}.
 @item F_SETLKW
 Like @code{F_SETLK}, but wait for completion.  @xref{File Locks}.
 
+@item F_OFD_GETLK
+Test a open file description lock.  @xref{Open File Description Locks}.
+Specific to Linux.
+
+@item F_OFD_SETLK
+Set or clear a file lock.  @xref{Open File Description Locks}.
+Specific to Linux.
+
+@item F_OFD_SETLKW
+Like @code{F_OFD_SETLK}, but wait for completion.
+@xref{Open File Description Locks}.  Specific to Linux.
+
 @item F_GETOWN
 Get process or process group ID to receive @code{SIGIO} signals.
 @xref{Interrupt Input}.
@@ -3576,6 +3592,10 @@ set_nonblock_flag (int desc, int value)
 
 @cindex file locks
 @cindex record locking
+This section describes record locks that are associated with the process.
+There is also a different type of record lock that is associated with the
+open file description instead of the process.  @xref{Open File Description 
Locks

Re: [glibc PATCH v2] fcntl-linux.h: add new definitions and manual updates for open file description locks

2014-04-24 Thread Jeff Layton
On Thu, 24 Apr 2014 14:31:43 +0200
Michael Kerrisk (man-pages) mtk.manpa...@gmail.com wrote:

 Jeff,
 
 Did you receive my mail with comments on the previous patch?
 (I got no reply.) It looks like some of those comments that needed to
 be addressed were not.
 
 Cheers,
 
 Michael
 

I did get the mail and thought I had addressed them all. The only thing
I didn't change based on your comments was the one about the return 0
in main() being unnecessary in the example program. That's not true
since main is an int return function. I suppose we could turn it into a
void return, but does it really matter here?

 On 04/24/2014 02:15 PM, Jeff Layton wrote:
  Open file description locks have been merged into the Linux kernel for
  v3.15.  Add the appropriate command-value definitions and an update to
  the manual that describes their usage.
  
  ChangeLog:
  
  2014-04-24  Jeff Layton  jlay...@redhat.com
  
  [BZ#16839]
  * manual/llio.texi: add section about open file description locks
  
  * sysdeps/unix/sysv/linux/bits/fcntl-linux.h:
(F_OFD_GETLK, F_OFD_SETLK, F_OFD_SETLKW): New macros.
  ---
   manual/examples/ofdlocks.c |  77 +
   manual/llio.texi   | 241 
  -
   sysdeps/unix/sysv/linux/bits/fcntl-linux.h |  17 ++
   3 files changed, 332 insertions(+), 3 deletions(-)
   create mode 100644 manual/examples/ofdlocks.c
  
  diff --git a/manual/examples/ofdlocks.c b/manual/examples/ofdlocks.c
  new file mode 100644
  index ..85e193cdabe6
  --- /dev/null
  +++ b/manual/examples/ofdlocks.c
  @@ -0,0 +1,77 @@
  +/* Open File Description Locks Usage Example
  +   Copyright (C) 1991-2014 Free Software Foundation, Inc.
  +
  +   This program is free software; you can redistribute it and/or
  +   modify it under the terms of the GNU General Public License
  +   as published by the Free Software Foundation; either version 2
  +   of the License, or (at your option) any later version.
  +
  +   This program is distributed in the hope that it will be useful,
  +   but WITHOUT ANY WARRANTY; without even the implied warranty of
  +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  +   GNU General Public License for more details.
  +
  +   You should have received a copy of the GNU General Public License
  +   along with this program; if not, see http://www.gnu.org/licenses/.
  +*/
  +
  +#define _GNU_SOURCE
  +#include stdio.h
  +#include sys/types.h
  +#include sys/stat.h
  +#include unistd.h
  +#include fcntl.h
  +#include pthread.h
  +
  +#define FILENAME   /tmp/foo
  +#define NUM_THREADS3
  +#define ITERATIONS 5
  +
  +void *
  +thread_start (void *arg)
  +{
  +  int i, fd, len;
  +  long tid = (long) arg;
  +  char buf[256];
  +  struct flock lck = {
  +.l_whence = SEEK_SET,
  +.l_start = 0,
  +.l_len = 1,
  +  };
  +
  +  fd = open (/tmp/foo, O_RDWR | O_CREAT, 0666);
  +
  +  for (i = 0; i  ITERATIONS; i++)
  +{
  +  lck.l_type = F_WRLCK;
  +  fcntl (fd, F_OFD_SETLKW, lck);
  +
  +  len = sprintf (buf, %d: tid=%ld fd=%d\n, i, tid, fd);
  +
  +  lseek (fd, 0, SEEK_END);
  +  write (fd, buf, len);
  +  fsync (fd);
  +
  +  lck.l_type = F_UNLCK;
  +  fcntl (fd, F_OFD_SETLK, lck);
  +
  +  /* sleep to ensure lock is yielded to another thread */
  +  usleep (1);
  +}
  +  pthread_exit (NULL);
  +}
  +
  +int
  +main (int argc, char **argv)
  +{
  +  long i;
  +  pthread_t threads[NUM_THREADS];
  +
  +  truncate (FILENAME, 0);
  +
  +  for (i = 0; i  NUM_THREADS; i++)
  +pthread_create (threads[i], NULL, thread_start, (void *) i);
  +
  +  pthread_exit (NULL);
  +  return 0;
  +}
  diff --git a/manual/llio.texi b/manual/llio.texi
  index 6f8adfc607d7..764f67d469cc 100644
  --- a/manual/llio.texi
  +++ b/manual/llio.texi
  @@ -57,6 +57,10 @@ directly.)
flags associated with open files.
   * File Locks::  Fcntl commands for implementing
file locking.
  +* Open File Description Locks:: Fcntl commands for implementing
  + open file description locking.
  +* Open File Description Locks Example:: An example of open file 
  description lock
  + usage
   * Interrupt Input:: Getting an asynchronous signal when
input arrives.
   * IOCTLs::  Generic I/O Control operations.
  @@ -2890,7 +2894,7 @@ Get flags associated with the open file.  @xref{File 
  Status Flags}.
   Set flags associated with the open file.  @xref{File Status Flags}.
   
   @item F_GETLK
  -Get a file lock.  @xref{File Locks}.
  +Test a file lock.  @xref{File Locks}.
   
   @item F_SETLK
   Set or clear a file lock.  @xref{File Locks}.
  @@ -2898,6 +2902,18 @@ Set or clear a file lock

[glibc PATCH v3] fcntl-linux.h: add new definitions and manual updates for open file description locks

2014-04-24 Thread Jeff Layton
Open file description locks have been merged into the Linux kernel for
v3.15.  Add the appropriate command-value definitions and an update to
the manual that describes their usage.

ChangeLog:

2014-04-24  Jeff Layton  jlay...@redhat.com

[BZ#16839]
* manual/llio.texi: add section about open file description locks

* sysdeps/unix/sysv/linux/bits/fcntl-linux.h:
  (F_OFD_GETLK, F_OFD_SETLK, F_OFD_SETLKW): New macros.
---
 manual/examples/ofdlocks.c |  77 +
 manual/llio.texi   | 241 -
 sysdeps/unix/sysv/linux/bits/fcntl-linux.h |  17 ++
 3 files changed, 332 insertions(+), 3 deletions(-)
 create mode 100644 manual/examples/ofdlocks.c

diff --git a/manual/examples/ofdlocks.c b/manual/examples/ofdlocks.c
new file mode 100644
index ..85e193cdabe6
--- /dev/null
+++ b/manual/examples/ofdlocks.c
@@ -0,0 +1,77 @@
+/* Open File Description Locks Usage Example
+   Copyright (C) 1991-2014 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or
+   modify it under the terms of the GNU General Public License
+   as published by the Free Software Foundation; either version 2
+   of the License, or (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program; if not, see http://www.gnu.org/licenses/.
+*/
+
+#define _GNU_SOURCE
+#include stdio.h
+#include sys/types.h
+#include sys/stat.h
+#include unistd.h
+#include fcntl.h
+#include pthread.h
+
+#define FILENAME   /tmp/foo
+#define NUM_THREADS3
+#define ITERATIONS 5
+
+void *
+thread_start (void *arg)
+{
+  int i, fd, len;
+  long tid = (long) arg;
+  char buf[256];
+  struct flock lck = {
+.l_whence = SEEK_SET,
+.l_start = 0,
+.l_len = 1,
+  };
+
+  fd = open (/tmp/foo, O_RDWR | O_CREAT, 0666);
+
+  for (i = 0; i  ITERATIONS; i++)
+{
+  lck.l_type = F_WRLCK;
+  fcntl (fd, F_OFD_SETLKW, lck);
+
+  len = sprintf (buf, %d: tid=%ld fd=%d\n, i, tid, fd);
+
+  lseek (fd, 0, SEEK_END);
+  write (fd, buf, len);
+  fsync (fd);
+
+  lck.l_type = F_UNLCK;
+  fcntl (fd, F_OFD_SETLK, lck);
+
+  /* sleep to ensure lock is yielded to another thread */
+  usleep (1);
+}
+  pthread_exit (NULL);
+}
+
+int
+main (int argc, char **argv)
+{
+  long i;
+  pthread_t threads[NUM_THREADS];
+
+  truncate (FILENAME, 0);
+
+  for (i = 0; i  NUM_THREADS; i++)
+pthread_create (threads[i], NULL, thread_start, (void *) i);
+
+  pthread_exit (NULL);
+  return 0;
+}
diff --git a/manual/llio.texi b/manual/llio.texi
index 6f8adfc607d7..0e6e387a1bd9 100644
--- a/manual/llio.texi
+++ b/manual/llio.texi
@@ -57,6 +57,10 @@ directly.)
  flags associated with open files.
 * File Locks::  Fcntl commands for implementing
  file locking.
+* Open File Description Locks:: Fcntl commands for implementing
+ open file description locking.
+* Open File Description Locks Example:: An example of open file description 
lock
+ usage
 * Interrupt Input:: Getting an asynchronous signal when
  input arrives.
 * IOCTLs::  Generic I/O Control operations.
@@ -2890,7 +2894,7 @@ Get flags associated with the open file.  @xref{File 
Status Flags}.
 Set flags associated with the open file.  @xref{File Status Flags}.
 
 @item F_GETLK
-Get a file lock.  @xref{File Locks}.
+Test a file lock.  @xref{File Locks}.
 
 @item F_SETLK
 Set or clear a file lock.  @xref{File Locks}.
@@ -2898,6 +2902,18 @@ Set or clear a file lock.  @xref{File Locks}.
 @item F_SETLKW
 Like @code{F_SETLK}, but wait for completion.  @xref{File Locks}.
 
+@item F_OFD_GETLK
+Test a open file description lock.  @xref{Open File Description Locks}.
+Specific to Linux.
+
+@item F_OFD_SETLK
+Set or clear a file lock.  @xref{Open File Description Locks}.
+Specific to Linux.
+
+@item F_OFD_SETLKW
+Like @code{F_OFD_SETLK}, but wait for completion.
+@xref{Open File Description Locks}.  Specific to Linux.
+
 @item F_GETOWN
 Get process or process group ID to receive @code{SIGIO} signals.
 @xref{Interrupt Input}.
@@ -3576,6 +3592,10 @@ set_nonblock_flag (int desc, int value)
 
 @cindex file locks
 @cindex record locking
+This section describes record locks that are associated with the process.
+There is also a different type of record lock that is associated with the
+open file description instead of the process.  @xref{Open File Description 
Locks

[glibc PATCHi v4] fcntl-linux.h: add new definitions and manual updates for open file description locks

2014-04-24 Thread Jeff Layton
Open file description locks have been merged into the Linux kernel for
v3.15.  Add the appropriate command-value definitions and an update to
the manual that describes their usage.

ChangeLog:

2014-04-24  Jeff Layton  jlay...@redhat.com

[BZ#16839]
* manual/llio.texi: add section about open file description locks

* sysdeps/unix/sysv/linux/bits/fcntl-linux.h:
  (F_OFD_GETLK, F_OFD_SETLK, F_OFD_SETLKW): New macros.
---
 manual/examples/ofdlocks.c |  77 +
 manual/llio.texi   | 241 -
 sysdeps/unix/sysv/linux/bits/fcntl-linux.h |  17 ++
 3 files changed, 332 insertions(+), 3 deletions(-)
 create mode 100644 manual/examples/ofdlocks.c

diff --git a/manual/examples/ofdlocks.c b/manual/examples/ofdlocks.c
new file mode 100644
index ..85e193cdabe6
--- /dev/null
+++ b/manual/examples/ofdlocks.c
@@ -0,0 +1,77 @@
+/* Open File Description Locks Usage Example
+   Copyright (C) 1991-2014 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or
+   modify it under the terms of the GNU General Public License
+   as published by the Free Software Foundation; either version 2
+   of the License, or (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program; if not, see http://www.gnu.org/licenses/.
+*/
+
+#define _GNU_SOURCE
+#include stdio.h
+#include sys/types.h
+#include sys/stat.h
+#include unistd.h
+#include fcntl.h
+#include pthread.h
+
+#define FILENAME   /tmp/foo
+#define NUM_THREADS3
+#define ITERATIONS 5
+
+void *
+thread_start (void *arg)
+{
+  int i, fd, len;
+  long tid = (long) arg;
+  char buf[256];
+  struct flock lck = {
+.l_whence = SEEK_SET,
+.l_start = 0,
+.l_len = 1,
+  };
+
+  fd = open (/tmp/foo, O_RDWR | O_CREAT, 0666);
+
+  for (i = 0; i  ITERATIONS; i++)
+{
+  lck.l_type = F_WRLCK;
+  fcntl (fd, F_OFD_SETLKW, lck);
+
+  len = sprintf (buf, %d: tid=%ld fd=%d\n, i, tid, fd);
+
+  lseek (fd, 0, SEEK_END);
+  write (fd, buf, len);
+  fsync (fd);
+
+  lck.l_type = F_UNLCK;
+  fcntl (fd, F_OFD_SETLK, lck);
+
+  /* sleep to ensure lock is yielded to another thread */
+  usleep (1);
+}
+  pthread_exit (NULL);
+}
+
+int
+main (int argc, char **argv)
+{
+  long i;
+  pthread_t threads[NUM_THREADS];
+
+  truncate (FILENAME, 0);
+
+  for (i = 0; i  NUM_THREADS; i++)
+pthread_create (threads[i], NULL, thread_start, (void *) i);
+
+  pthread_exit (NULL);
+  return 0;
+}
diff --git a/manual/llio.texi b/manual/llio.texi
index 6f8adfc607d7..864060dc7140 100644
--- a/manual/llio.texi
+++ b/manual/llio.texi
@@ -57,6 +57,10 @@ directly.)
  flags associated with open files.
 * File Locks::  Fcntl commands for implementing
  file locking.
+* Open File Description Locks:: Fcntl commands for implementing
+ open file description locking.
+* Open File Description Locks Example:: An example of open file description 
lock
+ usage
 * Interrupt Input:: Getting an asynchronous signal when
  input arrives.
 * IOCTLs::  Generic I/O Control operations.
@@ -2890,7 +2894,7 @@ Get flags associated with the open file.  @xref{File 
Status Flags}.
 Set flags associated with the open file.  @xref{File Status Flags}.
 
 @item F_GETLK
-Get a file lock.  @xref{File Locks}.
+Test a file lock.  @xref{File Locks}.
 
 @item F_SETLK
 Set or clear a file lock.  @xref{File Locks}.
@@ -2898,6 +2902,18 @@ Set or clear a file lock.  @xref{File Locks}.
 @item F_SETLKW
 Like @code{F_SETLK}, but wait for completion.  @xref{File Locks}.
 
+@item F_OFD_GETLK
+Test an open file description lock.  @xref{Open File Description Locks}.
+Specific to Linux.
+
+@item F_OFD_SETLK
+Set or clear an open file description lock.  @xref{Open File Description 
Locks}.
+Specific to Linux.
+
+@item F_OFD_SETLKW
+Like @code{F_OFD_SETLK}, but block until lock is acquired.
+@xref{Open File Description Locks}.  Specific to Linux.
+
 @item F_GETOWN
 Get process or process group ID to receive @code{SIGIO} signals.
 @xref{Interrupt Input}.
@@ -3576,6 +3592,10 @@ set_nonblock_flag (int desc, int value)
 
 @cindex file locks
 @cindex record locking
+This section describes record locks that are associated with the process.
+There is also a different type of record lock that is associated with the
+open file description instead of the process.  @xref

Re: [glibc PATCHi v4] fcntl-linux.h: add new definitions and manual updates for open file description locks

2014-04-24 Thread Jeff Layton
On Thu, 24 Apr 2014 17:07:11 +0200
Michael Kerrisk (man-pages) mtk.manpa...@gmail.com wrote:

 Jeff,
 
 Please let us know what changed from one version to the next. I've no
 idea without looking into the patch itself whether this inorporates my
 comments on the v2 patch...
 
 Cheers,
 
 Michael
 
 

It's a little hard to do that specifically since much of what changed
were trivial s/a/an/ type changes. But basically I just incorporated
your changes and fixed a few other places.

Since this is already so close, what might be best is to just ask the
glibc folks to merge what we have, and then if there are other changes
that are needed later, we can base patches on top of it?

 
 On Thu, Apr 24, 2014 at 4:09 PM, Jeff Layton jlay...@redhat.com wrote:
  Open file description locks have been merged into the Linux kernel for
  v3.15.  Add the appropriate command-value definitions and an update to
  the manual that describes their usage.
 
  ChangeLog:
 
  2014-04-24  Jeff Layton  jlay...@redhat.com
 
  [BZ#16839]
  * manual/llio.texi: add section about open file description locks
 
  * sysdeps/unix/sysv/linux/bits/fcntl-linux.h:
(F_OFD_GETLK, F_OFD_SETLK, F_OFD_SETLKW): New macros.
  ---
   manual/examples/ofdlocks.c |  77 +
   manual/llio.texi   | 241 
  -
   sysdeps/unix/sysv/linux/bits/fcntl-linux.h |  17 ++
   3 files changed, 332 insertions(+), 3 deletions(-)
   create mode 100644 manual/examples/ofdlocks.c
 
  diff --git a/manual/examples/ofdlocks.c b/manual/examples/ofdlocks.c
  new file mode 100644
  index ..85e193cdabe6
  --- /dev/null
  +++ b/manual/examples/ofdlocks.c
  @@ -0,0 +1,77 @@
  +/* Open File Description Locks Usage Example
  +   Copyright (C) 1991-2014 Free Software Foundation, Inc.
  +
  +   This program is free software; you can redistribute it and/or
  +   modify it under the terms of the GNU General Public License
  +   as published by the Free Software Foundation; either version 2
  +   of the License, or (at your option) any later version.
  +
  +   This program is distributed in the hope that it will be useful,
  +   but WITHOUT ANY WARRANTY; without even the implied warranty of
  +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  +   GNU General Public License for more details.
  +
  +   You should have received a copy of the GNU General Public License
  +   along with this program; if not, see http://www.gnu.org/licenses/.
  +*/
  +
  +#define _GNU_SOURCE
  +#include stdio.h
  +#include sys/types.h
  +#include sys/stat.h
  +#include unistd.h
  +#include fcntl.h
  +#include pthread.h
  +
  +#define FILENAME   /tmp/foo
  +#define NUM_THREADS3
  +#define ITERATIONS 5
  +
  +void *
  +thread_start (void *arg)
  +{
  +  int i, fd, len;
  +  long tid = (long) arg;
  +  char buf[256];
  +  struct flock lck = {
  +.l_whence = SEEK_SET,
  +.l_start = 0,
  +.l_len = 1,
  +  };
  +
  +  fd = open (/tmp/foo, O_RDWR | O_CREAT, 0666);
  +
  +  for (i = 0; i  ITERATIONS; i++)
  +{
  +  lck.l_type = F_WRLCK;
  +  fcntl (fd, F_OFD_SETLKW, lck);
  +
  +  len = sprintf (buf, %d: tid=%ld fd=%d\n, i, tid, fd);
  +
  +  lseek (fd, 0, SEEK_END);
  +  write (fd, buf, len);
  +  fsync (fd);
  +
  +  lck.l_type = F_UNLCK;
  +  fcntl (fd, F_OFD_SETLK, lck);
  +
  +  /* sleep to ensure lock is yielded to another thread */
  +  usleep (1);
  +}
  +  pthread_exit (NULL);
  +}
  +
  +int
  +main (int argc, char **argv)
  +{
  +  long i;
  +  pthread_t threads[NUM_THREADS];
  +
  +  truncate (FILENAME, 0);
  +
  +  for (i = 0; i  NUM_THREADS; i++)
  +pthread_create (threads[i], NULL, thread_start, (void *) i);
  +
  +  pthread_exit (NULL);
  +  return 0;
  +}
  diff --git a/manual/llio.texi b/manual/llio.texi
  index 6f8adfc607d7..864060dc7140 100644
  --- a/manual/llio.texi
  +++ b/manual/llio.texi
  @@ -57,6 +57,10 @@ directly.)
flags associated with open files.
   * File Locks::  Fcntl commands for implementing
file locking.
  +* Open File Description Locks:: Fcntl commands for implementing
  + open file description locking.
  +* Open File Description Locks Example:: An example of open file 
  description lock
  + usage
   * Interrupt Input:: Getting an asynchronous signal when
input arrives.
   * IOCTLs::  Generic I/O Control operations.
  @@ -2890,7 +2894,7 @@ Get flags associated with the open file.  @xref{File 
  Status Flags}.
   Set flags associated with the open file.  @xref{File Status Flags}.
 
   @item F_GETLK
  -Get a file lock.  @xref{File Locks}.
  +Test a file lock.  @xref{File Locks}.
 
   @item

Re: [glibc PATCHi v4] fcntl-linux.h: add new definitions and manual updates for open file description locks

2014-04-24 Thread Jeff Layton
On Thu, 24 Apr 2014 17:34:35 +0200
Michael Kerrisk (man-pages) mtk.manpa...@gmail.com wrote:

 On Thu, Apr 24, 2014 at 5:11 PM, Jeff Layton jlay...@redhat.com wrote:
  On Thu, 24 Apr 2014 17:07:11 +0200
  Michael Kerrisk (man-pages) mtk.manpa...@gmail.com wrote:
 
  Jeff,
 
  Please let us know what changed from one version to the next. I've no
  idea without looking into the patch itself whether this inorporates my
  comments on the v2 patch...
 
  Cheers,
 
  Michael
 
 
 
  It's a little hard to do that specifically since much of what changed
  were trivial s/a/an/ type changes. But basically I just incorporated
  your changes and fixed a few other places.
 
 I think you missed my point. At the moment, there is no feedback loop.
 Either reply to my mail or just add a note such as:
 

Ok, fair enough. I'll do that from here on out if there are other
revisions needed.

 * Incorporated changes after Michael Kerrisk's comments on v2 patch
 
 In the next patch revision mail.
 
  Since this is already so close, what might be best is to just ask the
  glibc folks to merge what we have, and then if there are other changes
  that are needed later, we can base patches on top of it?
 
 I don't think I'll have more comments. The text looks pretty close to
 done to me. Good work!
 

Thanks for your patience in reviewing!

Carlos, can you help shepherd this into glibc tree? Are there any other
changes that need to be made before it can be merged?

Thanks,
Jeff


 Cheers,
 
 Michael
 
  On Thu, Apr 24, 2014 at 4:09 PM, Jeff Layton jlay...@redhat.com wrote:
   Open file description locks have been merged into the Linux kernel for
   v3.15.  Add the appropriate command-value definitions and an update to
   the manual that describes their usage.
  
   ChangeLog:
  
   2014-04-24  Jeff Layton  jlay...@redhat.com
  
   [BZ#16839]
   * manual/llio.texi: add section about open file description locks
  
   * sysdeps/unix/sysv/linux/bits/fcntl-linux.h:
 (F_OFD_GETLK, F_OFD_SETLK, F_OFD_SETLKW): New macros.
   ---
manual/examples/ofdlocks.c |  77 +
manual/llio.texi   | 241 
   -
sysdeps/unix/sysv/linux/bits/fcntl-linux.h |  17 ++
3 files changed, 332 insertions(+), 3 deletions(-)
create mode 100644 manual/examples/ofdlocks.c
  
   diff --git a/manual/examples/ofdlocks.c b/manual/examples/ofdlocks.c
   new file mode 100644
   index ..85e193cdabe6
   --- /dev/null
   +++ b/manual/examples/ofdlocks.c
   @@ -0,0 +1,77 @@
   +/* Open File Description Locks Usage Example
   +   Copyright (C) 1991-2014 Free Software Foundation, Inc.
   +
   +   This program is free software; you can redistribute it and/or
   +   modify it under the terms of the GNU General Public License
   +   as published by the Free Software Foundation; either version 2
   +   of the License, or (at your option) any later version.
   +
   +   This program is distributed in the hope that it will be useful,
   +   but WITHOUT ANY WARRANTY; without even the implied warranty of
   +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
   +   GNU General Public License for more details.
   +
   +   You should have received a copy of the GNU General Public License
   +   along with this program; if not, see http://www.gnu.org/licenses/.
   +*/
   +
   +#define _GNU_SOURCE
   +#include stdio.h
   +#include sys/types.h
   +#include sys/stat.h
   +#include unistd.h
   +#include fcntl.h
   +#include pthread.h
   +
   +#define FILENAME   /tmp/foo
   +#define NUM_THREADS3
   +#define ITERATIONS 5
   +
   +void *
   +thread_start (void *arg)
   +{
   +  int i, fd, len;
   +  long tid = (long) arg;
   +  char buf[256];
   +  struct flock lck = {
   +.l_whence = SEEK_SET,
   +.l_start = 0,
   +.l_len = 1,
   +  };
   +
   +  fd = open (/tmp/foo, O_RDWR | O_CREAT, 0666);
   +
   +  for (i = 0; i  ITERATIONS; i++)
   +{
   +  lck.l_type = F_WRLCK;
   +  fcntl (fd, F_OFD_SETLKW, lck);
   +
   +  len = sprintf (buf, %d: tid=%ld fd=%d\n, i, tid, fd);
   +
   +  lseek (fd, 0, SEEK_END);
   +  write (fd, buf, len);
   +  fsync (fd);
   +
   +  lck.l_type = F_UNLCK;
   +  fcntl (fd, F_OFD_SETLK, lck);
   +
   +  /* sleep to ensure lock is yielded to another thread */
   +  usleep (1);
   +}
   +  pthread_exit (NULL);
   +}
   +
   +int
   +main (int argc, char **argv)
   +{
   +  long i;
   +  pthread_t threads[NUM_THREADS];
   +
   +  truncate (FILENAME, 0);
   +
   +  for (i = 0; i  NUM_THREADS; i++)
   +pthread_create (threads[i], NULL, thread_start, (void *) i);
   +
   +  pthread_exit (NULL);
   +  return 0;
   +}
   diff --git a/manual/llio.texi b/manual/llio.texi
   index 6f8adfc607d7..864060dc7140 100644
   --- a/manual/llio.texi
   +++ b/manual/llio.texi
   @@ -57,6 +57,10 @@ directly.)
 flags associated

Re: [GIT PULL] please pull file locking changes for v3.15 (pile #2)

2014-04-25 Thread Jeff Layton
On Fri, 25 Apr 2014 05:57:39 -0400
J. Bruce Fields bfie...@fieldses.org wrote:

 On Thu, Apr 24, 2014 at 07:45:57AM -0400, Jeff Layton wrote:
  
  The following changes since commit 55101e2d6ce1c780f6ee8fee5f37306971aac6cd:
  
Merge git://git.kernel.org/pub/scm/virt/kvm/kvm (2014-04-14 16:21:28 
  -0700)
  
  are available in the git repository at:
  
  
git://git.samba.org/jlayton/linux.git tags/locks-v3.15-2
  
  for you to fetch changes up to cff2fce58b2b0f59089e7edcdc38803d65057b9f:
  
locks: rename FL_FILE_PVT and IS_FILE_PVT to use *_OFDLCK instead 
  (2014-04-23 16:17:03 -0400)
  
  
  File locking related bugfixes for v3.15 (pile #2)
  
  - fix for a long-standing bug in __break_lease that can cause soft lockups
  - renaming of file-private locks to open file description locks, and the
command macros to more visually distinct names.
  
  The fix for __break_lease is also in the pile of patches for which Bruce
  sent a pull request, but I assume that your merge procedure will handle
  that correctly.
  
  For the other patches, I don't like the fact that we need to rename this
  stuff at this late stage, but it should be settled now (hopefully).
  
  
  Jeff Layton (3):
locks: allow __break_lease to sleep even when break_time is 0
locks: rename file-private locks to open file description locks
 
 Nit: they're descriptors, not descriptions.
 
 But since that only affects comments and changelogs (not even the macros
 names, which are using OFD), that's a very nitty nit
 
 --b.
 

No, they are _descriptions_. The descriptor is the number you get back
from doing something like an open(). The description is the thing that
the descriptor points to. This is the terminology that POSIX uses, and
is what was recommended by the glibc and manpages maintainers.

The glibc doc patches lay this out in detail. I'm not terribly thrilled
with the name either, fwiw, but it is better than file-private.

locks: rename FL_FILE_PVT and IS_FILE_PVT to use *_OFDLCK instead
  
   arch/arm/kernel/sys_oabi-compat.c |  6 +++---
   fs/compat.c   | 14 +++---
   fs/fcntl.c| 12 ++--
   fs/locks.c| 55 
  +++
   include/linux/fs.h|  2 +-
   include/uapi/asm-generic/fcntl.h  | 20 ++--
   security/selinux/hooks.c  |  6 +++---
   7 files changed, 57 insertions(+), 58 deletions(-)
  
  -- 
  Jeff Layton jlay...@redhat.com


-- 
Jeff Layton jlay...@redhat.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] locks: rename file-private locks to file-description locks

2014-04-21 Thread Jeff Layton
File-private locks have been merged into Linux for v3.15, and *now*
people are commenting that the name and macro definitions for the new
file-private locks suck.

...and I can't even disagree. The names and command macros do suck.

We're going to have to live with these for a long time, so it's
important that we be happy with the names before we're stuck with them.

The consensus on the lists so far is that they should be rechristened as
file-description locks.

This patch makes the following changes that I think are necessary before
v3.15 ships:

1) rename the command macros to their new names. These end up in the uapi
   headers and so are part of the external-facing API. It turns out that
   glibc doesn't actually use the fcntl.h uapi header, but it's hard to
   be sure that something else won't. Changing it now is safest.

2) make the the /proc/locks output display these as type FDLOCK

The rest of the renaming can wait until v3.16, since everything else
isn't visible outside of the kernel.

Cc: Michael Kerrisk mtk.manpa...@gmail.com
Signed-off-by: Jeff Layton jlay...@redhat.com
---
 arch/arm/kernel/sys_oabi-compat.c |  6 +++---
 fs/compat.c   | 14 +++---
 fs/fcntl.c| 12 ++--
 fs/locks.c| 14 +++---
 include/uapi/asm-generic/fcntl.h  | 20 ++--
 security/selinux/hooks.c  |  6 +++---
 6 files changed, 36 insertions(+), 36 deletions(-)

diff --git a/arch/arm/kernel/sys_oabi-compat.c 
b/arch/arm/kernel/sys_oabi-compat.c
index 702bd329d9d0..d92aa1277e40 100644
--- a/arch/arm/kernel/sys_oabi-compat.c
+++ b/arch/arm/kernel/sys_oabi-compat.c
@@ -203,9 +203,9 @@ asmlinkage long sys_oabi_fcntl64(unsigned int fd, unsigned 
int cmd,
int ret;
 
switch (cmd) {
-   case F_GETLKP:
-   case F_SETLKP:
-   case F_SETLKPW:
+   case F_FD_GETLK:
+   case F_FD_SETLK:
+   case F_FD_SETLKW:
case F_GETLK64:
case F_SETLK64:
case F_SETLKW64:
diff --git a/fs/compat.c b/fs/compat.c
index ca926ad0430c..4933d5b32ace 100644
--- a/fs/compat.c
+++ b/fs/compat.c
@@ -457,9 +457,9 @@ COMPAT_SYSCALL_DEFINE3(fcntl64, unsigned int, fd, unsigned 
int, cmd,
case F_GETLK64:
case F_SETLK64:
case F_SETLKW64:
-   case F_GETLKP:
-   case F_SETLKP:
-   case F_SETLKPW:
+   case F_FD_GETLK:
+   case F_FD_SETLK:
+   case F_FD_SETLKW:
ret = get_compat_flock64(f, compat_ptr(arg));
if (ret != 0)
break;
@@ -468,7 +468,7 @@ COMPAT_SYSCALL_DEFINE3(fcntl64, unsigned int, fd, unsigned 
int, cmd,
conv_cmd = convert_fcntl_cmd(cmd);
ret = sys_fcntl(fd, conv_cmd, (unsigned long)f);
set_fs(old_fs);
-   if ((conv_cmd == F_GETLK || conv_cmd == F_GETLKP)  ret == 0) {
+   if ((conv_cmd == F_GETLK || conv_cmd == F_FD_GETLK)  ret == 
0) {
/* need to return lock information - see above for 
commentary */
if (f.l_start  COMPAT_LOFF_T_MAX)
ret = -EOVERFLOW;
@@ -493,9 +493,9 @@ COMPAT_SYSCALL_DEFINE3(fcntl, unsigned int, fd, unsigned 
int, cmd,
case F_GETLK64:
case F_SETLK64:
case F_SETLKW64:
-   case F_GETLKP:
-   case F_SETLKP:
-   case F_SETLKPW:
+   case F_FD_GETLK:
+   case F_FD_SETLK:
+   case F_FD_SETLKW:
return -EINVAL;
}
return compat_sys_fcntl64(fd, cmd, arg);
diff --git a/fs/fcntl.c b/fs/fcntl.c
index 9ead1596399a..a62d9ba3874b 100644
--- a/fs/fcntl.c
+++ b/fs/fcntl.c
@@ -274,15 +274,15 @@ static long do_fcntl(int fd, unsigned int cmd, unsigned 
long arg,
break;
 #if BITS_PER_LONG != 32
/* 32-bit arches must use fcntl64() */
-   case F_GETLKP:
+   case F_FD_GETLK:
 #endif
case F_GETLK:
err = fcntl_getlk(filp, cmd, (struct flock __user *) arg);
break;
 #if BITS_PER_LONG != 32
/* 32-bit arches must use fcntl64() */
-   case F_SETLKP:
-   case F_SETLKPW:
+   case F_FD_SETLK:
+   case F_FD_SETLKW:
 #endif
/* Fallthrough */
case F_SETLK:
@@ -399,13 +399,13 @@ SYSCALL_DEFINE3(fcntl64, unsigned int, fd, unsigned int, 
cmd,

switch (cmd) {
case F_GETLK64:
-   case F_GETLKP:
+   case F_FD_GETLK:
err = fcntl_getlk64(f.file, cmd, (struct flock64 __user *) arg);
break;
case F_SETLK64:
case F_SETLKW64:
-   case F_SETLKP:
-   case F_SETLKPW:
+   case F_FD_SETLK:
+   case F_FD_SETLKW:
err = fcntl_setlk64(fd, f.file, cmd,
(struct flock64 __user *) arg);
break;
diff --git a/fs/locks.c b/fs/locks.c
index b380f5543614..bfe5df7aa685 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -1941,7 +1941,7 @@ int

Re: [PATCH] locks: rename file-private locks to file-description locks

2014-04-21 Thread Jeff Layton
On Mon, 21 Apr 2014 12:10:04 -0400
Rich Felker dal...@libc.org wrote:

 On Mon, Apr 21, 2014 at 04:23:54PM +0200, Michael Kerrisk (man-pages) wrote:
  On 04/21/2014 04:02 PM, Rich Felker wrote:
   On Mon, Apr 21, 2014 at 09:45:35AM -0400, Jeff Layton wrote:
   File-private locks have been merged into Linux for v3.15, and *now*
   people are commenting that the name and macro definitions for the new
   file-private locks suck.
  
   and I can't even disagree. The names and command macros do suck.
  
   We're going to have to live with these for a long time, so it's
   important that we be happy with the names before we're stuck with them.
  
   The consensus on the lists so far is that they should be rechristened as
   file-description locks.
  
   This patch makes the following changes that I think are necessary before
   v3.15 ships:
  
   1) rename the command macros to their new names. These end up in the uapi
  headers and so are part of the external-facing API. It turns out that
  glibc doesn't actually use the fcntl.h uapi header, but it's hard to
  be sure that something else won't. Changing it now is safest.
  
   2) make the the /proc/locks output display these as type FDLOCK
  
   The rest of the renaming can wait until v3.16, since everything else
   isn't visible outside of the kernel.
   
   I'm sorry I didn't chime in on this earlier, but I really prefer the
   (somewhat bad) current naming (private) to the
   ridiculously-confusing use of FD to mean file descriptION when
   everybody is used to it meaning file descriptOR. The potential for
   confusion that these are file descriptOR locks (they're not) is much
   more of a problem, IMO, than the confusion about what private means
   (since it doesn't have an established meaning in this context.
   
   Thus my vote is for leaving things the way the kernel did it already.
  
  There's at least two problems to solve here:
  
  1) File private locks is _meaningless_ as a term. Elsewhere
 
 That's the benefit of it: it doesn't clash with any
 already-established meaning. I agree it's less than ideal, but all the
 alternatives I've seen so far are worse.
 
 
  (http://thread.gmane.org/gmane.network.samba.internals/76414/focus=1685376),
 I suggested various alternatives. File-handle locks [*] was my
 
 This is also bad. Handle also has a defined meaning in POSIX. See
 XSH 2.5.1:
 
 http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html
 

Not to mention that filehandle has a different meaning altogether in
NFS parlance. I think we should avoid handle altogether in the name.

 initial preference, and I also suggested file-description locks
 and noted the drawbacks of that term. I think it's insufficient
 to say stick with the existing poor name--if you have
 something better, then please propose it. (Note by the way
 that for nearly a decade now, the open(2) man page has followed
 POSIX in using the term open file description. Full disclosure:
 of course, I'm responsible for that change in the man page.)
 
 I'm well aware of that. The problem is that the proposed API is using
 the two-letter abbreviation FD, which ALWAYS means file descriptor and
 NEVER means file description (in existing usage) to mean file
 description. That's what's wrong.
 

Fair enough. Assuming we kept file-description locks as a name, what
would you propose as new macro names?

  2) The new API constants (F_SETLKP, F_SETLKPW, F_GETLKP) have names
 that are visually very close to the traditional POSIX lock names 
 (F_SETLK, F_SETLKW, F_GETLK). That's an accident waiting to happen
 when someone mistypes in code and/or misses such a misttyping
 when reading code. That really must be fixed.
 
 I agree, but I don't think making it worse is a solution.
 

-- 
Jeff Layton jlay...@redhat.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] locks: rename file-private locks to file-description locks

2014-04-21 Thread Jeff Layton
On Mon, 21 Apr 2014 09:09:27 -0700
Christoph Hellwig h...@infradead.org wrote:

 On Mon, Apr 21, 2014 at 04:23:54PM +0200, Michael Kerrisk (man-pages) wrote:
  
  There's at least two problems to solve here:
  
  1) File private locks is _meaningless_ as a term. Elsewhere
 
  (http://thread.gmane.org/gmane.network.samba.internals/76414/focus=1685376),
 
 It's indeed not a very good choice, but the new name is even worse.
 Just call them non-broken locks? :)  Or not give them a name an just
 append a 2 to the fcntls? :)
 

I think we'll need to give them a name, if only to make it possible to
document this stuff.

I'm in Jeremy's camp on this one. I don't really care what that name
*is*. I just need to know what it is so I can finish up the docs and
make any changes to the interface that are necessary.

  2) The new API constants (F_SETLKP, F_SETLKPW, F_GETLKP) have names
 that are visually very close to the traditional POSIX lock names 
 (F_SETLK, F_SETLKW, F_GETLK). That's an accident waiting to happen
 when someone mistypes in code and/or misses such a misttyping
 when reading code. That really must be fixed.
 
 I don't think so.  They also should have a name very similar because
 they have the same semantics with a major bug fixed.  In fact I can't
 think of anyone who would actually want the old behavior.
 

On this point, I agree with Michael. It would be easy to mix up the
names when scanning by eye, so I think there is some value in making
these more visually distinct. I rather like the idea of changing
F_SETLKP to F_*_SETLK. The question is what to put in place of the
wildcard there, and that sort of hinges on the name...

-- 
Jeff Layton jlay...@redhat.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] locks: rename file-private locks to file-description locks

2014-04-21 Thread Jeff Layton
On Mon, 21 Apr 2014 20:18:50 +0200
Michael Kerrisk (man-pages) mtk.manpa...@gmail.com wrote:

 Jeff,
 On 04/21/2014 06:45 PM, Jeff Layton wrote:
  On Mon, 21 Apr 2014 12:10:04 -0400
  Rich Felker dal...@libc.org wrote:
  
  On Mon, Apr 21, 2014 at 04:23:54PM +0200, Michael Kerrisk (man-pages) 
  wrote:
  On 04/21/2014 04:02 PM, Rich Felker wrote:
  On Mon, Apr 21, 2014 at 09:45:35AM -0400, Jeff Layton wrote:
 [...]
 initial preference, and I also suggested file-description locks
 and noted the drawbacks of that term. I think it's insufficient
 to say stick with the existing poor name--if you have
 something better, then please propose it. (Note by the way
 that for nearly a decade now, the open(2) man page has followed
 POSIX in using the term open file description. Full disclosure:
 of course, I'm responsible for that change in the man page.)
 
  I'm well aware of that. The problem is that the proposed API is using
  the two-letter abbreviation FD, which ALWAYS means file descriptor and
  NEVER means file description (in existing usage) to mean file
  description. That's what's wrong.
 
  
  Fair enough. Assuming we kept file-description locks as a name, what
  would you propose as new macro names?
 
 I assume you meant, assume we kept the term 'file-private locks'...
 In that case, at least make the constants something like
 
 F_FP_SETLK
 F_FP_SETLKW
 F_FP_GETLK
 
 so that they are not confused with the traditional constants.
 
 Cheer,
 

Actually no, I was asking how Rich would name the constants if we use
the name file-description locks (as per the patch I posted this
morning), since his objection was the use if *_FD_* names.

I would assume that if we stick with file-private locks as the name,
then we'll still change the constants to a form like *_FP_*.

Also, to be clear...Frank is correct that the name file-private came
from allowing the locks to be private to a particular open file
description. Though I agree that it's a crappy name at best...

-- 
Jeff Layton jlay...@redhat.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] locks: rename file-private locks to file-description locks

2014-04-21 Thread Jeff Layton
On Mon, 21 Apr 2014 14:48:29 -0400
Rich Felker dal...@libc.org wrote:

 On Mon, Apr 21, 2014 at 02:32:38PM -0400, Jeff Layton wrote:
Fair enough. Assuming we kept file-description locks as a name, what
would you propose as new macro names?
   
   I assume you meant, assume we kept the term 'file-private locks'...
   In that case, at least make the constants something like
   
   F_FP_SETLK
   F_FP_SETLKW
   F_FP_GETLK
   
   so that they are not confused with the traditional constants.
   
   Cheer,
   
  
  Actually no, I was asking how Rich would name the constants if we use
  the name file-description locks (as per the patch I posted this
  morning), since his objection was the use if *_FD_* names.
  
  I would assume that if we stick with file-private locks as the name,
  then we'll still change the constants to a form like *_FP_*.
  
  Also, to be clear...Frank is correct that the name file-private came
  from allowing the locks to be private to a particular open file
  description. Though I agree that it's a crappy name at best...
 
 As I mentioned in a reply to Michael just now, I think FP is bad
 because the whole problem is that legacy fcntl locks are associated
 with the underlying file rather than the open file description (open
 instance). So open-private (OP) might be a better choice than
 file-private.
 
 Rich

Is open-private or open-file-private really any better than
file-private ? They're all names that only a mother could love and
I'm not sure any of them are really any clearer than the others. Also:

pedantic
Legacy fcntl locks are associated with the _process_ and not the
underlying file, per-se.
/pedantic


-- 
Jeff Layton jlay...@redhat.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] locks: rename file-private locks to file-description locks

2014-04-21 Thread Jeff Layton
On Mon, 21 Apr 2014 21:39:12 +0200
Michael Kerrisk (man-pages) mtk.manpa...@gmail.com wrote:

 On 04/21/2014 08:46 PM, Rich Felker wrote:
  On Mon, Apr 21, 2014 at 08:32:44PM +0200, Michael Kerrisk (man-pages) wrote:
  On 04/21/2014 06:10 PM, Rich Felker wrote:
  I'm well aware of that. The problem is that the proposed API is using
  the two-letter abbreviation FD, which ALWAYS means file descriptor and
  NEVER means file description (in existing usage) to mean file
  description. That's what's wrong.
 
  So, can you *please* answer this question: what do you call (i.e., 
  what  everyday technical language term do use for) the thing
  that sits between a file descriptor and an i-node? 
 
  (Please don't say 'struct file' -- that is not is an implementation 
  detail, and does not qualify as the kind of term that I could use 
  when documenting this feature in man pages.)
  
  Open file description.
 
 Oh! I didn't realize we agreed :-).
 
  POSIX uses (or invented, I am not sure which) the term file description
  for a good reason: it is unambiguous, and therefore precise. I do agree
  that there's a risk of confusion between 'open file descriptor and 
  'and file description'--it's the same kind of risk as between English 
  terms such as 'arbitrator' and 'arbitration' (and any number of other
  examples), and as language speakers we deal with this every day.
  
  There's not a problem when the full word is used. On the other hand,
  if you use arb as an abbreviation for arbitration in a context
  where it was already universally understood as meaning arbitrator,
  that would be a big problem.
  
  Likewise the problem here isn't that open file description is a bad
  term. It's that using FD to mean [open] file description is
  utterly confusing, even moreso than just making up a new completely
  random word.
 
 Ohh -- I had thought you a problem not just with FD but also
 (open) file description.
 
  2) The new API constants (F_SETLKP, F_SETLKPW, F_GETLKP) have names
 that are visually very close to the traditional POSIX lock names 
 (F_SETLK, F_SETLKW, F_GETLK). That's an accident waiting to happen
 when someone mistypes in code and/or misses such a misttyping
 when reading code. That really must be fixed.
 
  I agree, but I don't think making it worse is a solution.
 
  I don't agree that it's making it worse. The real problem here is 
  that people use no good unambiguous term for the thing between a file
  descriptor and an inode. POSIX provides us with a solution that may
  not seem perfect, but it is unambiguous, and I think it might feel
  more comfortable if we used it often enough.
  
  I would like to see it used more too, and in particular, I think it
  belongs in the documentation for these new locking interfaces. But
  that still doesn't answer the question of what to call them (the
  macros) unless you want:
  
  F_OPEN_FILE_DESCRIPTION_GETLK
  F_OPEN_FILE_DESCRIPTION_SETLK
  F_OPEN_FILE_DESCRIPTION_SETLKW
 
 Or just 'F_OFD_*'?
 
  Perhaps OP (for open-private, i.e. private to the particular open)
  would be a sensible choice; OTOH people are likely to misread it as
  OPeration. The general principle I have in mind though is that it
  might be nice to highlight the word open in open file description
 
 (Fair enough.)
 
  since it (1) contrasts with file descriptor, despite file descriptors
  also dealing with open files, and (2) contrasts well with legacy fcntl
  locks, which are (this is the whole bug) associated with the
  underlying file and not the open file description.
 
 Makes sense to me. (We are in more agreement that I realized.)
 
 Cheers,
 
 Michael
 
 
 

So the motion is to call them open file description locks and change
the macros to read *_OFD_*. Does anyone object?

-- 
Jeff Layton jlay...@redhat.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] locks: rename file-private locks to open file description locks

2014-04-22 Thread Jeff Layton
File-private locks have been merged into Linux for v3.15, and *now*
people are commenting that the name and macro definitions for the new
file-private locks suck.

...and I can't even disagree. The names and command macros do suck.

We're going to have to live with these for a long time, so it's
important that we be happy with the names before we're stuck with them.
The consensus on the lists so far is that they should be rechristened as
open file description locks.

The name isn't a big deal for the kernel, but the command macros are not
visually distinct enough from the traditional POSIX lock macros. The
glibc and documentation folks are recommending that we change them to
look like F_OFD_{GETLK|SETLK|SETLKW}. That lessens the chance that a
programmer will typo one of the commands wrong, and also makes it easier
to spot this difference when reading code.

This patch makes the following changes that I think are necessary before
v3.15 ships:

1) rename the command macros to their new names. These end up in the uapi
   headers and so are part of the external-facing API. It turns out that
   glibc doesn't actually use the fcntl.h uapi header, but it's hard to
   be sure that something else won't. Changing it now is safest.

2) make the the /proc/locks output display these as type OFDLCK

The rest of the renaming can wait until v3.16, since everything else
isn't visible outside of the kernel.

Cc: Michael Kerrisk mtk.manpa...@gmail.com
Cc: Christoph Hellwig h...@infradead.org
Cc: Carlos O'Donell car...@redhat.com
Cc: Stefan Metzmacher me...@samba.org
Cc: Andy Lutomirski l...@amacapital.net
Cc: Frank Filz ffilz...@mindspring.com
Cc: Theodore Ts'o ty...@mit.edu
Signed-off-by: Jeff Layton jlay...@redhat.com
---
 arch/arm/kernel/sys_oabi-compat.c |  6 +++---
 fs/compat.c   | 14 +++---
 fs/fcntl.c| 12 ++--
 fs/locks.c| 14 +++---
 include/uapi/asm-generic/fcntl.h  | 20 ++--
 security/selinux/hooks.c  |  6 +++---
 6 files changed, 36 insertions(+), 36 deletions(-)

diff --git a/arch/arm/kernel/sys_oabi-compat.c 
b/arch/arm/kernel/sys_oabi-compat.c
index 702bd329d9d0..e90a3148f385 100644
--- a/arch/arm/kernel/sys_oabi-compat.c
+++ b/arch/arm/kernel/sys_oabi-compat.c
@@ -203,9 +203,9 @@ asmlinkage long sys_oabi_fcntl64(unsigned int fd, unsigned 
int cmd,
int ret;
 
switch (cmd) {
-   case F_GETLKP:
-   case F_SETLKP:
-   case F_SETLKPW:
+   case F_OFD_GETLK:
+   case F_OFD_SETLK:
+   case F_OFD_SETLKW:
case F_GETLK64:
case F_SETLK64:
case F_SETLKW64:
diff --git a/fs/compat.c b/fs/compat.c
index ca926ad0430c..66d3d3c6b4b2 100644
--- a/fs/compat.c
+++ b/fs/compat.c
@@ -457,9 +457,9 @@ COMPAT_SYSCALL_DEFINE3(fcntl64, unsigned int, fd, unsigned 
int, cmd,
case F_GETLK64:
case F_SETLK64:
case F_SETLKW64:
-   case F_GETLKP:
-   case F_SETLKP:
-   case F_SETLKPW:
+   case F_OFD_GETLK:
+   case F_OFD_SETLK:
+   case F_OFD_SETLKW:
ret = get_compat_flock64(f, compat_ptr(arg));
if (ret != 0)
break;
@@ -468,7 +468,7 @@ COMPAT_SYSCALL_DEFINE3(fcntl64, unsigned int, fd, unsigned 
int, cmd,
conv_cmd = convert_fcntl_cmd(cmd);
ret = sys_fcntl(fd, conv_cmd, (unsigned long)f);
set_fs(old_fs);
-   if ((conv_cmd == F_GETLK || conv_cmd == F_GETLKP)  ret == 0) {
+   if ((conv_cmd == F_GETLK || conv_cmd == F_OFD_GETLK)  ret == 
0) {
/* need to return lock information - see above for 
commentary */
if (f.l_start  COMPAT_LOFF_T_MAX)
ret = -EOVERFLOW;
@@ -493,9 +493,9 @@ COMPAT_SYSCALL_DEFINE3(fcntl, unsigned int, fd, unsigned 
int, cmd,
case F_GETLK64:
case F_SETLK64:
case F_SETLKW64:
-   case F_GETLKP:
-   case F_SETLKP:
-   case F_SETLKPW:
+   case F_OFD_GETLK:
+   case F_OFD_SETLK:
+   case F_OFD_SETLKW:
return -EINVAL;
}
return compat_sys_fcntl64(fd, cmd, arg);
diff --git a/fs/fcntl.c b/fs/fcntl.c
index 9ead1596399a..72c82f69b01b 100644
--- a/fs/fcntl.c
+++ b/fs/fcntl.c
@@ -274,15 +274,15 @@ static long do_fcntl(int fd, unsigned int cmd, unsigned 
long arg,
break;
 #if BITS_PER_LONG != 32
/* 32-bit arches must use fcntl64() */
-   case F_GETLKP:
+   case F_OFD_GETLK:
 #endif
case F_GETLK:
err = fcntl_getlk(filp, cmd, (struct flock __user *) arg);
break;
 #if BITS_PER_LONG != 32
/* 32-bit arches must use fcntl64() */
-   case F_SETLKP:
-   case F_SETLKPW:
+   case F_OFD_SETLK:
+   case F_OFD_SETLKW:
 #endif
/* Fallthrough */
case F_SETLK:
@@ -399,13 +399,13 @@ SYSCALL_DEFINE3(fcntl64, unsigned int, fd, unsigned int

Re: [PATCH] locks: rename file-private locks to open file description locks

2014-04-22 Thread Jeff Layton
On Tue, 22 Apr 2014 17:45:31 +0300
Boaz Harrosh open...@gmail.com wrote:

 On 04/22/2014 03:23 PM, Jeff Layton wrote:
 
  
  We're going to have to live with these for a long time, so it's
  important that we be happy with the names before we're stuck with them.
  The consensus on the lists so far is that they should be rechristened as
  open file description locks.
  
 
 I completely agree with the rename. (Though could you please post
 the rest of the rename patches for review)
 
 Just a very small nit. My native language is not English but I would
 rather you use file-descriptor (with an '-' as well) and not
 use description in the English name of the lock. This is
 because stated like that, description might refer to the
 locks and not to the file in the sentence. file-descriptor is
 more clear I think. (For me it was confusing at first before I realized
 what you meant)
 
 Just my $0.017
 Thanks
 

There's a big difference between the descriptor and the description.

The numerical value you get back from something like open() is a file
descriptor. The thing that that value points to internally in the
kernel is the file description. It's very important that we do not
conflate the two here as these locks are associated with the file
description and not the file descriptor.

The best way to illustrate this is the interaction with dup() -- see
the LWN article on these for a complete overview.

-- 
Jeff Layton jlay...@redhat.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[RFC][glibc PATCH] fcntl-linux.h: add new definitions and manual updates for open file description locks

2014-04-23 Thread Jeff Layton
(I haven't yet sent the patch to rename these to Linus, but here's a
 new draft version of the glibc header/manual updates for open file
 description locks. Please speak up if you see any issues or object
 to the new name.)

Open file description locks have been merged into the Linux kernel for
v3.15.  Add the appropriate command-value definitions and an update to
the manual that describes their usage.

ChangeLog:

2014-04-23  Jeff Layton  jlay...@redhat.com

[BZ#16839]
* manual/llio.texi: add section about open file description locks

* sysdeps/unix/sysv/linux/bits/fcntl-linux.h:
  (F_OFD_GETLK, F_OFD_SETLK, F_OFD_SETLKW): New macros.
---
 manual/examples/ofdlocks.c |  75 +
 manual/llio.texi   | 237 -
 sysdeps/unix/sysv/linux/bits/fcntl-linux.h |  17 +++
 3 files changed, 326 insertions(+), 3 deletions(-)
 create mode 100644 manual/examples/ofdlocks.c

diff --git a/manual/examples/ofdlocks.c b/manual/examples/ofdlocks.c
new file mode 100644
index ..544001769272
--- /dev/null
+++ b/manual/examples/ofdlocks.c
@@ -0,0 +1,75 @@
+/* Open File Description Locks Usage Example
+   Copyright (C) 1991-2014 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or
+   modify it under the terms of the GNU General Public License
+   as published by the Free Software Foundation; either version 2
+   of the License, or (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program; if not, if not, see http://www.gnu.org/licenses/.
+*/
+
+#define _GNU_SOURCE
+#include stdio.h
+#include sys/types.h
+#include sys/stat.h
+#include unistd.h
+#include fcntl.h
+#include pthread.h
+
+#define FILENAME   /tmp/foo
+#define NUM_THREADS3
+#define ITERATIONS 5
+
+void *
+thread_start(void *arg)
+{
+   int i, fd, len;
+   long tid = (long)arg;
+   char buf[256];
+   struct flock lck = {
+   .l_whence   = SEEK_SET,
+   .l_start= 0,
+   .l_len  = 1,
+   };
+
+   fd = open(/tmp/foo, O_RDWR|O_CREAT, 0666);
+
+   for (i = 0; i  ITERATIONS; i++) {
+   lck.l_type = F_WRLCK;
+   fcntl(fd, F_OFD_SETLKW, lck);
+
+   len = sprintf(buf, %d: tid=%ld fd=%d\n, i, tid, fd);
+
+   lseek(fd, 0, SEEK_END);
+   write(fd, buf, len);
+   fsync(fd);
+
+   lck.l_type = F_UNLCK;
+   fcntl(fd, F_OFD_SETLK, lck);
+
+   usleep(1);
+   }
+   pthread_exit(NULL);
+}
+
+int
+main(int argc, char **argv)
+{
+   long i;
+   pthread_t   threads[NUM_THREADS];
+
+   truncate(FILENAME, 0);
+
+   for (i = 0; i  NUM_THREADS; i++)
+   pthread_create(threads[i], NULL, thread_start, (void *)i);
+
+   pthread_exit(NULL);
+   return 0;
+}
diff --git a/manual/llio.texi b/manual/llio.texi
index 6f8adfc607d7..b5c2d07ad929 100644
--- a/manual/llio.texi
+++ b/manual/llio.texi
@@ -57,6 +57,10 @@ directly.)
  flags associated with open files.
 * File Locks::  Fcntl commands for implementing
  file locking.
+* Open File Description Locks:: Fcntl commands for implementing
+ open file description locking.
+* Open File Description Locks Example:: An example of open file description 
lock
+ usage
 * Interrupt Input:: Getting an asynchronous signal when
  input arrives.
 * IOCTLs::  Generic I/O Control operations.
@@ -2890,7 +2894,7 @@ Get flags associated with the open file.  @xref{File 
Status Flags}.
 Set flags associated with the open file.  @xref{File Status Flags}.
 
 @item F_GETLK
-Get a file lock.  @xref{File Locks}.
+Test a file lock.  @xref{File Locks}.
 
 @item F_SETLK
 Set or clear a file lock.  @xref{File Locks}.
@@ -2898,6 +2902,18 @@ Set or clear a file lock.  @xref{File Locks}.
 @item F_SETLKW
 Like @code{F_SETLK}, but wait for completion.  @xref{File Locks}.
 
+@item F_OFD_GETLK
+Test a open file description lock.  @xref{Open File Description Locks}.
+Specific to Linux.
+
+@item F_OFD_SETLK
+Set or clear a file lock.  @xref{Open File Description Locks}.
+Specific to Linux.
+
+@item F_OFD_SETLKW
+Like @code{F_OFD_SETLK}, but wait for completion.
+@xref{Open File Description Locks}.  Specific to Linux.
+
 @item F_GETOWN
 Get process or process group ID to receive @code{SIGIO

Re: [RFC][glibc PATCH] fcntl-linux.h: add new definitions and manual updates for open file description locks

2014-04-23 Thread Jeff Layton
On Wed, 23 Apr 2014 11:09:51 -0400
a...@gnu.org (Alfred M. Szmidt) wrote:

@@ -2890,7 +2894,7 @@ Get flags associated with the open file.  @xref{File 
 Status Flags}.
 Set flags associated with the open file.  @xref{File Status Flags}.
 
 @item F_GETLK
-Get a file lock.  @xref{File Locks}.
+Test a file lock.  @xref{File Locks}.
 
 F_GETLK does get the (first) lock which blocks; it doesn't test for
 it.  Retrieves information about the first blocking lock ... or
 something might be better than the original
 
@@ -2898,6 +2902,18 @@ Set or clear a file lock.  @xref{File Locks}.
 @item F_SETLKW
 Like @code{F_SETLK}, but wait for completion.  @xref{File Locks}.
 
+@item F_OFD_GETLK
+Test a open file description lock.  @xref{Open File Description Locks}.
+Specific to Linux.
 
 Likewise.  You infact write that it does get the lock information
 later in the document wrt. F_OFD_GETLK.

Sorry, I disagree here...GETLK is really a misnomer, IMO. TESTLK would
have been a better name.

It's a way to test whether a particular lock can be applied, and to
return information about a conflicting lock if it can't. If, for
instance there is no conflicting lock, then you don't get any lock
information back (l_type just gets reset to F_UNLCK).

-- 
Jeff Layton jlay...@redhat.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH RFC] fcntl.2: update manpage with verbiage about file-private locks

2014-04-15 Thread Jeff Layton
On Tue, 15 Apr 2014 22:08:50 +0200
Michael Kerrisk (man-pages) mtk.manpa...@gmail.com wrote:

 Hello Jeff,
 
 On Thu, Jan 9, 2014 at 3:23 PM, Jeff Layton jlay...@redhat.com wrote:
  Please don't merge this yet, as the kernel patches are still a work in
  progress...
 
 Now that this has hit mainline, is this man page patch still current?
 
 Cheers,
 
 Michael
 

No, it needs a bit of a revision. I'm sorting through the glibc patches
now, and will plan to send a respin of this once that's complete.

Thanks!


  Signed-off-by: Jeff Layton jlay...@redhat.com
  ---
   man2/fcntl.2 | 97 
  ++--
   1 file changed, 94 insertions(+), 3 deletions(-)
 
  diff --git a/man2/fcntl.2 b/man2/fcntl.2
  index 72dcd7b..74c67b6 100644
  --- a/man2/fcntl.2
  +++ b/man2/fcntl.2
  @@ -208,7 +208,7 @@ struct flock {
   off_t l_start;   /* Starting offset for lock */
   off_t l_len; /* Number of bytes to lock */
   pid_t l_pid; /* PID of process blocking our lock
  -(F_GETLK only) */
  +(F_GETLK and F_GETLKP only) */
   ...
   };
   .fi
  @@ -344,9 +344,13 @@ returns details about one of these locks in the
   .IR l_type ,  l_whence ,  l_start , and  l_len
   fields of
   .I lock
  -and sets
  +.
  +If the conflicting lock is a traditional POSIX lock, then the
   .I l_pid
  -to be the PID of the process holding that lock.
  +will be set to the PID of the process holding that lock. If the
  +conflicting lock is a file-private lock, then the
  +.I l_pid
  +will be set to -1.
   .P
   In order to place a read lock,
   .I fd
  @@ -386,6 +390,93 @@ should be avoided; use
   and
   .BR write (2)
   instead.
  +.SS File-private locking
  +(Currently non-POSIX, but being proposed)
  +.PP
  +.BR F_GETLKP ,  F_SETLKP  and   F_SETLKPW
  +are used to acquire, release and test file-private record locks. These
  +are byte-range locks that work identically to the traditional advisory
  +record locks described above, but are associated with the open file on
  +which they were acquired rather than the process, much like locks
  +acquired with
  +.BR flock (2)
  +.
  +.PP
  +Unlike traditional advisory record locks, these locks are inherited
  +across
  +.BR fork (2) ,  dup (2)  and  dup2 (2)
  +and are only released on the last close of the open file instead of being
  +released on any close of the file.
  +.PP
  +File-private locks always conflict with traditional record locks, even
  +when they are acquired by the same process on the same file descriptor.
  +They only conflict with each other when they are acquired on different
  +open file descriptors.
  +.TP
  +.BR F_SETLKP  (\fIstruct flock *\fP)
  +Acquire a lock (when
  +.I l_type
  +is
  +.B F_RDLCK
  +or
  +.BR F_WRLCK )
  +or release a lock (when
  +.I l_type
  +is
  +.BR F_UNLCK )
  +on the bytes specified by the
  +.IR l_whence ,  l_start , and  l_len
  +fields of
  +.IR lock .
  +If a conflicting lock is held by another process,
  +this call returns \-1 and sets
  +.I errno
  +to
  +.B EACCES
  +or
  +.BR EAGAIN .
  +.TP
  +.BR F_SETLKPW  (\fIstruct flock *\fP)
  +As for
  +.BR F_SETLKP ,
  +but if a conflicting lock is held on the file, then wait for that
  +lock to be released.
  +If a signal is caught while waiting, then the call is interrupted
  +and (after the signal handler has returned)
  +returns immediately (with return value \-1 and
  +.I errno
  +set to
  +.BR EINTR ;
  +see
  +.BR signal (7)).
  +.TP
  +.BR F_GETLKP  (\fIstruct flock *\fP)
  +On input to this call,
  +.I lock
  +describes a lock we would like to place on the file.
  +If the lock could be placed,
  +.BR fcntl ()
  +does not actually place it, but returns
  +.B F_UNLCK
  +in the
  +.I l_type
  +field of
  +.I lock
  +and leaves the other fields of the structure unchanged.
  +If one or more incompatible locks would prevent
  +this lock being placed, then
  +.BR fcntl ()
  +returns details about one of these locks in the
  +.IR l_type ,  l_whence ,  l_start , and  l_len
  +fields of
  +.I lock
  +.
  +If the conflicting lock is a traditional POSIX lock, then the
  +.I l_pid
  +will be set to the PID of the process holding that lock. If the
  +conflicting lock is a file-private lock, then the
  +.I l_pid
  +will be set to -1.
   .SS Mandatory locking
   (Non-POSIX.)
   The above record locks may be either advisory or mandatory,
  --
  1.8.4.2
 
  --
  To unsubscribe from this list: send the line unsubscribe linux-man in
  the body of a message to majord...@vger.kernel.org
  More majordomo info at  http://vger.kernel.org/majordomo-info.html
 
 
 


-- 
Jeff Layton jlay...@redhat.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH/RFC 00/19] Support loop-back NFS mounts

2014-04-16 Thread Jeff Layton
 holding sk_lock
   FS: set PF_FSTRANS while holding mmap_sem in exec.c
   NET: set PF_FSTRANS while holding rtnl_lock
   MM: set PF_FSTRANS while allocating per-cpu memory to avoid deadlock.
   driver core: set PF_FSTRANS while holding gdp_mutex
   nfsd: set PF_FSTRANS when client_mutex is held.
   VFS: use GFP_NOFS rather than GFP_KERNEL in __d_alloc.
   VFS: set PF_FSTRANS while namespace_sem is held.
   nfsd: set PF_FSTRANS during nfsd4_do_callback_rpc.
   XFS: set PF_FSTRANS while ilock is held in xfs_free_eofblocks
 
 
  drivers/base/core.c |3 ++
  drivers/base/power/runtime.c|6 ++---
  drivers/block/nbd.c |6 ++---
  drivers/md/dm-bufio.c   |6 ++---
  drivers/md/dm-ioctl.c   |6 ++---
  drivers/mtd/nand/nandsim.c  |   28 ++---
  drivers/scsi/iscsi_tcp.c|6 ++---
  drivers/usb/core/hub.c  |6 ++---
  fs/dcache.c |4 ++-
  fs/exec.c   |6 +
  fs/fs-writeback.c   |5 ++--
  fs/namespace.c  |4 +++
  fs/nfs/file.c   |3 +-
  fs/nfsd/nfs4callback.c  |5 
  fs/nfsd/nfs4state.c |3 ++
  fs/nfsd/nfssvc.c|   24 ++
  fs/nfsd/vfs.c   |6 +
  fs/xfs/kmem.h   |2 --
  fs/xfs/xfs_aops.c   |7 -
  fs/xfs/xfs_bmap_util.c  |4 +++
  fs/xfs/xfs_file.c   |   12 +
  fs/xfs/xfs_linux.h  |7 -
  include/linux/lockdep.h |8 +++---
  include/linux/sched.h   |   32 +---
  include/linux/sunrpc/svc.h  |2 ++
  include/linux/sunrpc/svc_xprt.h |1 +
  include/net/sock.h  |1 +
  kernel/locking/lockdep.c|   51 
 ---
  kernel/softirq.c|6 ++---
  mm/migrate.c|9 +++
  mm/page-writeback.c |3 ++
  mm/page_alloc.c |   18 --
  mm/percpu.c |4 +++
  mm/slab.c   |2 ++
  mm/slob.c   |2 ++
  mm/slub.c   |1 +
  mm/vmscan.c |   31 +++-
  net/core/dev.c  |6 ++---
  net/core/rtnetlink.c|9 ++-
  net/core/sock.c |8 --
  net/sunrpc/sched.c  |5 ++--
  net/sunrpc/svc.c|6 +
  net/sunrpc/svcsock.c|   10 
  net/sunrpc/xprtrdma/transport.c |5 ++--
  net/sunrpc/xprtsock.c   |   17 -
  45 files changed, 247 insertions(+), 149 deletions(-)
 


-- 
Jeff Layton jlay...@redhat.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 05/19] SUNRPC: track whether a request is coming from a loop-back interface.

2014-04-16 Thread Jeff Layton
On Wed, 16 Apr 2014 14:03:36 +1000
NeilBrown ne...@suse.de wrote:

 If an incoming NFS request is coming from the local host, then
 nfsd will need to perform some special handling.  So detect that
 possibility and make the source visible in rq_local.
 
 Signed-off-by: NeilBrown ne...@suse.de
 ---
  include/linux/sunrpc/svc.h  |1 +
  include/linux/sunrpc/svc_xprt.h |1 +
  net/sunrpc/svcsock.c|   10 ++
  3 files changed, 12 insertions(+)
 
 diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
 index 04e763221246..a0dbbd1e00e9 100644
 --- a/include/linux/sunrpc/svc.h
 +++ b/include/linux/sunrpc/svc.h
 @@ -254,6 +254,7 @@ struct svc_rqst {
   u32 rq_prot;/* IP protocol */
   unsigned short
   rq_secure  : 1; /* secure port */
 + unsigned short  rq_local   : 1; /* local request */
  
   void *  rq_argp;/* decoded arguments */
   void *  rq_resp;/* xdr'd results */
 diff --git a/include/linux/sunrpc/svc_xprt.h b/include/linux/sunrpc/svc_xprt.h
 index b05963f09ebf..b99bdfb0fcf9 100644
 --- a/include/linux/sunrpc/svc_xprt.h
 +++ b/include/linux/sunrpc/svc_xprt.h
 @@ -63,6 +63,7 @@ struct svc_xprt {
  #define  XPT_DETACHED10  /* detached from tempsocks list 
 */
  #define XPT_LISTENER 11  /* listening endpoint */
  #define XPT_CACHE_AUTH   12  /* cache auth info */
 +#define XPT_LOCAL13  /* connection from loopback interface */
  
   struct svc_serv *xpt_server;/* service for transport */
   atomic_txpt_reserved;   /* space on outq that is rsvd */
 diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
 index b6e59f0a9475..193115fe968c 100644
 --- a/net/sunrpc/svcsock.c
 +++ b/net/sunrpc/svcsock.c
 @@ -811,6 +811,7 @@ static struct svc_xprt *svc_tcp_accept(struct svc_xprt 
 *xprt)
   struct socket   *newsock;
   struct svc_sock *newsvsk;
   int err, slen;
 + struct dst_entry *dst;
   RPC_IFDEBUG(char buf[RPC_MAX_ADDRBUFLEN]);
  
   dprintk(svc: tcp_accept %p sock %p\n, svsk, sock);
 @@ -867,6 +868,14 @@ static struct svc_xprt *svc_tcp_accept(struct svc_xprt 
 *xprt)
   }
   svc_xprt_set_local(newsvsk-sk_xprt, sin, slen);
  
 + clear_bit(XPT_LOCAL, newsvsk-sk_xprt.xpt_flags);
 + rcu_read_lock();
 + dst = rcu_dereference(newsock-sk-sk_dst_cache);
 + if (dst  dst-dev 
 + (dst-dev-features  NETIF_F_LOOPBACK))
 + set_bit(XPT_LOCAL, newsvsk-sk_xprt.xpt_flags);
 + rcu_read_unlock();
 +

In the use-case you describe, the client is unlikely to have mounted
localhost, but is more likely to be mounting a floating address in
the cluster.

Will this flag end up being set in such a situation? It looks like
NETIF_F_LOOPBACK isn't set on all adapters -- mostly on lo and a few
others that support MAC-LOOPBACK.


   if (serv-sv_stats)
   serv-sv_stats-nettcpconn++;
  
 @@ -1112,6 +1121,7 @@ static int svc_tcp_recvfrom(struct svc_rqst *rqstp)
  
   rqstp-rq_xprt_ctxt   = NULL;
   rqstp-rq_prot= IPPROTO_TCP;
 + rqstp-rq_local   = !!test_bit(XPT_LOCAL, svsk-sk_xprt.xpt_flags);
  
   p = (__be32 *)rqstp-rq_arg.head[0].iov_base;
   calldir = p[1];
 
 
 --
 To unsubscribe from this list: send the line unsubscribe linux-nfs in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html


-- 
Jeff Layton jlay...@poochiereds.net
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


should we change the name/macros of file-private locks?

2014-04-16 Thread Jeff Layton
Sorry to spam so many lists, but I think this needs widespread
distribution and consensus.

File-private locks have been merged into Linux for v3.15, and *now*
people are commenting that the name and macro definitions for the new
file-private locks suck.

...and I can't even disagree. They do suck.

We're going to have to live with these for a long time, so it's
important that we be happy with the names before we're stuck with them.

Michael Kerrisk suggested several names but I think the only one that
doesn't have other issues is file-associated locks, which can be
distinguished against process-associated locks (aka classic POSIX
locks).

At the same time, he suggested that we rename the command macros since
the 'P' suffix would no longer be relevant. He suggested something like
this:

F_FA_GETLK
F_FA_SETLK
F_FA_SETLKW

That would also make them more visually distinguishable from the
classic F_GETLK/F_SETLK/F_SETLKW commands. I like that change in
particular, as the original macros names would be easy to typo.

I think we'd also need to rename how these are reported in /proc/locks.
Currently they have a type label of FLPVT. I'd suggest that we
change that to FASSOC. For v3.15, this is the only part we'd
absolutely have to change before it ships. The rest I can fix up in
v3.16.

Does this sound like a reasonable set of changes to make? Does anyone
else have a better set of names they can suggest? Speak now, or forever
hold your peace!

-- 
Jeff Layton jlay...@redhat.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: should we change the name/macros of file-private locks?

2014-04-17 Thread Jeff Layton
On Thu, 17 Apr 2014 00:42:13 +0200
Stefan (metze) Metzmacher me...@samba.org wrote:

 Am 16.04.2014 22:00, schrieb Michael Kerrisk (man-pages):
  [CC += Jeremy Allison]
  
  On Wed, Apr 16, 2014 at 8:57 PM, Jeff Layton jlay...@redhat.com wrote:
  Sorry to spam so many lists, but I think this needs widespread
  distribution and consensus.
 
  File-private locks have been merged into Linux for v3.15, and *now*
  people are commenting that the name and macro definitions for the new
  file-private locks suck.
 
  ...and I can't even disagree. They do suck.
 
  We're going to have to live with these for a long time, so it's
  important that we be happy with the names before we're stuck with them.
  
  So, to add my perspective: The existing byte-range locking system has
  persisted (despite egregious faults) for well over two decades. One
  supposes that Jeff's new improved version might be around
  at least as long. With that in mind, and before setting in stone (and
  pushing into POSIX) a model of thinking that thousands of programmers
  will live with for a long time, it's worth thinking about names.
  
  Michael Kerrisk suggested several names but I think the only one that
  doesn't have other issues is file-associated locks, which can be
  distinguished against process-associated locks (aka classic POSIX
  locks).
  
  The names I have suggested are:
  
  file-associated locks
  
  or
  
 file-handle locks
  
  or (using POSIX terminology)
  
  file-description locks
 
 I'd use file-handle, file-description or at least something that's
 not associated to the file itself.
 
 file-handle-associated or file-description-associated would also work.
 

Yeah, I understand your point.

I'm not keen on using file-handle as file handles have a completely
different meaning in the context of something like NFS.

file-description-associated is rather a mouthful. You Germans might
go for that sort of thing, but it feels awkward to us knuckle-draggers
that primarily speak English.

Maybe we should just go with one of Michael's earlier suggestions --
file-description locks and change the macros to F_FD_*.

In the docs we could take pains to point out that these are
file-_description_ locks and not file-_descriptor_ locks, and outline
why that is so (which is something I'm trying to make crystal clear in
the docs anyway).

Does anyone object to that?

  but that last might be a bit confusing to people who are not
  standards-aware. (The POSIX standard defines the thing that a file
  descriptor refers to as a file description; other people often call
  that thing a file handle or an open file table entry or a struct
  file. The POSIX term is precise and unambiguous, but, unfortunately,
  the term is not common outside the standard and is also easily
  mistaken for file descriptor.)
  
  At the same time, he suggested that we rename the command macros since
  the 'P' suffix would no longer be relevant. He suggested something like
  this:
 
  F_FA_GETLK
  F_FA_SETLK
  F_FA_SETLKW
 
 With file-description-associated this could be
 
 F_FDA_*
 
 metze


-- 
Jeff Layton jlay...@redhat.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: should we change the name/macros of file-private locks?

2014-04-17 Thread Jeff Layton
On Thu, 17 Apr 2014 22:08:54 +0200
Michael Kerrisk (man-pages) mtk.manpa...@gmail.com wrote:

  In the docs we could take pains to point out that these are
  file-_description_ locks and not file-_descriptor_ locks, and outline
  why that is so (which is something I'm trying to make crystal clear in
  the docs anyway).
 
  Does anyone object to that?
 
  Well, I'd be silly to object, but maybe we should still allow a day
  for further comment?
 
 Jeff,
 
 One further point. I know the intent is to get this scheme into POSIX.
 Have any conversations happened about this so far on the POSIX/Austin
 lists? If yes, it might be worth also linking those folks into the
 naming discussion..
 
 Cheers,
 
 Michael

Yes, good idea. The open bug is here:

http://austingroupbugs.net/view.php?id=768

I'll write something up there and see if they want to chime in on the
discussion.

-- 
Jeff Layton jlay...@redhat.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Thoughts on credential switching

2014-03-31 Thread Jeff Layton
On Sun, 30 Mar 2014 09:03:29 -0400
Theodore Ts'o ty...@mit.edu wrote:

 On Thu, Mar 27, 2014 at 07:08:02AM -0700, Jeff Layton wrote:
  I had some time to think about this last night...
  
  While using a fd to pass around credentials is convenient, the danger
  is that it's pretty opaque. You have a fd that you know has creds
  attached to it, but it's hard to be certain what is going to change.
 
 I don't think that's a particularly tough problem.  In general, the fd
 isn't something that you would want to pass around, and so the process
 which generated it will know exactly what it contained.
 

I think there's a bit more of a use-case for passing around such an fd
via socket...

Part of the problem is that the traditional uid/gid switching glibc
wrappers are per-process. If we're proposing doing something like:

seteuid()
setegid()
setgroups()
fd = open()
(...and then revert the creds using same syscalls)

...during the time that you're doing all of that, you can't really
allow any thread in the process to be doing something that requires
_other_ creds until you've completed the above.

So, I could envision a program like ganesha firing up a separate
process to handle the credential switching and fd creation and then
handing those back to the main process via a unix domain socket.

  Perhaps we can use the flags field for that. So, assuming we have a fd
  with the creds attached, we could do something like:
  
  err = switch_creds(fd, SC_FSUID|SC_FSGID|SC_GROUPS);
  
  ...then the switch_creds syscall could be set up to fail if the new
  credentials had other fields that didn't match those in the current
  task credentials. So if (for instance) the cred-euid were
  different between the two, the above could fail with -EINVAL or
  something.
 
 Huh?  The whole *point* is that the creds value will be different, of
 course they won't match!  I would think this would be over
 complicating the interface.
 

You know that *some* of the credentials won't match, but presumably
some of them (e.g. the real uid/gid) should match. Using a flags field
could act as a mask of what creds are allowed to change.

 
 A couple of other things.  What I would suggest is that we create a
 few new fd flags, to join FD_CLOEXEC:
 
 FD_NOPROCFS   disallow being able to open the inode via /proc/pid/fd
   (but in the case of a creds fd, for bonus points, the
   target of the pseudo-symlink could be something like:
   uid: 15806 gid: 100: grps: 27, 50 to aid in debugging
   a userspace file server).  This also answers Jeff's concern
   if for some reason --- I don't know how --- a process
   doesn't know what the contents of the creds fd that
   it created itself.
 
 FD_NOPASSFD   disallow being able to pass the fd via a unix domain socket
 
 FD_LOCKFLAGS  if this bit is set, disallow any further changes of FD_CLOEXEC,
   FD_NOPROCFS, FD_NOPASSFD, and FD_LOCKFLAGS flags.
 
 Some of the functionality requested by the folks suggesting the SEAL
 API would also be covered by these fd flags.
 
 In order to solve some potential race concerns, a credsfd must be
 created with FD_CLOEXEC and FD_NOPROCFS enabled.
 
 Why is this important even if the anon_inode is owned by root with a
 mode of 0?  Because if the system is set up to use SELinux or full
 Posix capabilities, merely having the a uid of 0 is not special, and
 we don't want to allow a process with uid of 0 to be able modify the
 mode with the /proc/pid/fd/FD and then proceed to open the inode
 using open.  This way, instead of adding special case code to prevent
 this from happening, we can add a more general facility which can be
 used to solve a few other problems.
 



-- 
Jeff Layton jlay...@redhat.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[GIT PULL] please pull file locking changes for 3.15

2014-03-31 Thread Jeff Layton
The following changes since commit 29723adee11804b548903ddb1db666cf4a60f60e:

  locks: make locks_mandatory_area check for file-private locks (2014-03-31 
08:24:43 -0400)

are available in the git repository at:

  git://git.samba.org/jlayton/linux.git locks-3.15

for you to fetch changes up to 29723adee11804b548903ddb1db666cf4a60f60e:

  locks: make locks_mandatory_area check for file-private locks (2014-03-31 
08:24:43 -0400)


(from the branch description for locks-3.15 local branch)

file locking patches for 3.15


Highlights:

- maintainership change for fs/locks.c. Willy's not interested in
  maintaining it these days, and is OK with Bruce and I taking it.
- fix for open vs. setlease race that Al ID'ed
- cleanup and consolidation of file locking code
- eliminate unneeded BUG() call
- merge of file-private lock implementation

-- 
Jeff Layton jlay...@redhat.com


signature.asc
Description: PGP signature


Re: Thoughts on credential switching

2014-03-31 Thread Jeff Layton
On Mon, 31 Mar 2014 14:06:01 -0400
Trond Myklebust trond.mykleb...@primarydata.com wrote:

 
 On Mar 31, 2014, at 7:51, Jeff Layton jlay...@redhat.com wrote:
 
  On Sun, 30 Mar 2014 09:03:29 -0400
  Theodore Ts'o ty...@mit.edu wrote:
  
  On Thu, Mar 27, 2014 at 07:08:02AM -0700, Jeff Layton wrote:
  I had some time to think about this last night...
  
  While using a fd to pass around credentials is convenient, the danger
  is that it's pretty opaque. You have a fd that you know has creds
  attached to it, but it's hard to be certain what is going to change.
  
  I don't think that's a particularly tough problem.  In general, the fd
  isn't something that you would want to pass around, and so the process
  which generated it will know exactly what it contained.
  
  
  I think there's a bit more of a use-case for passing around such an fd
  via socket...
  
  Part of the problem is that the traditional uid/gid switching glibc
  wrappers are per-process. If we're proposing doing something like:
  
  seteuid()
  setegid()
  setgroups()
  fd = open()
  (...and then revert the creds using same syscalls)
  
  ...during the time that you're doing all of that, you can't really
  allow any thread in the process to be doing something that requires
  _other_ creds until you've completed the above.
 
 Umm… open() isn’t the only operation that you want to be able to do with an 
 assumed user identity. You want mknod(), mkdir(), link(), unlink(), … Pretty 
 much any interaction with the underlying filesystem needs to use the right 
 identity.
 

The proposal that Al originally had was to have userland set up the
credentials that it wanted to use, and then open(/dev/null, ...) to
get a fd that has those creds attached.

Then, we'd add a syscall that takes such an fd and switches to the creds
attached to it. So the open() in the above example isn't to do an actual
operation but rather to create a fd to act as a credential handle.

  So, I could envision a program like ganesha firing up a separate
  process to handle the credential switching and fd creation and then
  handing those back to the main process via a unix domain socket.
 
 How about using the keyrings interface to atomically cache and retrieve these 
 user identities? We already have support for different types of keys that 
 store/retrieve different types of structured information. How is this so 
 different?
 

That's an interesting idea. 
-- 
Jeff Layton jlay...@redhat.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH resend #4] fcntl-linux.h: add new definitions and manual updates for open file description locks

2014-06-26 Thread Jeff Layton
From: Jeff Layton jlay...@poochiereds.net

Open file description locks have been merged into the Linux kernel for
v3.15.  Add the appropriate command-value definitions and an update to
the manual that describes their usage.

ChangeLog:

2014-04-24  Jeff Layton  jlay...@poochiereds.net

[BZ#16839]
* manual/llio.texi: add section about open file description locks

* manual/examples/ofdlocks.c:
  example of open file description lock usage

* sysdeps/unix/sysv/linux/bits/fcntl-linux.h:
  (F_OFD_GETLK, F_OFD_SETLK, F_OFD_SETLKW): New macros.
---
 manual/examples/ofdlocks.c |  77 +
 manual/llio.texi   | 241 -
 sysdeps/unix/sysv/linux/bits/fcntl-linux.h |  17 ++
 3 files changed, 332 insertions(+), 3 deletions(-)
 create mode 100644 manual/examples/ofdlocks.c

diff --git a/manual/examples/ofdlocks.c b/manual/examples/ofdlocks.c
new file mode 100644
index ..85e193cdabe6
--- /dev/null
+++ b/manual/examples/ofdlocks.c
@@ -0,0 +1,77 @@
+/* Open File Description Locks Usage Example
+   Copyright (C) 1991-2014 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or
+   modify it under the terms of the GNU General Public License
+   as published by the Free Software Foundation; either version 2
+   of the License, or (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program; if not, see http://www.gnu.org/licenses/.
+*/
+
+#define _GNU_SOURCE
+#include stdio.h
+#include sys/types.h
+#include sys/stat.h
+#include unistd.h
+#include fcntl.h
+#include pthread.h
+
+#define FILENAME   /tmp/foo
+#define NUM_THREADS3
+#define ITERATIONS 5
+
+void *
+thread_start (void *arg)
+{
+  int i, fd, len;
+  long tid = (long) arg;
+  char buf[256];
+  struct flock lck = {
+.l_whence = SEEK_SET,
+.l_start = 0,
+.l_len = 1,
+  };
+
+  fd = open (/tmp/foo, O_RDWR | O_CREAT, 0666);
+
+  for (i = 0; i  ITERATIONS; i++)
+{
+  lck.l_type = F_WRLCK;
+  fcntl (fd, F_OFD_SETLKW, lck);
+
+  len = sprintf (buf, %d: tid=%ld fd=%d\n, i, tid, fd);
+
+  lseek (fd, 0, SEEK_END);
+  write (fd, buf, len);
+  fsync (fd);
+
+  lck.l_type = F_UNLCK;
+  fcntl (fd, F_OFD_SETLK, lck);
+
+  /* sleep to ensure lock is yielded to another thread */
+  usleep (1);
+}
+  pthread_exit (NULL);
+}
+
+int
+main (int argc, char **argv)
+{
+  long i;
+  pthread_t threads[NUM_THREADS];
+
+  truncate (FILENAME, 0);
+
+  for (i = 0; i  NUM_THREADS; i++)
+pthread_create (threads[i], NULL, thread_start, (void *) i);
+
+  pthread_exit (NULL);
+  return 0;
+}
diff --git a/manual/llio.texi b/manual/llio.texi
index 6f8adfc607d7..864060dc7140 100644
--- a/manual/llio.texi
+++ b/manual/llio.texi
@@ -57,6 +57,10 @@ directly.)
  flags associated with open files.
 * File Locks::  Fcntl commands for implementing
  file locking.
+* Open File Description Locks:: Fcntl commands for implementing
+ open file description locking.
+* Open File Description Locks Example:: An example of open file description 
lock
+ usage
 * Interrupt Input:: Getting an asynchronous signal when
  input arrives.
 * IOCTLs::  Generic I/O Control operations.
@@ -2890,7 +2894,7 @@ Get flags associated with the open file.  @xref{File 
Status Flags}.
 Set flags associated with the open file.  @xref{File Status Flags}.
 
 @item F_GETLK
-Get a file lock.  @xref{File Locks}.
+Test a file lock.  @xref{File Locks}.
 
 @item F_SETLK
 Set or clear a file lock.  @xref{File Locks}.
@@ -2898,6 +2902,18 @@ Set or clear a file lock.  @xref{File Locks}.
 @item F_SETLKW
 Like @code{F_SETLK}, but wait for completion.  @xref{File Locks}.
 
+@item F_OFD_GETLK
+Test an open file description lock.  @xref{Open File Description Locks}.
+Specific to Linux.
+
+@item F_OFD_SETLK
+Set or clear an open file description lock.  @xref{Open File Description 
Locks}.
+Specific to Linux.
+
+@item F_OFD_SETLKW
+Like @code{F_OFD_SETLK}, but block until lock is acquired.
+@xref{Open File Description Locks}.  Specific to Linux.
+
 @item F_GETOWN
 Get process or process group ID to receive @code{SIGIO} signals.
 @xref{Interrupt Input}.
@@ -3576,6 +3592,10 @@ set_nonblock_flag (int desc, int value)
 
 @cindex file locks
 @cindex record locking
+This section describes record locks that are associated

[PATCH resend #3] fcntl-linux.h: add new definitions and manual updates for open file description locks

2014-06-20 Thread Jeff Layton
From: Jeff Layton jlay...@poochiereds.net

Open file description locks have been merged into the Linux kernel for
v3.15.  Add the appropriate command-value definitions and an update to
the manual that describes their usage.

ChangeLog:

2014-04-24  Jeff Layton  jlay...@poochiereds.net

[BZ#16839]
* manual/llio.texi: add section about open file description locks

* manual/examples/ofdlocks.c:
  example of open file description lock usage

* sysdeps/unix/sysv/linux/bits/fcntl-linux.h:
  (F_OFD_GETLK, F_OFD_SETLK, F_OFD_SETLKW): New macros.
---
 manual/examples/ofdlocks.c |  77 +
 manual/llio.texi   | 241 -
 sysdeps/unix/sysv/linux/bits/fcntl-linux.h |  17 ++
 3 files changed, 332 insertions(+), 3 deletions(-)
 create mode 100644 manual/examples/ofdlocks.c

diff --git a/manual/examples/ofdlocks.c b/manual/examples/ofdlocks.c
new file mode 100644
index ..85e193cdabe6
--- /dev/null
+++ b/manual/examples/ofdlocks.c
@@ -0,0 +1,77 @@
+/* Open File Description Locks Usage Example
+   Copyright (C) 1991-2014 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or
+   modify it under the terms of the GNU General Public License
+   as published by the Free Software Foundation; either version 2
+   of the License, or (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program; if not, see http://www.gnu.org/licenses/.
+*/
+
+#define _GNU_SOURCE
+#include stdio.h
+#include sys/types.h
+#include sys/stat.h
+#include unistd.h
+#include fcntl.h
+#include pthread.h
+
+#define FILENAME   /tmp/foo
+#define NUM_THREADS3
+#define ITERATIONS 5
+
+void *
+thread_start (void *arg)
+{
+  int i, fd, len;
+  long tid = (long) arg;
+  char buf[256];
+  struct flock lck = {
+.l_whence = SEEK_SET,
+.l_start = 0,
+.l_len = 1,
+  };
+
+  fd = open (/tmp/foo, O_RDWR | O_CREAT, 0666);
+
+  for (i = 0; i  ITERATIONS; i++)
+{
+  lck.l_type = F_WRLCK;
+  fcntl (fd, F_OFD_SETLKW, lck);
+
+  len = sprintf (buf, %d: tid=%ld fd=%d\n, i, tid, fd);
+
+  lseek (fd, 0, SEEK_END);
+  write (fd, buf, len);
+  fsync (fd);
+
+  lck.l_type = F_UNLCK;
+  fcntl (fd, F_OFD_SETLK, lck);
+
+  /* sleep to ensure lock is yielded to another thread */
+  usleep (1);
+}
+  pthread_exit (NULL);
+}
+
+int
+main (int argc, char **argv)
+{
+  long i;
+  pthread_t threads[NUM_THREADS];
+
+  truncate (FILENAME, 0);
+
+  for (i = 0; i  NUM_THREADS; i++)
+pthread_create (threads[i], NULL, thread_start, (void *) i);
+
+  pthread_exit (NULL);
+  return 0;
+}
diff --git a/manual/llio.texi b/manual/llio.texi
index 6f8adfc607d7..864060dc7140 100644
--- a/manual/llio.texi
+++ b/manual/llio.texi
@@ -57,6 +57,10 @@ directly.)
  flags associated with open files.
 * File Locks::  Fcntl commands for implementing
  file locking.
+* Open File Description Locks:: Fcntl commands for implementing
+ open file description locking.
+* Open File Description Locks Example:: An example of open file description 
lock
+ usage
 * Interrupt Input:: Getting an asynchronous signal when
  input arrives.
 * IOCTLs::  Generic I/O Control operations.
@@ -2890,7 +2894,7 @@ Get flags associated with the open file.  @xref{File 
Status Flags}.
 Set flags associated with the open file.  @xref{File Status Flags}.
 
 @item F_GETLK
-Get a file lock.  @xref{File Locks}.
+Test a file lock.  @xref{File Locks}.
 
 @item F_SETLK
 Set or clear a file lock.  @xref{File Locks}.
@@ -2898,6 +2902,18 @@ Set or clear a file lock.  @xref{File Locks}.
 @item F_SETLKW
 Like @code{F_SETLK}, but wait for completion.  @xref{File Locks}.
 
+@item F_OFD_GETLK
+Test an open file description lock.  @xref{Open File Description Locks}.
+Specific to Linux.
+
+@item F_OFD_SETLK
+Set or clear an open file description lock.  @xref{Open File Description 
Locks}.
+Specific to Linux.
+
+@item F_OFD_SETLKW
+Like @code{F_OFD_SETLK}, but block until lock is acquired.
+@xref{Open File Description Locks}.  Specific to Linux.
+
 @item F_GETOWN
 Get process or process group ID to receive @code{SIGIO} signals.
 @xref{Interrupt Input}.
@@ -3576,6 +3592,10 @@ set_nonblock_flag (int desc, int value)
 
 @cindex file locks
 @cindex record locking
+This section describes record locks that are associated

[GIT PULL] please pull file locking bugfixes for v3.16 (pile #2)

2014-06-21 Thread Jeff Layton
The following changes since commit 64b2d1fbbfda07765dae3f601862796a61b2c451:

  Merge tag 'for-f2fs-3.16' of 
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs (2014-06-09 19:11:44 
-0700)

are available in the git repository at:


  git://git.samba.org/jlayton/linux.git tags/locks-v3.16-2

for you to fetch changes up to 0c27362998a8357f199501aa401e99c51c2eb46e:

  locks: set fl_owner for leases back to current-files (2014-06-10 12:29:05 
-0400)


(from the branch description for locks-3.16 local branch)

File locking related bugfixes for v3.16 (pile #2)

Nothing too earth-shattering here. A fix for a potential regression
due to a patch in pile #1, and the addition of a memory barrier to
prevent a race condition between break_deleg and generic_add_lease.


Jeff Layton (2):
  locks: add missing memory barrier in break_deleg
  locks: set fl_owner for leases back to current-files

 fs/locks.c | 2 +-
 include/linux/fs.h | 6 ++
 2 files changed, 7 insertions(+), 1 deletion(-)

-- 
Jeff Layton jlay...@poochiereds.net


signature.asc
Description: PGP signature


Re: [PATCH v4] fs: umount on symlink leaks mnt count

2014-07-21 Thread Jeff Layton
On Mon, 21 Jul 2014 12:30:23 +0400
Vasily Averin v...@parallels.com wrote:

 v4: description corrected
 v3: patch inline
 
 Currently umount on symlink blocks following umount:
 
 /vz is separate mount
 
 # ls /vz/ -al | grep test
 drwxr-xr-x.  2 root root   4096 Jul 19 01:14 testdir
 lrwxrwxrwx.  1 root root 11 Jul 19 01:16 testlink - /vz/testdir
 # umount -l /vz/testlink
 umount: /vz/testlink: not mounted (expected)
 
 # lsof /vz
 # umount /vz
 umount: /vz: device is busy. (unexpected)
 
 In this case mountpoint_last() gets an extra refcount on path-mnt
 
 Signed-off-by: Vasily Averin v...@openvz.org
 ---
  fs/namei.c |3 ++-
  1 files changed, 2 insertions(+), 1 deletions(-)
 
 diff --git a/fs/namei.c b/fs/namei.c
 index 985c6f3..9eb787e 100644
 --- a/fs/namei.c
 +++ b/fs/namei.c
 @@ -2256,9 +2256,10 @@ done:
   goto out;
   }
   path-dentry = dentry;
 - path-mnt = mntget(nd-path.mnt);
 + path-mnt = nd-path.mnt;
   if (should_follow_link(dentry, nd-flags  LOOKUP_FOLLOW))
   return 1;
 + mntget(path-mnt);
   follow_mount(path);
   error = 0;
  out:

Looks correct, I think...

Acked-by:
-- 
Jeff Layton jlay...@primarydata.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFC PATCH] dlm: Remove unused conf from lm_grant

2014-07-01 Thread Jeff Layton
On Tue, 01 Jul 2014 06:20:10 -0700
Joe Perches j...@perches.com wrote:

 While doing a bit of adding argument names to fs.h,
 I looked at lm_grant and it seems the 2nd argument
 is always NULL.
 
 How about removing it?
 
 This doesn't apply as it depends on some other patches
 but it should be clear enough...
 

ACK on the general idea from my standpoint. Anything that simplifies
the file locking interfaces is a good thing, particularly the deferred
locking code.

 ---
  fs/dlm/plock.c |  8 
  fs/lockd/svclock.c | 12 +++-
  include/linux/fs.h |  2 +-
  3 files changed, 8 insertions(+), 14 deletions(-)
 
 diff --git a/fs/dlm/plock.c b/fs/dlm/plock.c
 index e59d332..e0ab3a9 100644
 --- a/fs/dlm/plock.c
 +++ b/fs/dlm/plock.c
 @@ -30,7 +30,7 @@ struct plock_op {
  
  struct plock_xop {
   struct plock_op xop;
 - int (*callback)(struct file_lock *, struct file_lock *, int);
 + int (*callback)(struct file_lock *fl, int result);
   void *fl;
   void *file;
   struct file_lock flc;
 @@ -190,7 +190,7 @@ static int dlm_plock_callback(struct plock_op *op)
   struct file *file;
   struct file_lock *fl;
   struct file_lock *flc;
 - int (*notify)(struct file_lock *fl, struct file_lock *cont, int result) 
 = NULL;
 + int (*notify)(struct file_lock *fl, int result) = NULL;
   struct plock_xop *xop = (struct plock_xop *)op;
   int rv = 0;
  
 @@ -209,7 +209,7 @@ static int dlm_plock_callback(struct plock_op *op)
   notify = xop-callback;
  
   if (op-info.rv) {
 - notify(fl, NULL, op-info.rv);
 + notify(fl, op-info.rv);
   goto out;
   }
  
 @@ -228,7 +228,7 @@ static int dlm_plock_callback(struct plock_op *op)
 (unsigned long long)op-info.number, file, fl);
   }
  
 - rv = notify(fl, NULL, 0);
 + rv = notify(fl, 0);
   if (rv) {
   /* XXX: We need to cancel the fs lock here: */
   log_print(dlm_plock_callback: lock granted after lock request 
 diff --git a/fs/lockd/svclock.c b/fs/lockd/svclock.c
 index ab798a8..2a61701 100644
 --- a/fs/lockd/svclock.c
 +++ b/fs/lockd/svclock.c
 @@ -667,22 +667,16 @@ nlmsvc_cancel_blocked(struct net *net, struct nlm_file 
 *file, struct nlm_lock *l
   * deferred rpc for GETLK and SETLK.
   */
  static void
 -nlmsvc_update_deferred_block(struct nlm_block *block, struct file_lock *conf,
 -  int result)
 +nlmsvc_update_deferred_block(struct nlm_block *block, int result)
  {
   block-b_flags |= B_GOT_CALLBACK;
   if (result == 0)
   block-b_granted = 1;
   else
   block-b_flags |= B_TIMED_OUT;
 - if (conf) {
 - if (block-b_fl)
 - __locks_copy_lock(block-b_fl, conf);
 - }
  }
  
 -static int nlmsvc_grant_deferred(struct file_lock *fl, struct file_lock 
 *conf,
 - int result)
 +static int nlmsvc_grant_deferred(struct file_lock *fl, int result)
  {
   struct nlm_block *block;
   int rc = -ENOENT;
 @@ -697,7 +691,7 @@ static int nlmsvc_grant_deferred(struct file_lock *fl, 
 struct file_lock *conf,
   rc = -ENOLCK;
   break;
   }
 - nlmsvc_update_deferred_block(block, conf, 
 result);
 + nlmsvc_update_deferred_block(block, result);
   } else if (result == 0)
   block-b_granted = 1;
  
 diff --git a/include/linux/fs.h b/include/linux/fs.h
 index 36b8648..6150125 100644
 --- a/include/linux/fs.h
 +++ b/include/linux/fs.h
 @@ -842,7 +842,7 @@ struct lock_manager_operations {
   int (*lm_compare_owner)(struct file_lock *fl1, struct file_lock *fl2);
   unsigned long (*lm_owner_key)(struct file_lock *fl);
   void (*lm_notify)(struct file_lock *fl);/* unblock callback */
 - int (*lm_grant)(struct file_lock *fl, struct file_lock *conf, int 
 result);
 + int (*lm_grant)(struct file_lock *fl, int result);
   void (*lm_break)(struct file_lock *fl);
   int (*lm_change)(struct file_lock **fl, int type);
  };


-- 
Jeff Layton jlay...@poochiereds.net
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH resend #5] fcntl-linux.h: add new definitions and manual updates for open file description locks

2014-07-08 Thread Jeff Layton
From: Jeff Layton jlay...@poochiereds.net

Open file description locks have been merged into the Linux kernel for
v3.15.  Add the appropriate command-value definitions and an update to
the manual that describes their usage.

ChangeLog:

2014-04-24  Jeff Layton  jlay...@poochiereds.net

[BZ#16839]
* manual/llio.texi: add section about open file description locks

* manual/examples/ofdlocks.c:
  example of open file description lock usage

* sysdeps/unix/sysv/linux/bits/fcntl-linux.h:
  (F_OFD_GETLK, F_OFD_SETLK, F_OFD_SETLKW): New macros.
---
 manual/examples/ofdlocks.c |  77 +
 manual/llio.texi   | 241 -
 sysdeps/unix/sysv/linux/bits/fcntl-linux.h |  17 ++
 3 files changed, 332 insertions(+), 3 deletions(-)
 create mode 100644 manual/examples/ofdlocks.c

diff --git a/manual/examples/ofdlocks.c b/manual/examples/ofdlocks.c
new file mode 100644
index ..85e193cdabe6
--- /dev/null
+++ b/manual/examples/ofdlocks.c
@@ -0,0 +1,77 @@
+/* Open File Description Locks Usage Example
+   Copyright (C) 1991-2014 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or
+   modify it under the terms of the GNU General Public License
+   as published by the Free Software Foundation; either version 2
+   of the License, or (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program; if not, see http://www.gnu.org/licenses/.
+*/
+
+#define _GNU_SOURCE
+#include stdio.h
+#include sys/types.h
+#include sys/stat.h
+#include unistd.h
+#include fcntl.h
+#include pthread.h
+
+#define FILENAME   /tmp/foo
+#define NUM_THREADS3
+#define ITERATIONS 5
+
+void *
+thread_start (void *arg)
+{
+  int i, fd, len;
+  long tid = (long) arg;
+  char buf[256];
+  struct flock lck = {
+.l_whence = SEEK_SET,
+.l_start = 0,
+.l_len = 1,
+  };
+
+  fd = open (/tmp/foo, O_RDWR | O_CREAT, 0666);
+
+  for (i = 0; i  ITERATIONS; i++)
+{
+  lck.l_type = F_WRLCK;
+  fcntl (fd, F_OFD_SETLKW, lck);
+
+  len = sprintf (buf, %d: tid=%ld fd=%d\n, i, tid, fd);
+
+  lseek (fd, 0, SEEK_END);
+  write (fd, buf, len);
+  fsync (fd);
+
+  lck.l_type = F_UNLCK;
+  fcntl (fd, F_OFD_SETLK, lck);
+
+  /* sleep to ensure lock is yielded to another thread */
+  usleep (1);
+}
+  pthread_exit (NULL);
+}
+
+int
+main (int argc, char **argv)
+{
+  long i;
+  pthread_t threads[NUM_THREADS];
+
+  truncate (FILENAME, 0);
+
+  for (i = 0; i  NUM_THREADS; i++)
+pthread_create (threads[i], NULL, thread_start, (void *) i);
+
+  pthread_exit (NULL);
+  return 0;
+}
diff --git a/manual/llio.texi b/manual/llio.texi
index 6f8adfc607d7..864060dc7140 100644
--- a/manual/llio.texi
+++ b/manual/llio.texi
@@ -57,6 +57,10 @@ directly.)
  flags associated with open files.
 * File Locks::  Fcntl commands for implementing
  file locking.
+* Open File Description Locks:: Fcntl commands for implementing
+ open file description locking.
+* Open File Description Locks Example:: An example of open file description 
lock
+ usage
 * Interrupt Input:: Getting an asynchronous signal when
  input arrives.
 * IOCTLs::  Generic I/O Control operations.
@@ -2890,7 +2894,7 @@ Get flags associated with the open file.  @xref{File 
Status Flags}.
 Set flags associated with the open file.  @xref{File Status Flags}.
 
 @item F_GETLK
-Get a file lock.  @xref{File Locks}.
+Test a file lock.  @xref{File Locks}.
 
 @item F_SETLK
 Set or clear a file lock.  @xref{File Locks}.
@@ -2898,6 +2902,18 @@ Set or clear a file lock.  @xref{File Locks}.
 @item F_SETLKW
 Like @code{F_SETLK}, but wait for completion.  @xref{File Locks}.
 
+@item F_OFD_GETLK
+Test an open file description lock.  @xref{Open File Description Locks}.
+Specific to Linux.
+
+@item F_OFD_SETLK
+Set or clear an open file description lock.  @xref{Open File Description 
Locks}.
+Specific to Linux.
+
+@item F_OFD_SETLKW
+Like @code{F_OFD_SETLK}, but block until lock is acquired.
+@xref{Open File Description Locks}.  Specific to Linux.
+
 @item F_GETOWN
 Get process or process group ID to receive @code{SIGIO} signals.
 @xref{Interrupt Input}.
@@ -3576,6 +3592,10 @@ set_nonblock_flag (int desc, int value)
 
 @cindex file locks
 @cindex record locking
+This section describes record locks that are associated

Re: [PATCH resend #5] fcntl-linux.h: add new definitions and manual updates for open file description locks

2014-07-08 Thread Jeff Layton
On Tue,  8 Jul 2014 08:16:57 -0400
Jeff Layton jlay...@primarydata.com wrote:

 From: Jeff Layton jlay...@poochiereds.net
 
 Open file description locks have been merged into the Linux kernel for
 v3.15.  Add the appropriate command-value definitions and an update to
 the manual that describes their usage.
 

...and to follow up on this, I'm really getting quite exasperated. This
patch has been done for several months now. There have been no review
comments on it since Ondřej's comment about the changelog in late May.
Michael K. merged the manpage patches months ago. The only remaining
piece is for glibc to merge this patch. Why is it taking so long to get
a simple header file and documentation patch merged?

 ChangeLog:
 
 2014-04-24  Jeff Layton  jlay...@poochiereds.net
 
   [BZ#16839]
   * manual/llio.texi: add section about open file description locks
 
   * manual/examples/ofdlocks.c:
 example of open file description lock usage
 
   * sysdeps/unix/sysv/linux/bits/fcntl-linux.h:
 (F_OFD_GETLK, F_OFD_SETLK, F_OFD_SETLKW): New macros.
 ---
  manual/examples/ofdlocks.c |  77 +
  manual/llio.texi   | 241 
 -
  sysdeps/unix/sysv/linux/bits/fcntl-linux.h |  17 ++
  3 files changed, 332 insertions(+), 3 deletions(-)
  create mode 100644 manual/examples/ofdlocks.c
 
 diff --git a/manual/examples/ofdlocks.c b/manual/examples/ofdlocks.c
 new file mode 100644
 index ..85e193cdabe6
 --- /dev/null
 +++ b/manual/examples/ofdlocks.c
 @@ -0,0 +1,77 @@
 +/* Open File Description Locks Usage Example
 +   Copyright (C) 1991-2014 Free Software Foundation, Inc.
 +
 +   This program is free software; you can redistribute it and/or
 +   modify it under the terms of the GNU General Public License
 +   as published by the Free Software Foundation; either version 2
 +   of the License, or (at your option) any later version.
 +
 +   This program is distributed in the hope that it will be useful,
 +   but WITHOUT ANY WARRANTY; without even the implied warranty of
 +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 +   GNU General Public License for more details.
 +
 +   You should have received a copy of the GNU General Public License
 +   along with this program; if not, see http://www.gnu.org/licenses/.
 +*/
 +
 +#define _GNU_SOURCE
 +#include stdio.h
 +#include sys/types.h
 +#include sys/stat.h
 +#include unistd.h
 +#include fcntl.h
 +#include pthread.h
 +
 +#define FILENAME /tmp/foo
 +#define NUM_THREADS  3
 +#define ITERATIONS   5
 +
 +void *
 +thread_start (void *arg)
 +{
 +  int i, fd, len;
 +  long tid = (long) arg;
 +  char buf[256];
 +  struct flock lck = {
 +.l_whence = SEEK_SET,
 +.l_start = 0,
 +.l_len = 1,
 +  };
 +
 +  fd = open (/tmp/foo, O_RDWR | O_CREAT, 0666);
 +
 +  for (i = 0; i  ITERATIONS; i++)
 +{
 +  lck.l_type = F_WRLCK;
 +  fcntl (fd, F_OFD_SETLKW, lck);
 +
 +  len = sprintf (buf, %d: tid=%ld fd=%d\n, i, tid, fd);
 +
 +  lseek (fd, 0, SEEK_END);
 +  write (fd, buf, len);
 +  fsync (fd);
 +
 +  lck.l_type = F_UNLCK;
 +  fcntl (fd, F_OFD_SETLK, lck);
 +
 +  /* sleep to ensure lock is yielded to another thread */
 +  usleep (1);
 +}
 +  pthread_exit (NULL);
 +}
 +
 +int
 +main (int argc, char **argv)
 +{
 +  long i;
 +  pthread_t threads[NUM_THREADS];
 +
 +  truncate (FILENAME, 0);
 +
 +  for (i = 0; i  NUM_THREADS; i++)
 +pthread_create (threads[i], NULL, thread_start, (void *) i);
 +
 +  pthread_exit (NULL);
 +  return 0;
 +}
 diff --git a/manual/llio.texi b/manual/llio.texi
 index 6f8adfc607d7..864060dc7140 100644
 --- a/manual/llio.texi
 +++ b/manual/llio.texi
 @@ -57,6 +57,10 @@ directly.)
   flags associated with open files.
  * File Locks::  Fcntl commands for implementing
   file locking.
 +* Open File Description Locks:: Fcntl commands for implementing
 + open file description locking.
 +* Open File Description Locks Example:: An example of open file description 
 lock
 + usage
  * Interrupt Input:: Getting an asynchronous signal when
   input arrives.
  * IOCTLs::  Generic I/O Control operations.
 @@ -2890,7 +2894,7 @@ Get flags associated with the open file.  @xref{File 
 Status Flags}.
  Set flags associated with the open file.  @xref{File Status Flags}.
  
  @item F_GETLK
 -Get a file lock.  @xref{File Locks}.
 +Test a file lock.  @xref{File Locks}.
  
  @item F_SETLK
  Set or clear a file lock.  @xref{File Locks}.
 @@ -2898,6 +2902,18 @@ Set or clear a file lock.  @xref{File Locks}.
  @item F_SETLKW
  Like @code{F_SETLK}, but wait for completion.  @xref{File Locks}.
  
 +@item F_OFD_GETLK
 +Test an open file

[PATCH resend #6] fcntl-linux.h: add new definitions and manual updates for open file description locks

2014-07-14 Thread Jeff Layton
From: Jeff Layton jlay...@poochiereds.net

Open file description locks have been merged into the Linux kernel for
v3.15.  Add the appropriate command-value definitions and an update to
the manual that describes their usage.

ChangeLog:

2014-04-24  Jeff Layton  jlay...@poochiereds.net

[BZ#16839]
* manual/llio.texi: add section about open file description locks

* manual/examples/ofdlocks.c:
  example of open file description lock usage

* sysdeps/unix/sysv/linux/bits/fcntl-linux.h:
  (F_OFD_GETLK, F_OFD_SETLK, F_OFD_SETLKW): New macros.
---
 manual/examples/ofdlocks.c |  77 +
 manual/llio.texi   | 241 -
 sysdeps/unix/sysv/linux/bits/fcntl-linux.h |  17 ++
 3 files changed, 332 insertions(+), 3 deletions(-)
 create mode 100644 manual/examples/ofdlocks.c

diff --git a/manual/examples/ofdlocks.c b/manual/examples/ofdlocks.c
new file mode 100644
index ..85e193cdabe6
--- /dev/null
+++ b/manual/examples/ofdlocks.c
@@ -0,0 +1,77 @@
+/* Open File Description Locks Usage Example
+   Copyright (C) 1991-2014 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or
+   modify it under the terms of the GNU General Public License
+   as published by the Free Software Foundation; either version 2
+   of the License, or (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program; if not, see http://www.gnu.org/licenses/.
+*/
+
+#define _GNU_SOURCE
+#include stdio.h
+#include sys/types.h
+#include sys/stat.h
+#include unistd.h
+#include fcntl.h
+#include pthread.h
+
+#define FILENAME   /tmp/foo
+#define NUM_THREADS3
+#define ITERATIONS 5
+
+void *
+thread_start (void *arg)
+{
+  int i, fd, len;
+  long tid = (long) arg;
+  char buf[256];
+  struct flock lck = {
+.l_whence = SEEK_SET,
+.l_start = 0,
+.l_len = 1,
+  };
+
+  fd = open (/tmp/foo, O_RDWR | O_CREAT, 0666);
+
+  for (i = 0; i  ITERATIONS; i++)
+{
+  lck.l_type = F_WRLCK;
+  fcntl (fd, F_OFD_SETLKW, lck);
+
+  len = sprintf (buf, %d: tid=%ld fd=%d\n, i, tid, fd);
+
+  lseek (fd, 0, SEEK_END);
+  write (fd, buf, len);
+  fsync (fd);
+
+  lck.l_type = F_UNLCK;
+  fcntl (fd, F_OFD_SETLK, lck);
+
+  /* sleep to ensure lock is yielded to another thread */
+  usleep (1);
+}
+  pthread_exit (NULL);
+}
+
+int
+main (int argc, char **argv)
+{
+  long i;
+  pthread_t threads[NUM_THREADS];
+
+  truncate (FILENAME, 0);
+
+  for (i = 0; i  NUM_THREADS; i++)
+pthread_create (threads[i], NULL, thread_start, (void *) i);
+
+  pthread_exit (NULL);
+  return 0;
+}
diff --git a/manual/llio.texi b/manual/llio.texi
index 6f8adfc607d7..864060dc7140 100644
--- a/manual/llio.texi
+++ b/manual/llio.texi
@@ -57,6 +57,10 @@ directly.)
  flags associated with open files.
 * File Locks::  Fcntl commands for implementing
  file locking.
+* Open File Description Locks:: Fcntl commands for implementing
+ open file description locking.
+* Open File Description Locks Example:: An example of open file description 
lock
+ usage
 * Interrupt Input:: Getting an asynchronous signal when
  input arrives.
 * IOCTLs::  Generic I/O Control operations.
@@ -2890,7 +2894,7 @@ Get flags associated with the open file.  @xref{File 
Status Flags}.
 Set flags associated with the open file.  @xref{File Status Flags}.
 
 @item F_GETLK
-Get a file lock.  @xref{File Locks}.
+Test a file lock.  @xref{File Locks}.
 
 @item F_SETLK
 Set or clear a file lock.  @xref{File Locks}.
@@ -2898,6 +2902,18 @@ Set or clear a file lock.  @xref{File Locks}.
 @item F_SETLKW
 Like @code{F_SETLK}, but wait for completion.  @xref{File Locks}.
 
+@item F_OFD_GETLK
+Test an open file description lock.  @xref{Open File Description Locks}.
+Specific to Linux.
+
+@item F_OFD_SETLK
+Set or clear an open file description lock.  @xref{Open File Description 
Locks}.
+Specific to Linux.
+
+@item F_OFD_SETLKW
+Like @code{F_OFD_SETLK}, but block until lock is acquired.
+@xref{Open File Description Locks}.  Specific to Linux.
+
 @item F_GETOWN
 Get process or process group ID to receive @code{SIGIO} signals.
 @xref{Interrupt Input}.
@@ -3576,6 +3592,10 @@ set_nonblock_flag (int desc, int value)
 
 @cindex file locks
 @cindex record locking
+This section describes record locks that are associated

[PATCH] net: clean up some sparse endianness warnings in ipv6.h

2014-07-14 Thread Jeff Layton
sparse is throwing warnings when building sunrpc modules due to some
endianness shenanigans in ipv6.h. Sprinkle some endianness fixups to
silence them. These should all get fixed up at compile time, so I don't
think this will add any extra work to be done at runtime.

Signed-off-by: Jeff Layton jlay...@primarydata.com
---
 include/net/ipv6.h | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 574337fe72dd..5ed2c24fe950 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -557,9 +557,9 @@ static inline u32 __ipv6_addr_jhash(const struct in6_addr 
*a, const u32 initval)
 static inline bool ipv6_addr_loopback(const struct in6_addr *a)
 {
 #if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)  BITS_PER_LONG == 64
-   const unsigned long *ul = (const unsigned long *)a;
+   const __be64 *be = (const __be64 *)a;
 
-   return (ul[0] | (ul[1] ^ cpu_to_be64(1))) == 0UL;
+   return (be[0] | (be[1] ^ cpu_to_be64(1))) == cpu_to_be64(0UL);
 #else
return (a-s6_addr32[0] | a-s6_addr32[1] |
a-s6_addr32[2] | (a-s6_addr32[3] ^ htonl(1))) == 0;
@@ -570,11 +570,11 @@ static inline bool ipv6_addr_v4mapped(const struct 
in6_addr *a)
 {
return (
 #if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)  BITS_PER_LONG == 64
-   *(__be64 *)a |
+   (*(__be64 *)a != cpu_to_be64(0)) |
 #else
-   (a-s6_addr32[0] | a-s6_addr32[1]) |
+   ((a-s6_addr32[0] | a-s6_addr32[1]) != cpu_to_be32(0)) |
 #endif
-   (a-s6_addr32[2] ^ htonl(0x))) == 0UL;
+   ((a-s6_addr32[2] ^ htonl(0x)) == cpu_to_be32(0)));
 }
 
 /*
-- 
1.9.3

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] nfsd: silence sparse warning about accessing credentials

2014-07-15 Thread Jeff Layton
sparse says:

fs/nfsd/auth.c:31:38: warning: incorrect type in argument 1 (different 
address spaces)
fs/nfsd/auth.c:31:38:expected struct cred const *cred
fs/nfsd/auth.c:31:38:got struct cred const [noderef] asn:4*real_cred

Add a new accessor for the -real_cred and use that to fetch the
pointer. Accessing current-real_cred directly is actually quite safe
since we know that they can't go away so this is mostly a cosmetic fixup
to silence sparse.

Signed-off-by: Jeff Layton jlay...@primarydata.com
---
 fs/nfsd/auth.c   | 2 +-
 include/linux/cred.h | 9 +
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/fs/nfsd/auth.c b/fs/nfsd/auth.c
index 72f44823adbb..9d46a0bdd9f9 100644
--- a/fs/nfsd/auth.c
+++ b/fs/nfsd/auth.c
@@ -28,7 +28,7 @@ int nfsd_setuser(struct svc_rqst *rqstp, struct svc_export 
*exp)
validate_process_creds();
 
/* discard any old override before preparing the new set */
-   revert_creds(get_cred(current-real_cred));
+   revert_creds(get_cred(current_real_cred()));
new = prepare_creds();
if (!new)
return -ENOMEM;
diff --git a/include/linux/cred.h b/include/linux/cred.h
index f61d6c8f5ef3..b2d0820837c4 100644
--- a/include/linux/cred.h
+++ b/include/linux/cred.h
@@ -259,6 +259,15 @@ static inline void put_cred(const struct cred *_cred)
rcu_dereference_protected(current-cred, 1)
 
 /**
+ * current_real_cred - Access the current task's objective credentials
+ *
+ * Access the objective credentials of the current task.  RCU-safe,
+ * since nobody else can modify it.
+ */
+#define current_real_cred() \
+   rcu_dereference_protected(current-real_cred, 1)
+
+/**
  * __task_cred - Access a task's objective credentials
  * @task: The task to query
  *
-- 
1.9.3

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] net: clean up some sparse endianness warnings in ipv6.h

2014-07-15 Thread Jeff Layton
On Tue, 15 Jul 2014 10:01:07 -0700
Christoph Hellwig h...@infradead.org wrote:

   #if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)  BITS_PER_LONG == 64
  -   const unsigned long *ul = (const unsigned long *)a;
  +   const __be64 *be = (const __be64 *)a;
   
  -   return (ul[0] | (ul[1] ^ cpu_to_be64(1))) == 0UL;
  +   return (be[0] | (be[1] ^ cpu_to_be64(1))) == cpu_to_be64(0UL);
 
 Do you need the swap for 0UL?  I know sparse treats 0 as special, so why
 wouldn't it treat 0UL special?  Or just remove the 0UL postfix, no need
 for it in a simple comparism.
 
 Otherwise looks fine to me.

Maybe not, I did it for completeness sake. I'll see if I can remove
that. The macros do the conversion at compile time though so it
shouldn't hurt anything either way.

-- 
Jeff Layton jlay...@primarydata.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] net: clean up some sparse endianness warnings in ipv6.h

2014-07-15 Thread Jeff Layton
On Mon, 14 Jul 2014 08:25:46 -0400
Jeff Layton jlay...@primarydata.com wrote:

 sparse is throwing warnings when building sunrpc modules due to some
 endianness shenanigans in ipv6.h. Sprinkle some endianness fixups to
 silence them. These should all get fixed up at compile time, so I don't
 think this will add any extra work to be done at runtime.
 
 Signed-off-by: Jeff Layton jlay...@primarydata.com
 ---
  include/net/ipv6.h | 10 +-
  1 file changed, 5 insertions(+), 5 deletions(-)
 
 diff --git a/include/net/ipv6.h b/include/net/ipv6.h
 index 574337fe72dd..5ed2c24fe950 100644
 --- a/include/net/ipv6.h
 +++ b/include/net/ipv6.h
 @@ -557,9 +557,9 @@ static inline u32 __ipv6_addr_jhash(const struct in6_addr 
 *a, const u32 initval)
  static inline bool ipv6_addr_loopback(const struct in6_addr *a)
  {
  #if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)  BITS_PER_LONG == 64
 - const unsigned long *ul = (const unsigned long *)a;
 + const __be64 *be = (const __be64 *)a;
  
 - return (ul[0] | (ul[1] ^ cpu_to_be64(1))) == 0UL;
 + return (be[0] | (be[1] ^ cpu_to_be64(1))) == cpu_to_be64(0UL);
  #else
   return (a-s6_addr32[0] | a-s6_addr32[1] |
   a-s6_addr32[2] | (a-s6_addr32[3] ^ htonl(1))) == 0;
 @@ -570,11 +570,11 @@ static inline bool ipv6_addr_v4mapped(const struct 
 in6_addr *a)
  {
   return (
  #if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)  BITS_PER_LONG == 64
 - *(__be64 *)a |
 + (*(__be64 *)a != cpu_to_be64(0)) |
  #else
 - (a-s6_addr32[0] | a-s6_addr32[1]) |
 + ((a-s6_addr32[0] | a-s6_addr32[1]) != cpu_to_be32(0)) |
  #endif
 - (a-s6_addr32[2] ^ htonl(0x))) == 0UL;
 + ((a-s6_addr32[2] ^ htonl(0x)) == cpu_to_be32(0)));
  }
  
  /*

Oof. I think I had the ipv6_addr_v4mapped changes wrong before due to
misreading the code. The existing code basically takes the different
values, does a bunch of bitwise operations on them and then compares
the result to 0.

My patch had it doing more than one comparison. I've got a fixed
version, but it does mean that we need to use a __force cast in one
place. I'll resend once I've tested it our some more.

Sorry for the noise!
-- 
Jeff Layton jlay...@primarydata.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v4 00/13] locks: implement filp-private (aka UNPOSIX) locks

2013-12-19 Thread Jeff Layton
This patchset is the fourth posting of this set. Behavior between this
set and the last should be more or less the same. Here is a summary of
changes:

v4:
- prefixed the set with the rest of the locks.c patches I have.
- added patch to get rid of BUG() call in locks_remove_flock. I think
  a WARN  + delete the lock is sufficient there.
- added patches from Bruce to consolidate the struct flock/flock64
  to file_lock conversion code
- fixed locks_remove_file not to assume that filp-private locks won't
  be added on filesystems that have a -lock method.

v3:
- more consolidation of common code between flock_to_posix_lock and
  flock_to_posix_lock64
- better bisectability by reordering changes, such that partial
  implementation won't be exposed
- s/filp/file/ s/FILP/FILE/ in symbol names

v2:
- inheritance semantics have been change to be more BSD-lock like
- patchset size has been reduced by changing how lock ownership
  is handled
- new F_UNLCKP l_type value has been added

On the last set, I made note that we could consider implementing this
with new cmd values instead of new l_type values. That's still doable
but I haven't made that change in this set. I'm still open to that
change, but I'd like to hear from others as to which they'd prefer.

Note too that I've gone ahead and opened a request for the POSIX folks
to consider adding this to the POSIX spec once we have something
mergeable. They seem amenable to the idea but don't want to enshrine it
into the standard until there's a real implementation of it:

http://austingroupbugs.net/view.php?id=768

I also owe them a better writeup of the semantics for these new locks,
but have been holding off on doing that until they're a little more
settled.

Original cover letter from v1 posting follows. Comments and suggestions
welcome.

---[snip]--

At LSF this year, there was a discussion about the wishlist for
userland file servers. One of the things brought up was the goofy and
problematic behavior of POSIX locks when a file is closed. Boaz started
a thread on it here:

http://permalink.gmane.org/gmane.linux.file-systems/73364

Userland fileservers often need to maintain more than one open file
descriptor on a file. The POSIX spec says:

All locks associated with a file for a given process shall be removed
 when a file descriptor for that file is closed by that process or the
 process holding that file descriptor terminates.

This is problematic since you can't close any file descriptor without
dropping all your POSIX locks. Most userland file servers therefore
end up opening the file with more access than is really necessary, and
keeping fd's open for longer than is necessary to work around this.

This patchset is a first stab at an approach to address this problem by
adding two new l_type values -- F_RDLCKP and F_WRLCKP (the 'P' is short
for private -- I'm open to changing that if you have a better
mnemonic).

For all intents and purposes these lock types act just like their
non-P counterpart. The difference is that they are only implicitly
released when the fd against which they were acquired is closed. As a
side effect, these locks cannot be merged with non-P locks since they
have different semantics on close.

I've given this patchset some very basic smoke testing and it seems to
do the right thing, but it is still pretty rough. If this looks
reasonable I'll plan to do some documentation updates and will take a
stab at trying to get these new lock types added to the POSIX spec (as
HCH recommended).

At this point, my main questions are:

1) does this look useful, particularly for fileserver implementors?

2) does this look OK API-wise? We could consider different cmd values
   or even different syscalls, but I figured this makes it clearer that
   P and non-P locks will still conflict with one another.


J. Bruce Fields (2):
  locks: consolidate common code in the flock_to_posix_lock routines
  locks: simplify overflow checking

Jeff Layton (11):
  locks: close potential race between setlease and open
  locks: clean up comment typo
  locks: remove inline qualifier from fl_link manipulation functions
  locks: add __acquires and __releases annotations to locks_start and
locks_stop
  locks: eliminate BUG() call when there's an unexpected lock on file
close
  locks: consolidate checks for compatible filp-f_mode values in setlk
handlers
  locks: don't reference original flock struct in F_GETLK handlers
  locks: rename locks_remove_flock to locks_remove_file
  locks: show private lock types in /proc/locks
  locks: report l_pid as -1 for FL_FILE_PVT locks
  locks: add new private lock type that is owned by the filp

 fs/file_table.c  |   2 +-
 fs/locks.c   | 321 ++-
 include/linux/fs.h   |  11 +-
 include/uapi/asm-generic/fcntl.h |  19 ++-
 4 files changed, 204 insertions(+), 149 deletions

[PATCH v4 01/13] locks: close potential race between setlease and open

2013-12-19 Thread Jeff Layton
v2:
- fix potential double-free of lease if second check finds conflict
- add smp_mb's to ensure that other CPUs see i_flock changes

v3:
- remove smp_mb calls. Partial ordering is unlikely to help here.

v4:
- add back smp_mb calls. While we have implicit barriers in place
  that enforce this today, it's best to be explicit about it as a
  defensive coding measure.

As Al Viro points out, there is an unlikely, but possible race between
opening a file and setting a lease on it. generic_add_lease is done with
the i_lock held, but the inode-i_flock check in break_lease is
lockless. It's possible for another task doing an open to do the entire
pathwalk and call break_lease between the point where generic_add_lease
checks for a conflicting open and adds the lease to the list. If this
occurs, we can end up with a lease set on the file with a conflicting
open.

To guard against that, check again for a conflicting open after adding
the lease to the i_flock list. If the above race occurs, then we can
simply unwind the lease setting and return -EAGAIN.

Because we take dentry references and acquire write access on the file
before calling break_lease, we know that if the i_flock list is empty
when the open caller goes to check it then the necessary refcounts have
already been incremented. Thus the additional check for a conflicting
open will see that there is one and the setlease call will fail.

Cc: Bruce Fields bfie...@fieldses.org
Cc: David Howells dhowe...@redhat.com
Cc: Paul E. McKenney paul...@linux.vnet.ibm.com
Reported-by: Al Viro v...@zeniv.linux.org.uk
Signed-off-by: Jeff Layton jlay...@redhat.com
Signed-off-by: J. Bruce Fields bfie...@redhat.com
---
 fs/locks.c | 75 --
 include/linux/fs.h |  6 +
 2 files changed, 68 insertions(+), 13 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index 92a0f0a..2cfeea6 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -652,15 +652,18 @@ static void locks_insert_lock(struct file_lock **pos, 
struct file_lock *fl)
locks_insert_global_locks(fl);
 }
 
-/*
- * Delete a lock and then free it.
- * Wake up processes that are blocked waiting for this lock,
- * notify the FS that the lock has been cleared and
- * finally free the lock.
+/**
+ * locks_delete_lock - Delete a lock and then free it.
+ * @thisfl_p: pointer that points to the fl_next field of the previous
+ *   inode-i_flock list entry
+ *
+ * Unlink a lock from all lists and free the namespace reference, but don't
+ * free it yet. Wake up processes that are blocked waiting for this lock and
+ * notify the FS that the lock has been cleared.
  *
  * Must be called with the i_lock held!
  */
-static void locks_delete_lock(struct file_lock **thisfl_p)
+static void locks_unlink_lock(struct file_lock **thisfl_p)
 {
struct file_lock *fl = *thisfl_p;
 
@@ -675,6 +678,18 @@ static void locks_delete_lock(struct file_lock **thisfl_p)
}
 
locks_wake_up_blocks(fl);
+}
+
+/*
+ * Unlink a lock from all lists and free it.
+ *
+ * Must be called with i_lock held!
+ */
+static void locks_delete_lock(struct file_lock **thisfl_p)
+{
+   struct file_lock *fl = *thisfl_p;
+
+   locks_unlink_lock(thisfl_p);
locks_free_lock(fl);
 }
 
@@ -1472,6 +1487,32 @@ int fcntl_getlease(struct file *filp)
return type;
 }
 
+/**
+ * check_conflicting_open - see if the given dentry points to a file that has
+ * an existing open that would conflict with the
+ * desired lease.
+ * @dentry:dentry to check
+ * @arg:   type of lease that we're trying to acquire
+ *
+ * Check to see if there's an existing open fd on this file that would
+ * conflict with the lease we're trying to set.
+ */
+static int
+check_conflicting_open(const struct dentry *dentry, const long arg)
+{
+   int ret = 0;
+   struct inode *inode = dentry-d_inode;
+
+   if ((arg == F_RDLCK)  (atomic_read(inode-i_writecount)  0))
+   return -EAGAIN;
+
+   if ((arg == F_WRLCK)  ((d_count(dentry)  1) ||
+   (atomic_read(inode-i_count)  1)))
+   ret = -EAGAIN;
+
+   return ret;
+}
+
 static int generic_add_lease(struct file *filp, long arg, struct file_lock 
**flp)
 {
struct file_lock *fl, **before, **my_before = NULL, *lease;
@@ -1499,12 +1540,8 @@ static int generic_add_lease(struct file *filp, long 
arg, struct file_lock **flp
return -EINVAL;
}
 
-   error = -EAGAIN;
-   if ((arg == F_RDLCK)  (atomic_read(inode-i_writecount)  0))
-   goto out;
-   if ((arg == F_WRLCK)
-((d_count(dentry)  1)
-   || (atomic_read(inode-i_count)  1)))
+   error = check_conflicting_open(dentry, arg);
+   if (error)
goto out;
 
/*
@@ -1549,7 +1586,19 @@ static int generic_add_lease(struct file *filp, long 
arg, struct file_lock **flp
goto out

[PATCH v4 13/13] locks: add new private lock type that is owned by the filp

2013-12-19 Thread Jeff Layton
Due to some unfortunate history, POSIX locks have very strange and
unhelpful semantics. The thing that usually catches people by surprise
is that they are dropped whenever the process closes any file descriptor
associated with the inode.

This is extremely problematic for people developing file servers that
need to implement byte-range locks. Developers often need a lock
management facility to ensure that file descriptors are not closed
until all of the locks associated with the inode are finished.

Additionally, classic POSIX locks are owned by the process. Locks
taken between threads within the same process won't conflict with one
another, which renders them useless for synchronization between threads.

This patchset adds a new type of lock that attempts to address these
issues. These locks conflict with classic POSIX read/write locks, but
have semantics that are more like BSD locks with respect to inheritance
and behavior on close.

This is implemented primarily by changing how fl_owner field is set for
these locks. Instead of having them owned by the files_struct of the
process, they are instead owned by the filp on which they were acquired.
Thus, they are inherited across fork() and are only released when the
last reference to a filp is put.

These new semantics prevent them from being merged with classic POSIX
locks, even if they are acquired by the same process. These locks will
also conflict with classic POSIX locks even if they are acquired by
the same process or on the same file descriptor.

Signed-off-by: Jeff Layton jlay...@redhat.com
---
 fs/locks.c   | 34 --
 include/uapi/asm-generic/fcntl.h | 16 
 2 files changed, 48 insertions(+), 2 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index 013b177..bd2d824 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -381,7 +381,6 @@ static int flock64_to_posix_lock(struct file *filp, struct 
file_lock *fl,
} else
fl-fl_end = OFFSET_MAX;
 
-   fl-fl_owner = current-files;
fl-fl_pid = current-tgid;
fl-fl_file = filp;
fl-fl_flags = FL_POSIX;
@@ -391,16 +390,45 @@ static int flock64_to_posix_lock(struct file *filp, 
struct file_lock *fl,
/* Ensure that fl-fl_filp has compatible f_mode */
switch (l-l_type) {
case F_RDLCK:
+   case F_RDLCKP:
if (!(filp-f_mode  FMODE_READ))
return -EBADF;
break;
case F_WRLCK:
+   case F_WRLCKP:
if (!(filp-f_mode  FMODE_WRITE))
return -EBADF;
break;
}
 
-   return assign_type(fl, l-l_type);
+   /*
+* FL_FILE_PVT locks are owned by the filp upon which they were
+* acquired, regardless of what task is dealing with them. Set the
+* fl_owner appropriately and flag them as private.
+*/
+   switch(l-l_type) {
+   case F_RDLCKP:
+   fl-fl_owner = (fl_owner_t)filp;
+   fl-fl_type = F_RDLCK;
+   fl-fl_flags |= FL_FILE_PVT;
+   break;
+   case F_WRLCKP:
+   fl-fl_owner = (fl_owner_t)filp;
+   fl-fl_type = F_WRLCK;
+   fl-fl_flags |= FL_FILE_PVT;
+   break;
+   case F_UNLCKP:
+   fl-fl_owner = (fl_owner_t)filp;
+   fl-fl_type = F_UNLCK;
+   fl-fl_flags |= FL_FILE_PVT;
+   break;
+   default:
+   /* Any other POSIX lock is owned by the file_struct */
+   fl-fl_owner = current-files;
+   return assign_type(fl, l-l_type);
+   }
+
+   return 0;
 }
 
 /* Verify a struct flock and copy it to a struct file_lock as a POSIX
@@ -2207,6 +2235,8 @@ void locks_remove_file(struct file *filp)
if (!inode-i_flock)
return;
 
+   locks_remove_posix(filp, (fl_owner_t)filp);
+
if (filp-f_op-flock) {
struct file_lock fl = {
.fl_pid = current-tgid,
diff --git a/include/uapi/asm-generic/fcntl.h b/include/uapi/asm-generic/fcntl.h
index 36025f7..25eb7be 100644
--- a/include/uapi/asm-generic/fcntl.h
+++ b/include/uapi/asm-generic/fcntl.h
@@ -151,6 +151,22 @@ struct f_owner_ex {
 #define F_UNLCK2
 #endif
 
+/*
+ * fd private POSIX locks.
+ *
+ * Usually POSIX locks held by a process are released on *any* close and are
+ * not inherited across a fork().
+ *
+ * These lock types will conflict with normal POSIX locks, but are owned
+ * by the opened file, not the process. This means that they are inherited
+ * across fork() like BSD (flock) locks, and they are only released
+ * automatically when the last reference to the the open file against which
+ * they were acquired is put.
+ */
+#define F_RDLCKP   5
+#define F_WRLCKP   6
+#define F_UNLCKP   7
+
 /* for old implementation of bsd flock () */
 #ifndef F_EXLCK
 #define F_EXLCK4

[PATCH v4 04/13] locks: add __acquires and __releases annotations to locks_start and locks_stop

2013-12-19 Thread Jeff Layton
...to make sparse happy.

Signed-off-by: Jeff Layton jlay...@redhat.com
---
 fs/locks.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/locks.c b/fs/locks.c
index 049a144..6084f5a 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -2430,6 +2430,7 @@ static int locks_show(struct seq_file *f, void *v)
 }
 
 static void *locks_start(struct seq_file *f, loff_t *pos)
+   __acquires(blocked_lock_lock)
 {
struct locks_iterator *iter = f-private;
 
@@ -2448,6 +2449,7 @@ static void *locks_next(struct seq_file *f, void *v, 
loff_t *pos)
 }
 
 static void locks_stop(struct seq_file *f, void *v)
+   __releases(blocked_lock_lock)
 {
spin_unlock(blocked_lock_lock);
lg_global_unlock(file_lock_lglock);
-- 
1.8.4.2

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v4 08/13] locks: consolidate checks for compatible filp-f_mode values in setlk handlers

2013-12-19 Thread Jeff Layton
Move this check into flock64_to_posix_lock instead of duplicating it in
two places. This also fixes a minor wart in the code where we continue
referring to the struct flock after converting it to struct file_lock.

Signed-off-by: Jeff Layton jlay...@redhat.com
---
 fs/locks.c | 46 --
 1 file changed, 12 insertions(+), 34 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index 1555b05..820322d 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -387,6 +387,18 @@ static int flock64_to_posix_lock(struct file *filp, struct 
file_lock *fl,
fl-fl_ops = NULL;
fl-fl_lmops = NULL;
 
+   /* Ensure that fl-fl_filp has compatible f_mode */
+   switch (l-l_type) {
+   case F_RDLCK:
+   if (!(filp-f_mode  FMODE_READ))
+   return -EBADF;
+   break;
+   case F_WRLCK:
+   if (!(filp-f_mode  FMODE_WRITE))
+   return -EBADF;
+   break;
+   }
+
return assign_type(fl, l-l_type);
 }
 
@@ -2024,23 +2036,6 @@ again:
file_lock-fl_flags |= FL_SLEEP;
}

-   error = -EBADF;
-   switch (flock.l_type) {
-   case F_RDLCK:
-   if (!(filp-f_mode  FMODE_READ))
-   goto out;
-   break;
-   case F_WRLCK:
-   if (!(filp-f_mode  FMODE_WRITE))
-   goto out;
-   break;
-   case F_UNLCK:
-   break;
-   default:
-   error = -EINVAL;
-   goto out;
-   }
-
error = do_lock_file_wait(filp, cmd, file_lock);
 
/*
@@ -2142,23 +2137,6 @@ again:
file_lock-fl_flags |= FL_SLEEP;
}

-   error = -EBADF;
-   switch (flock.l_type) {
-   case F_RDLCK:
-   if (!(filp-f_mode  FMODE_READ))
-   goto out;
-   break;
-   case F_WRLCK:
-   if (!(filp-f_mode  FMODE_WRITE))
-   goto out;
-   break;
-   case F_UNLCK:
-   break;
-   default:
-   error = -EINVAL;
-   goto out;
-   }
-
error = do_lock_file_wait(filp, cmd, file_lock);
 
/*
-- 
1.8.4.2

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v4 10/13] locks: rename locks_remove_flock to locks_remove_file

2013-12-19 Thread Jeff Layton
This function currently removes leases in addition to flock locks and in
a later patch we'll have it deal with a new type of POSIX lock too.
Rename it to locks_remove_file to indicate that it removes locks that
are associated with a particular struct file, and not just flock locks.

Signed-off-by: Jeff Layton jlay...@redhat.com
---
 fs/file_table.c| 2 +-
 fs/locks.c | 2 +-
 include/linux/fs.h | 4 ++--
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/file_table.c b/fs/file_table.c
index 5fff903..468543c 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -234,7 +234,7 @@ static void __fput(struct file *file)
 * in the file cleanup chain.
 */
eventpoll_release(file);
-   locks_remove_flock(file);
+   locks_remove_file(file);
 
if (unlikely(file-f_flags  FASYNC)) {
if (file-f_op-fasync)
diff --git a/fs/locks.c b/fs/locks.c
index 8180141..a217c2c 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -2197,7 +2197,7 @@ EXPORT_SYMBOL(locks_remove_posix);
 /*
  * This function is called on the last close of an open file.
  */
-void locks_remove_flock(struct file *filp)
+void locks_remove_file(struct file *filp)
 {
struct inode * inode = file_inode(filp);
struct file_lock *fl;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 04be202..51974e0 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1012,7 +1012,7 @@ extern struct file_lock * locks_alloc_lock(void);
 extern void locks_copy_lock(struct file_lock *, struct file_lock *);
 extern void __locks_copy_lock(struct file_lock *, const struct file_lock *);
 extern void locks_remove_posix(struct file *, fl_owner_t);
-extern void locks_remove_flock(struct file *);
+extern void locks_remove_file(struct file *);
 extern void locks_release_private(struct file_lock *);
 extern void posix_test_lock(struct file *, struct file_lock *);
 extern int posix_lock_file(struct file *, struct file_lock *, struct file_lock 
*);
@@ -1083,7 +1083,7 @@ static inline void locks_remove_posix(struct file *filp, 
fl_owner_t owner)
return;
 }
 
-static inline void locks_remove_flock(struct file *filp)
+static inline void locks_remove_file(struct file *filp)
 {
return;
 }
-- 
1.8.4.2

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v4 11/13] locks: show private lock types in /proc/locks

2013-12-19 Thread Jeff Layton
In a later patch, we'll be adding a new type of lock that's owned by
the struct file instead of the files_struct. Those sorts of locks
will be flagged with a new IS_FILE_PVT flag.

Show private locks in /proc/locks with a 'P' suffix. This does mean
that we need to widen the column by one character.

Signed-off-by: Jeff Layton jlay...@redhat.com
---
 fs/locks.c | 9 ++---
 include/linux/fs.h | 1 +
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index a217c2c..995583b 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -135,6 +135,7 @@
 #define IS_POSIX(fl)   (fl-fl_flags  FL_POSIX)
 #define IS_FLOCK(fl)   (fl-fl_flags  FL_FLOCK)
 #define IS_LEASE(fl)   (fl-fl_flags  (FL_LEASE|FL_DELEG))
+#define IS_FILE_PVT(fl)(fl-fl_flags  FL_FILE_PVT)
 
 static bool lease_breaking(struct file_lock *fl)
 {
@@ -2336,13 +2337,15 @@ static void lock_get_status(struct seq_file *f, struct 
file_lock *fl,
seq_printf(f, UNKNOWN UNKNOWN  );
}
if (fl-fl_type  LOCK_MAND) {
-   seq_printf(f, %s ,
+   seq_printf(f, %s  ,
   (fl-fl_type  LOCK_READ)
   ? (fl-fl_type  LOCK_WRITE) ? RW: READ 
   : (fl-fl_type  LOCK_WRITE) ? WRITE : NONE 
);
+   } else if (IS_FILE_PVT(fl)) {
+   seq_printf(f, %s , fl-fl_type == F_WRLCK ?  WRITEP : 
READP );
} else {
-   seq_printf(f, %s ,
-  (lease_breaking(fl))
+   seq_printf(f, %s  ,
+  lease_breaking(fl)
   ? (fl-fl_type == F_UNLCK) ? UNLCK : READ 
   : (fl-fl_type == F_WRLCK) ? WRITE : READ );
}
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 51974e0..639a3b7 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -888,6 +888,7 @@ static inline int file_check_writeable(struct file *filp)
 #define FL_SLEEP   128 /* A blocking lock */
 #define FL_DOWNGRADE_PENDING   256 /* Lease is being downgraded */
 #define FL_UNLOCK_PENDING  512 /* Lease is being broken */
+#define FL_FILE_PVT1024/* lock is private to the file */
 
 /*
  * Special return value from posix_lock_file() and vfs_lock_file() for
-- 
1.8.4.2

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v4 12/13] locks: report l_pid as -1 for FL_FILE_PVT locks

2013-12-19 Thread Jeff Layton
FL_FILE_PVT locks are no longer tied to a particular pid, and are
instead inheritable by child processes. Report a l_pid of '-1' for
these sorts of locks since the pid is somewhat meaningless for them.

This precedent comes from FreeBSD. There, POSIX and flock() locks can
conflict with one another. If fcntl(F_GETLK, ...) returns a lock set
with flock() then the l_pid member cannot be a process ID because the
lock is not held by a process as such.

Signed-off-by: Jeff Layton jlay...@redhat.com
---
 fs/locks.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index 995583b..013b177 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -1863,7 +1863,7 @@ EXPORT_SYMBOL_GPL(vfs_test_lock);
 
 static int posix_lock_to_flock(struct flock *flock, struct file_lock *fl)
 {
-   flock-l_pid = fl-fl_pid;
+   flock-l_pid = IS_FILE_PVT(fl) ? -1 : fl-fl_pid;
 #if BITS_PER_LONG == 32
/*
 * Make sure we can represent the posix lock via
@@ -1885,7 +1885,7 @@ static int posix_lock_to_flock(struct flock *flock, 
struct file_lock *fl)
 #if BITS_PER_LONG == 32
 static void posix_lock_to_flock64(struct flock64 *flock, struct file_lock *fl)
 {
-   flock-l_pid = fl-fl_pid;
+   flock-l_pid = IS_FILE_PVT(fl) ? -1 : fl-fl_pid;
flock-l_start = fl-fl_start;
flock-l_len = fl-fl_end == OFFSET_MAX ? 0 :
fl-fl_end - fl-fl_start + 1;
-- 
1.8.4.2

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v4 09/13] locks: don't reference original flock struct in F_GETLK handlers

2013-12-19 Thread Jeff Layton
The locks code currently sanity checks the type values in the flock
struct before doing the flock-file_lock conversion. That will be
problematic when new l_type values are introduced in a later patch.

Instead, do the flock_to_posix_lock conversion first, and then sanity
check the values in the file_lock instead.

Signed-off-by: Jeff Layton jlay...@redhat.com
---
 fs/locks.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index 820322d..8180141 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -1905,14 +1905,15 @@ int fcntl_getlk(struct file *filp, struct flock __user 
*l)
error = -EFAULT;
if (copy_from_user(flock, l, sizeof(flock)))
goto out;
-   error = -EINVAL;
-   if ((flock.l_type != F_RDLCK)  (flock.l_type != F_WRLCK))
-   goto out;
 
error = flock_to_posix_lock(filp, file_lock, flock);
if (error)
goto out;
 
+   error = -EINVAL;
+   if ((file_lock.fl_type != F_RDLCK)  (file_lock.fl_type != F_WRLCK))
+   goto out;
+
error = vfs_test_lock(filp, file_lock);
if (error)
goto out;
@@ -2073,14 +2074,15 @@ int fcntl_getlk64(struct file *filp, struct flock64 
__user *l)
error = -EFAULT;
if (copy_from_user(flock, l, sizeof(flock)))
goto out;
-   error = -EINVAL;
-   if ((flock.l_type != F_RDLCK)  (flock.l_type != F_WRLCK))
-   goto out;
 
error = flock64_to_posix_lock(filp, file_lock, flock);
if (error)
goto out;
 
+   error = -EINVAL;
+   if ((file_lock.fl_type != F_RDLCK)  (file_lock.fl_type != F_WRLCK))
+   goto out;
+
error = vfs_test_lock(filp, file_lock);
if (error)
goto out;
-- 
1.8.4.2

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v4 05/13] locks: eliminate BUG() call when there's an unexpected lock on file close

2013-12-19 Thread Jeff Layton
A leftover lock on the list is surely a sign of a problem of some sort,
but it's not necessarily a reason to panic the box. Instead, just log a
warning with some info about the lock, and then delete it like we would
any other lock.

In the event that the filesystem declares a -lock f_op, we may end up
leaking something, but that's generally preferable to an immediate
panic.

Signed-off-by: Jeff Layton jlay...@redhat.com
---
 fs/locks.c | 24 ++--
 1 file changed, 18 insertions(+), 6 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index 6084f5a..dd30933 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -2281,16 +2281,28 @@ void locks_remove_flock(struct file *filp)
 
while ((fl = *before) != NULL) {
if (fl-fl_file == filp) {
-   if (IS_FLOCK(fl)) {
-   locks_delete_lock(before);
-   continue;
-   }
if (IS_LEASE(fl)) {
lease_modify(before, F_UNLCK);
continue;
}
-   /* What? */
-   BUG();
+
+   /*
+* There's a leftover lock on the list of a type that
+* we didn't expect to see. Most likely a classic
+* POSIX lock that ended up not getting released
+* properly, or that raced onto the list somehow. Log
+* some info about it and then just remove it from
+* the list.
+*/
+   WARN(!IS_FLOCK(fl),
+   leftover lock: dev=%u:%u ino=%lu type=%hhd 
flags=0x%x start=%lld end=%lld\n,
+   MAJOR(inode-i_sb-s_dev),
+   MINOR(inode-i_sb-s_dev), inode-i_ino,
+   fl-fl_type, fl-fl_flags,
+   fl-fl_start, fl-fl_end);
+
+   locks_delete_lock(before);
+   continue;
}
before = fl-fl_next;
}
-- 
1.8.4.2

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v4 06/13] locks: consolidate common code in the flock_to_posix_lock routines

2013-12-19 Thread Jeff Layton
From: J. Bruce Fields bfie...@fieldses.org

Currently, there's a lot of copy and paste between the two.  Add some
functions to do the initialization of the file_lock from values
passed in, and turn the flock/flock64 variants of those functions into
wrappers around them.

Signed-off-by: Jeff Layton jlay...@redhat.com
Signed-off-by: J. Bruce Fields bfie...@fieldses.org
---
 fs/locks.c   | 112 +++
 include/uapi/asm-generic/fcntl.h |   3 --
 2 files changed, 44 insertions(+), 71 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index dd30933..863f4df 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -344,48 +344,42 @@ static int assign_type(struct file_lock *fl, long type)
return 0;
 }
 
-/* Verify a struct flock and copy it to a struct file_lock as a POSIX
- * style lock.
- */
-static int flock_to_posix_lock(struct file *filp, struct file_lock *fl,
-  struct flock *l)
+static int flock_to_posix_lock_common(struct file *filp, struct file_lock *fl,
+   struct flock64 *l, loff_t offset_max)
 {
-   off_t start, end;
-
switch (l-l_whence) {
case SEEK_SET:
-   start = 0;
+   fl-fl_start = 0;
break;
case SEEK_CUR:
-   start = filp-f_pos;
+   fl-fl_start = filp-f_pos;
break;
case SEEK_END:
-   start = i_size_read(file_inode(filp));
+   fl-fl_start = i_size_read(file_inode(filp));
break;
default:
return -EINVAL;
}
+   if (l-l_start  offset_max - fl-fl_start)
+   return -EOVERFLOW;
+   fl-fl_start += l-l_start;
+   if (fl-fl_start  0)
+   return -EINVAL;
+   if (l-l_len  offset_max - fl-fl_start)
+   return -EOVERFLOW;
+   if (fl-fl_start + l-l_len  0)
+   return -EINVAL;
 
/* POSIX-1996 leaves the case l-l_len  0 undefined;
   POSIX-2001 defines it. */
-   start += l-l_start;
-   if (start  0)
-   return -EINVAL;
-   fl-fl_end = OFFSET_MAX;
-   if (l-l_len  0) {
-   end = start + l-l_len - 1;
-   fl-fl_end = end;
-   } else if (l-l_len  0) {
-   end = start - 1;
-   fl-fl_end = end;
-   start += l-l_len;
-   if (start  0)
-   return -EINVAL;
-   }
-   fl-fl_start = start;   /* we record the absolute position */
-   if (fl-fl_end  fl-fl_start)
-   return -EOVERFLOW;
-   
+   if (l-l_len  0)
+   fl-fl_end = fl-fl_start + l-l_len - 1;
+   else if (l-l_len  0) {
+   fl-fl_end = fl-fl_start - 1;
+   fl-fl_start += l-l_len;
+   } else
+   fl-fl_end = OFFSET_MAX;
+
fl-fl_owner = current-files;
fl-fl_pid = current-tgid;
fl-fl_file = filp;
@@ -396,50 +390,32 @@ static int flock_to_posix_lock(struct file *filp, struct 
file_lock *fl,
return assign_type(fl, l-l_type);
 }
 
+/* Verify a struct flock and copy it to a struct file_lock as a POSIX
+ * style lock.
+ */
+static int flock_to_posix_lock(struct file *filp, struct file_lock *fl,
+  struct flock *l)
+{
+   struct flock64 ll = {
+   .l_type = l-l_type,
+   .l_whence = l-l_whence,
+   .l_start = l-l_start,
+   .l_len = l-l_len,
+   };
+
+   /*
+* The use of OFFT_OFFSET_MAX here ensures we return -EOVERFLOW
+* if the start or end of the lock could not be represented as
+* an off_t, following SUSv3.
+*/
+   return flock_to_posix_lock_common(filp, fl, ll, OFFT_OFFSET_MAX);
+}
+
 #if BITS_PER_LONG == 32
 static int flock64_to_posix_lock(struct file *filp, struct file_lock *fl,
 struct flock64 *l)
 {
-   loff_t start;
-
-   switch (l-l_whence) {
-   case SEEK_SET:
-   start = 0;
-   break;
-   case SEEK_CUR:
-   start = filp-f_pos;
-   break;
-   case SEEK_END:
-   start = i_size_read(file_inode(filp));
-   break;
-   default:
-   return -EINVAL;
-   }
-
-   start += l-l_start;
-   if (start  0)
-   return -EINVAL;
-   fl-fl_end = OFFSET_MAX;
-   if (l-l_len  0) {
-   fl-fl_end = start + l-l_len - 1;
-   } else if (l-l_len  0) {
-   fl-fl_end = start - 1;
-   start += l-l_len;
-   if (start  0)
-   return -EINVAL;
-   }
-   fl-fl_start = start;   /* we record the absolute position */
-   if (fl-fl_end  fl-fl_start)
-   return -EOVERFLOW;
-   
-   fl-fl_owner = current-files;
-   fl-fl_pid = current-tgid;
-   fl-fl_file = filp;
-   fl-fl_flags

[PATCH v4 07/13] locks: simplify overflow checking

2013-12-19 Thread Jeff Layton
From: J. Bruce Fields bfie...@fieldses.org

Or maybe we don't actually care about indicating overflow in the 32-bit
case: sure we could fail if e.g. f_pos+start or f_pos+start+len would
exceed 32-bits, but do we really need to?

Signed-off-by: J. Bruce Fields bfie...@redhat.com
---
 fs/locks.c | 23 +--
 1 file changed, 5 insertions(+), 18 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index 863f4df..1555b05 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -344,8 +344,8 @@ static int assign_type(struct file_lock *fl, long type)
return 0;
 }
 
-static int flock_to_posix_lock_common(struct file *filp, struct file_lock *fl,
-   struct flock64 *l, loff_t offset_max)
+static int flock64_to_posix_lock(struct file *filp, struct file_lock *fl,
+struct flock64 *l)
 {
switch (l-l_whence) {
case SEEK_SET:
@@ -360,12 +360,12 @@ static int flock_to_posix_lock_common(struct file *filp, 
struct file_lock *fl,
default:
return -EINVAL;
}
-   if (l-l_start  offset_max - fl-fl_start)
+   if (l-l_start  OFFSET_MAX - fl-fl_start)
return -EOVERFLOW;
fl-fl_start += l-l_start;
if (fl-fl_start  0)
return -EINVAL;
-   if (l-l_len  offset_max - fl-fl_start)
+   if (l-l_len  OFFSET_MAX - fl-fl_start)
return -EOVERFLOW;
if (fl-fl_start + l-l_len  0)
return -EINVAL;
@@ -403,22 +403,9 @@ static int flock_to_posix_lock(struct file *filp, struct 
file_lock *fl,
.l_len = l-l_len,
};
 
-   /*
-* The use of OFFT_OFFSET_MAX here ensures we return -EOVERFLOW
-* if the start or end of the lock could not be represented as
-* an off_t, following SUSv3.
-*/
-   return flock_to_posix_lock_common(filp, fl, ll, OFFT_OFFSET_MAX);
+   return flock64_to_posix_lock(filp, fl, ll);
 }
 
-#if BITS_PER_LONG == 32
-static int flock64_to_posix_lock(struct file *filp, struct file_lock *fl,
-struct flock64 *l)
-{
-   return flock_to_posix_lock_common(filp, fl, l, OFFSET_MAX);
-}
-#endif
-
 /* default lease lock manager operations */
 static void lease_break_callback(struct file_lock *fl)
 {
-- 
1.8.4.2

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v4 03/13] locks: remove inline qualifier from fl_link manipulation functions

2013-12-19 Thread Jeff Layton
It's best to let the compiler decide that.

Reported-by: Stephen Rothwell s...@canb.auug.org.au
Signed-off-by: Jeff Layton jlay...@redhat.com
---
 fs/locks.c | 12 
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index 5e28612..049a144 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -511,8 +511,7 @@ static int posix_same_owner(struct file_lock *fl1, struct 
file_lock *fl2)
 }
 
 /* Must be called with the i_lock held! */
-static inline void
-locks_insert_global_locks(struct file_lock *fl)
+static void locks_insert_global_locks(struct file_lock *fl)
 {
lg_local_lock(file_lock_lglock);
fl-fl_link_cpu = smp_processor_id();
@@ -521,8 +520,7 @@ locks_insert_global_locks(struct file_lock *fl)
 }
 
 /* Must be called with the i_lock held! */
-static inline void
-locks_delete_global_locks(struct file_lock *fl)
+static void locks_delete_global_locks(struct file_lock *fl)
 {
/*
 * Avoid taking lock if already unhashed. This is safe since this check
@@ -544,14 +542,12 @@ posix_owner_key(struct file_lock *fl)
return (unsigned long)fl-fl_owner;
 }
 
-static inline void
-locks_insert_global_blocked(struct file_lock *waiter)
+static void locks_insert_global_blocked(struct file_lock *waiter)
 {
hash_add(blocked_hash, waiter-fl_link, posix_owner_key(waiter));
 }
 
-static inline void
-locks_delete_global_blocked(struct file_lock *waiter)
+static void locks_delete_global_blocked(struct file_lock *waiter)
 {
hash_del(waiter-fl_link);
 }
-- 
1.8.4.2

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v4 02/13] locks: clean up comment typo

2013-12-19 Thread Jeff Layton
Signed-off-by: Jeff Layton jlay...@redhat.com
---
 fs/locks.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/locks.c b/fs/locks.c
index 2cfeea6..5e28612 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -581,7 +581,7 @@ static void locks_delete_block(struct file_lock *waiter)
  * it seems like the reasonable thing to do.
  *
  * Must be called with both the i_lock and blocked_lock_lock held. The fl_block
- * list itself is protected by the file_lock_list, but by ensuring that the
+ * list itself is protected by the blocked_lock_lock, but by ensuring that the
  * i_lock is also held on insertions we can avoid taking the blocked_lock_lock
  * in some cases when we see that the fl_block list is empty.
  */
-- 
1.8.4.2

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] Makefile: fix order of INITRD_COMPRESS-y assignments

2013-12-20 Thread Jeff Layton
Commit 7ac181568342ec (fix build with make 3.80) changed how the
Makefile handles the INITRD_COMPRESS-y assignment in order to make it
work properly with older versions of make. Unfortunately, it also
changes the behavior of how that assignment works when multiple
CONFIG_RD_* options are set, such that they are now preferred in the
reverse order of the way they were before.

Reverse the order of assignments to try and preserve the earlier
precedence in this situation.

Cc: Jan Beulich jbeul...@suse.com
Cc: P J P ppan...@redhat.com
Cc: Andrew Morton a...@linux-foundation.org
Signed-off-by: Jeff Layton jlay...@redhat.com
---
 Makefile | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/Makefile b/Makefile
index 89cee86..fb76899 100644
--- a/Makefile
+++ b/Makefile
@@ -733,11 +733,11 @@ export mod_strip_cmd
 # This shall be used by the dracut(8) tool while creating an initramfs image.
 #
 INITRD_COMPRESS-y  := gzip
-INITRD_COMPRESS-$(CONFIG_RD_BZIP2) := bzip2
-INITRD_COMPRESS-$(CONFIG_RD_LZMA)  := lzma
-INITRD_COMPRESS-$(CONFIG_RD_XZ):= xz
-INITRD_COMPRESS-$(CONFIG_RD_LZO)   := lzo
 INITRD_COMPRESS-$(CONFIG_RD_LZ4)   := lz4
+INITRD_COMPRESS-$(CONFIG_RD_LZO)   := lzo
+INITRD_COMPRESS-$(CONFIG_RD_XZ):= xz
+INITRD_COMPRESS-$(CONFIG_RD_LZMA)  := lzma
+INITRD_COMPRESS-$(CONFIG_RD_BZIP2) := bzip2
 export INITRD_COMPRESS := $(INITRD_COMPRESS-y)
 
 ifdef CONFIG_MODULE_SIG_ALL
-- 
1.8.4.2

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Broken initrd compression settings in 3.13

2013-12-20 Thread Jeff Layton
On Fri, 20 Dec 2013 18:15:29 -0800
Linus Torvalds torva...@linux-foundation.org wrote:

 On Fri, Dec 20, 2013 at 5:39 PM, Andrew Morton
 a...@linux-foundation.org wrote:
  On Fri, 20 Dec 2013 16:41:43 -0800 Linus Torvalds 
  torva...@linux-foundation.org wrote:
 
   (a) most sane people don't even have lz4 _installed_, so dracut won't
  actually succeed
 
   (b) there's no way to select the compression level (unlike the
  INITRAMFS_COMPRESSION thing that actually has a choice)
 
   (c) even if you *do* have lz4, it doesn't actually work, because
  while that causes the new F20 dracut to compress the initramfs with
  lz4, the end result is completely broken, because the F20 lsinitrd
  scripts don't understand the end result, so now the whole kernel
  install fails.
 
  (a) and (b) are very much kernel bugs.
 
  Jeff sent the below this morning.  Will that fix (a)?
 
 Yes, it fixes (a), at least to some degree, in that at least
 defaulting to bzip2 is a lot more sane than defaulting to lz4. I
 suspect most everybody has bzip2 installed. And at least on my current
 F20 install, it looks like lsinitrd understands to use zcat, bzcat or
 xzcat on the resulting initrd image (and bzcat does that bzip2
 decoding).
 
 So I think Jeff's patch at least fixes the symptoms.
 
 That said, I think it does nothing *but* fix the symptoms, and we're
 actually still better off with the 3.12 behavior which was to never
 set INITRD_COMPRESS at all. Because quite frankly, there's currently
 no way for the kernel to know what the right compressor is. bz2 may
 well work, but can you guarantee it? I certainly can't..
 
 Now, if we asked the user, that would be a different thing. But right
 now we very much don't ask the user, and we just pick one at random.
 
 We're better off not picking a compression method at all, at which
 point the distro installkernel will do whatever the distro does.
 
 Linus

Perhaps a better solution for this would be to instead export an
env var with a list of the compression algorithms that the kernel
supports. Then installkernel or dracut could use that info to make a
semi-intelligent decision based on that and what tools are installed.

...or maybe a separate env var for each one that it supports:

$INITRD_COMPRESS_LZ4
$INITRD_COMPRESS_BZIP2
$INITRD_COMPRESS_GZIP

...etc.

-- 
Jeff Layton jlay...@redhat.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Broken initrd compression settings in 3.13

2013-12-20 Thread Jeff Layton
On Fri, 20 Dec 2013 18:15:29 -0800
Linus Torvalds torva...@linux-foundation.org wrote:

 On Fri, Dec 20, 2013 at 5:39 PM, Andrew Morton
 a...@linux-foundation.org wrote:
  On Fri, 20 Dec 2013 16:41:43 -0800 Linus Torvalds 
  torva...@linux-foundation.org wrote:
 
   (a) most sane people don't even have lz4 _installed_, so dracut won't
  actually succeed
 
   (b) there's no way to select the compression level (unlike the
  INITRAMFS_COMPRESSION thing that actually has a choice)
 
   (c) even if you *do* have lz4, it doesn't actually work, because
  while that causes the new F20 dracut to compress the initramfs with
  lz4, the end result is completely broken, because the F20 lsinitrd
  scripts don't understand the end result, so now the whole kernel
  install fails.
 
  (a) and (b) are very much kernel bugs.
 
  Jeff sent the below this morning.  Will that fix (a)?
 
 Yes, it fixes (a), at least to some degree, in that at least
 defaulting to bzip2 is a lot more sane than defaulting to lz4. I
 suspect most everybody has bzip2 installed. And at least on my current
 F20 install, it looks like lsinitrd understands to use zcat, bzcat or
 xzcat on the resulting initrd image (and bzcat does that bzip2
 decoding).
 
 So I think Jeff's patch at least fixes the symptoms.
 
 That said, I think it does nothing *but* fix the symptoms, and we're
 actually still better off with the 3.12 behavior which was to never
 set INITRD_COMPRESS at all. Because quite frankly, there's currently
 no way for the kernel to know what the right compressor is. bz2 may
 well work, but can you guarantee it? I certainly can't..
 
 Now, if we asked the user, that would be a different thing. But right
 now we very much don't ask the user, and we just pick one at random.
 
 We're better off not picking a compression method at all, at which
 point the distro installkernel will do whatever the distro does.
 
 Linus

That works for me, you can drop my patch...

But, what's the point of setting INITRD_COMPRESS-y at all if you're not
going to export INITRD_COMPRESS? Would it be better to just remove that
entire block in the Makefile (i.e. just revert PJP's patch?)

-- 
Jeff Layton jlay...@redhat.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] mm: fix page leak at nfs_symlink()

2014-02-07 Thread Jeff Layton
On Fri,  7 Feb 2014 13:19:54 -0200
Rafael Aquini aqu...@redhat.com wrote:

 Changes committed by a0b8cab3 mm: remove lru parameter from
 __pagevec_lru_add and remove parts of pagevec API have introduced
 a call to add_to_page_cache_lru() which causes a leak in nfs_symlink() 
 as now the page gets an extra refcount that is not dropped.
 
 Jan Stancek observed and reported the leak effect while running test8 from
 Connectathon Testsuite. After several iterations over the test case,
 which creates several symlinks on a NFS mountpoint, the test system was
 quickly getting into an out-of-memory scenario.
 
 This patch fixes the page leak by dropping that extra refcount 
 add_to_page_cache_lru() is grabbing. 
 
 Signed-off-by: Jan Stancek jstan...@redhat.com
 Signed-off-by: Rafael Aquini aqu...@redhat.com
 ---
  fs/nfs/dir.c | 5 +
  1 file changed, 5 insertions(+)
 
 diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
 index be38b57..4a48fe4 100644
 --- a/fs/nfs/dir.c
 +++ b/fs/nfs/dir.c
 @@ -1846,6 +1846,11 @@ int nfs_symlink(struct inode *dir, struct dentry 
 *dentry, const char *symname)
   GFP_KERNEL)) {
   SetPageUptodate(page);
   unlock_page(page);
 + /*
 +  * add_to_page_cache_lru() grabs an extra page refcount.
 +  * Drop it here to avoid leaking this page later.
 +  */
 + page_cache_release(page);
   } else
   __free_page(page);
  

Looks reasonable as an interim fix and should almost certainly go to
stable.

Longer term, I think it would be best from an API standpoint to fix
add_to_page_cache_lru not to take this extra reference (or to have it
drop it itself) and fix up the callers accordingly. That seems like a
trap for the unwary...

-- 
Jeff Layton jlay...@redhat.com
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH resend #7] fcntl-linux.h: add new definitions and manual updates for open file description locks

2014-07-30 Thread Jeff Layton
On Tue, 29 Jul 2014 23:30:36 -0400
Carlos O'Donell car...@redhat.com wrote:

 On 07/23/2014 02:21 PM, Jeff Layton wrote:
  From: Jeff Layton jlay...@poochiereds.net
 
 Thanks for resending. Sorry for the delay.
 
 Your use of 3 different emails caused me to miss the recent
 resends. That's my fault and tied to the way I'm tracking
 everything from patchwork using the first email you used.
 

Sorry about that. I was hoping to get this wrapped up before changing
jobs, but that obviously didn't happen...

 I am assuming that this work is under Red Hat's copyright
 status since you submitted it originally from @redhat.com.
 

Yeah, that should be fine. This work was all done under the aegis of
Red Hat. The small tweaks you mention below are also OK. Thanks for
getting this merged!


 I've committed this to trunk with small tweaks and the following
 NEWS entry:
 
 * Support for file description locks is added to systems running the
   Linux kernel. The standard file locking interfaces are extended to
   operate on file descriptions, not file descriptors, via the use of
   F_OFD_GETLK, F_OFD_SETLK, and F_OFD_SETLKW. File description locks
   are associated with an open file instead of a process.
 
 This will be in 2.20 when we cut the branch.
  
  Open file description locks have been merged into the Linux kernel
  for v3.15.  Add the appropriate command-value definitions and an
  update to the manual that describes their usage.
  
  ChangeLog:
  
  2014-04-24  Jeff Layton  jlay...@poochiereds.net
  
  [BZ#16839]
  * manual/llio.texi: add section about open file description
  locks
  
  * manual/examples/ofdlocks.c:
example of open file description lock usage
  
  * sysdeps/unix/sysv/linux/bits/fcntl-linux.h:
(F_OFD_GETLK, F_OFD_SETLK, F_OFD_SETLKW): New macros.
 
 I simplified it a bit.
 
 A space in the ChangeLog indicates a distinct commit, which this is
 not, so I lump them together.
 
 This is what went in:
 
 2014-07-29  Jeff Layton  jlay...@poochiereds.net
 
 [BZ #16839]
 * manual/llio.texi: Add section about open file description
 locks.
 * manual/examples/ofdlocks.c: Example of open file description
 lock usage.
 * sysdeps/unix/sysv/linux/bits/fcntl-linux.h: Define
 F_OFD_GETLK, F_OFD_SETLK, and F_OFD_SETLKW.
 
 As the committer I added #16839 to the fixed bug list following:
 
 https://sourceware.org/glibc/wiki/Committer%20checklist
 
  ---
   manual/examples/ofdlocks.c |  77 +
   manual/llio.texi   | 241
  -
  sysdeps/unix/sysv/linux/bits/fcntl-linux.h |  17 ++ 3 files
  changed, 332 insertions(+), 3 deletions(-) create mode 100644
  manual/examples/ofdlocks.c
  
  diff --git a/manual/examples/ofdlocks.c b/manual/examples/ofdlocks.c
  new file mode 100644
  index ..85e193cdabe6
  --- /dev/null
  +++ b/manual/examples/ofdlocks.c
  @@ -0,0 +1,77 @@
  +/* Open File Description Locks Usage Example
  +   Copyright (C) 1991-2014 Free Software Foundation, Inc.
  +
  +   This program is free software; you can redistribute it and/or
  +   modify it under the terms of the GNU General Public License
  +   as published by the Free Software Foundation; either version 2
  +   of the License, or (at your option) any later version.
  +
  +   This program is distributed in the hope that it will be useful,
  +   but WITHOUT ANY WARRANTY; without even the implied warranty of
  +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  +   GNU General Public License for more details.
  +
  +   You should have received a copy of the GNU General Public
  License
  +   along with this program; if not, see
  http://www.gnu.org/licenses/. +*/
  +
  +#define _GNU_SOURCE
  +#include stdio.h
  +#include sys/types.h
  +#include sys/stat.h
  +#include unistd.h
  +#include fcntl.h
  +#include pthread.h
  +
  +#define FILENAME   /tmp/foo
  +#define NUM_THREADS3
  +#define ITERATIONS 5
  +
  +void *
  +thread_start (void *arg)
  +{
  +  int i, fd, len;
  +  long tid = (long) arg;
  +  char buf[256];
  +  struct flock lck = {
  +.l_whence = SEEK_SET,
  +.l_start = 0,
  +.l_len = 1,
  +  };
  +
  +  fd = open (/tmp/foo, O_RDWR | O_CREAT, 0666);
  +
  +  for (i = 0; i  ITERATIONS; i++)
  +{
  +  lck.l_type = F_WRLCK;
  +  fcntl (fd, F_OFD_SETLKW, lck);
  +
  +  len = sprintf (buf, %d: tid=%ld fd=%d\n, i, tid, fd);
  +
  +  lseek (fd, 0, SEEK_END);
  +  write (fd, buf, len);
  +  fsync (fd);
  +
  +  lck.l_type = F_UNLCK;
  +  fcntl (fd, F_OFD_SETLK, lck);
  +
  +  /* sleep to ensure lock is yielded to another thread */
  +  usleep (1);
  +}
  +  pthread_exit (NULL);
  +}
  +
  +int
  +main (int argc, char **argv)
  +{
  +  long i;
  +  pthread_t threads[NUM_THREADS];
  +
  +  truncate (FILENAME, 0);
  +
  +  for (i = 0; i  NUM_THREADS; i++)
  +pthread_create (threads[i], NULL, thread_start, (void

[PATCH v2] net: clean up some sparse endianness warnings in ipv6.h

2014-07-16 Thread Jeff Layton
sparse is throwing warnings when building sunrpc modules due to some
endianness shenanigans in ipv6.h. Specifically:

  CHECK   net/sunrpc/addr.c
include/net/ipv6.h:573:17: warning: restricted __be64 degrades to integer
include/net/ipv6.h:577:34: warning: restricted __be32 degrades to integer
include/net/ipv6.h:573:17: warning: restricted __be64 degrades to integer
include/net/ipv6.h:577:34: warning: restricted __be32 degrades to integer

Sprinkle some endianness fixups to silence them. These should all get
fixed up at compile time, so I don't think this will add any extra work
to be done at runtime.

Signed-off-by: Jeff Layton jlay...@primarydata.com
---
 include/net/ipv6.h | 17 +++--
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 574337fe72dd..0535da61fe0b 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -557,24 +557,29 @@ static inline u32 __ipv6_addr_jhash(const struct in6_addr 
*a, const u32 initval)
 static inline bool ipv6_addr_loopback(const struct in6_addr *a)
 {
 #if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)  BITS_PER_LONG == 64
-   const unsigned long *ul = (const unsigned long *)a;
+   const __be64 *be = (const __be64 *)a;
 
-   return (ul[0] | (ul[1] ^ cpu_to_be64(1))) == 0UL;
+   return (be[0] | (be[1] ^ cpu_to_be64(1))) == 0UL;
 #else
return (a-s6_addr32[0] | a-s6_addr32[1] |
-   a-s6_addr32[2] | (a-s6_addr32[3] ^ htonl(1))) == 0;
+   a-s6_addr32[2] | (a-s6_addr32[3] ^ cpu_to_be32(1))) == 0;
 #endif
 }
 
+/*
+ * Note that we must __force cast these to unsigned long to make sparse happy,
+ * since all of the endian-annotated types are fixed size regardless of arch.
+ */
 static inline bool ipv6_addr_v4mapped(const struct in6_addr *a)
 {
return (
 #if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)  BITS_PER_LONG == 64
-   *(__be64 *)a |
+   *(unsigned long *)a |
 #else
-   (a-s6_addr32[0] | a-s6_addr32[1]) |
+   (__force unsigned long)(a-s6_addr32[0] | a-s6_addr32[1]) |
 #endif
-   (a-s6_addr32[2] ^ htonl(0x))) == 0UL;
+   (__force unsigned long)(a-s6_addr32[2] ^
+   cpu_to_be32(0x))) == 0UL;
 }
 
 /*
-- 
1.9.3

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Killing process in D state on mount to dead NFS server. (when process is in fsync)

2014-08-01 Thread Jeff Layton
 v1.4.13 (GNU/Linux)
  Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/
  
  iQEcBAEBAgAGBQJT2rLiAAoJELbHqkYeJT4OqPgH/0taKW6Be90c1mETZf9yeqZF
  YMLZk8XC2wloEd9nVz//mXREmiu18Hc+5p7Upd4Os21J2P4PBMGV6P/9DMxxehwH
  YX1HKha0EoAsbO5ILQhbLf83cRXAPEpvJPgYHrq6xjlKB8Q8OxxND37rY7kl19Zz
  sdAw6GiqHICF3Hq1ATa/jvixMluDnhER9Dln3wOdAGzmmuFYqpTsV4EwzbKKqInJ
  6C15q+cq/9aYh6usN6z2qJhbHgqM9EWcPL6jOrCwX4PbC1XjKHekpFN0t9oKQClx
  qSPuweMQ7fP4IBd2Ke8L/QlyOVblAKSE7t+NdrjfzLmYPzyHTyfLABR/BI053to=
  =/9FJ
  -END PGP SIGNATURE-
 


-- 
Jeff Layton jlay...@poochiereds.net


signature.asc
Description: PGP signature


Re: Killing process in D state on mount to dead NFS server. (when process is in fsync)

2014-08-01 Thread Jeff Layton
On Fri, 1 Aug 2014 20:50:13 -0500
Roger Heflin rogerhef...@gmail.com wrote:

 Doesn't NFS have an intr flag to allow kill -9 to work?   Whenever I
 have had that set it has appeared to work after about 30 seconds or
 so...without that kill -9 does not work when the nfs server is
 missing.
 
 

Not anymore. That mount option has been deprecated (and ignored) for
years. The code in the RPC engine will generally give up in the face of
fatal signals. In this case though, we're in uninterruptible sleep in
the bowels of the writeback code.

The problem here is not really specific to NFS, per-se -- it just
happens to be the filesystem that most people notice it on.

 
 On Fri, Aug 1, 2014 at 8:21 PM, Jeff Layton jlay...@poochiereds.net wrote:
  On Fri, 1 Aug 2014 07:50:53 +1000
  NeilBrown ne...@suse.de wrote:
 
  On Thu, 31 Jul 2014 14:20:07 -0700 Ben Greear gree...@candelatech.com 
  wrote:
 
   -BEGIN PGP SIGNED MESSAGE-
   Hash: SHA1
  
   On 07/31/2014 01:42 PM, NeilBrown wrote:
On Thu, 31 Jul 2014 11:00:35 -0700 Ben Greear 
gree...@candelatech.com wrote:
   
So, this has been asked all over the interweb for years and years, 
but the best answer I can find is to reboot the system or create a 
fake NFS server
somewhere with the same IP as the gone-away NFS server.
   
The problem is:
   
I have some mounts to an NFS server that no longer exists 
(crashed/powered down).
   
I have some processes stuck trying to write to files open on these 
mounts.
   
I want to kill the process and unmount.
   
umount -l will make the mount go a way, sort of.  But process is 
still hung. umount -f complains: umount2:  Device or resource busy 
umount.nfs: /mnt/foo:
device is busy
   
kill -9 does not work on process.
   
Kill -1 should work (since about 2.6.25 or so).
  
   That is -[ONE], right?  Assuming so, it did not work for me.
 
  No, it was -9  sorry, I really shouldn't be let out without my proof
  reader.
 
  However the 'stack' is sufficient to see what is going on.
 
  The problem is that it is blocked inside the VM well away from NFS and
  there is no way for NFS to say give up and go home.
 
  I'd suggest that is a bug.   I cannot see any justification for fsync to 
  not
  be killable.
  It wouldn't be too hard to create a patch to make it so.
  It would be a little harder to examine all call paths and create a
  convincing case that the patch was safe.
  It might be herculean task to convince others that it was the right thing
  to do so let's start with that one.
 
  Hi Linux-mm and fs-devel people.  What do people think of making fsync 
  and
  variants KILLABLE ??
 
  I probably only need a little bit of encouragement to write a patch
 
  Thanks,
  NeilBrown
 
 
 
  It would be good to fix this in some fashion once and for all, and the
  wait_on_page_writeback wait is a major source of pain for a lot of
  people.
 
  So to summarize...
 
  The problem in a nutshell is that Ben has some cached writes to the
  NFS server, but the server has gone away (presumably forever). The
  question is -- how do we communicate to the kernel that that server
  isn't coming back and that those dirty pages should be invalidated so
  that we can umount the filesystem?
 
  Allowing fsync/close to be killable sounds reasonable to me as at least
  a partial solution. Both close(2) and fsync(2) are allowed to return
  EINTR according to the POSIX spec. Allowing a kill -9 there seems
  like it should be fine, and maybe we ought to even consider letting it
  be susceptible to lesser signals.
 
  That still leaves some open questions though...
 
  Is that enough to fix it? You'd still have the dirty pages lingering
  around, right? Would a umount -f presumably work at that point?
 
  
   Kernel is 3.14.4+, with some of extra patches, but probably nothing that
   influences this particular behaviour.
  
   [root@lf1005-14010010 ~]# cat /proc/3805/stack
   [811371ba] sleep_on_page+0x9/0xd
   [8113738e] wait_on_page_bit+0x71/0x78
   [8113769a] filemap_fdatawait_range+0xa2/0x16d
   [8113780e] filemap_write_and_wait_range+0x3b/0x77
   [a0f04734] nfs_file_fsync+0x37/0x83 [nfs]
   [811a8d32] vfs_fsync_range+0x19/0x1b
   [811a8d4b] vfs_fsync+0x17/0x19
   [a0f05305] nfs_file_flush+0x6b/0x6f [nfs]
   [81183e46] filp_close+0x3f/0x71
   [8119c8ae] __close_fd+0x80/0x98
   [81183de5] SyS_close+0x1c/0x3e
   [815c55f9] system_call_fastpath+0x16/0x1b
   [] 0x
   [root@lf1005-14010010 ~]# kill -1 3805
   [root@lf1005-14010010 ~]# cat /proc/3805/stack
   [811371ba] sleep_on_page+0x9/0xd
   [8113738e] wait_on_page_bit+0x71/0x78
   [8113769a] filemap_fdatawait_range+0xa2/0x16d
   [8113780e] filemap_write_and_wait_range+0x3b/0x77
   [a0f04734] nfs_file_fsync+0x37/0x83 [nfs]
   [811a8d32

[PATCH 0/5] locks: move most locks_release_private calls outside of i_lock

2014-08-12 Thread Jeff Layton
In the days of yore, the file locking code was primarily protected by
the BKL. That changed in commit 72f98e72551fa (locks: turn lock_flocks
into a spinlock), at which point the code was changed to be protected
by a conventional spinlock (mostly due to a push to finally eliminate
the BKL). Since then, the code has been changed to use the i_lock
instead of a global spinlock, but it's still under a spinlock.

With that change, several functions now no longer can block when they
originally could. This is a particular problem with the
fl_release_private operation. In NFSv4, that operation is used to kick
off a RELEASE_LOCKOWNER or FREE_STATEID call, and that requires being
able to do an allocation.

This was reported by Josh Stone here:

https://bugzilla.redhat.com/show_bug.cgi?id=1089092

My initial stab at fixing this involved moving this to a workqueue, but
Trond pointed out the above change was technically a regression with the
way the spinlocking in the file locking code works, and suggested an
alternate approach to fixing it.

This set focuses on moving most of the locks_release_private calls
outside of the inode-i_lock. There are still a few that are done
under the i_lock in the lease handling code. Cleaning up the use of
the i_lock in the lease code is a larger project which we'll have to
tackle at some point, but there are some other cleanups that will
need to happen first.

Absent any objections, I'll plan to merge these for 3.18.

Jeff Layton (5):
  locks: don't call locks_release_private from locks_copy_lock
  locks: don't reuse file_lock in __posix_lock_file
  locks: defer freeing locks in locks_delete_lock until after i_lock has
been dropped
  locks: move locks_free_lock calls in do_fcntl_add_lease outside
spinlock
  locks: update Locking documentation to clarify fl_release_private
behavior

 Documentation/filesystems/Locking |  6 ++-
 fs/locks.c| 80 +--
 2 files changed, 57 insertions(+), 29 deletions(-)

-- 
1.9.3

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 5/5] locks: update Locking documentation to clarify fl_release_private behavior

2014-08-12 Thread Jeff Layton
Signed-off-by: Jeff Layton jlay...@primarydata.com
---
 Documentation/filesystems/Locking | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/Documentation/filesystems/Locking 
b/Documentation/filesystems/Locking
index b18dd1779029..f1997e9da61f 100644
--- a/Documentation/filesystems/Locking
+++ b/Documentation/filesystems/Locking
@@ -349,7 +349,11 @@ prototypes:
 locking rules:
inode-i_lock   may block
 fl_copy_lock:  yes no
-fl_release_private:maybe   no
+fl_release_private:maybe   maybe[1]
+
+[1]:   -fl_release_private for flock or POSIX locks is currently allowed
+to block. Leases however can still be freed while the i_lock is held and
+so fl_release_private called on a lease should not block.
 
 --- lock_manager_operations ---
 prototypes:
-- 
1.9.3

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 4/5] locks: move locks_free_lock calls in do_fcntl_add_lease outside spinlock

2014-08-12 Thread Jeff Layton
There's no need to call locks_free_lock here while still holding the
i_lock. Defer that until the lock has been dropped.

Signed-off-by: Jeff Layton jlay...@primarydata.com
---
 fs/locks.c | 16 +++-
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index 4ce087cca501..cb66fb05ad4a 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -1761,13 +1761,10 @@ static int do_fcntl_add_lease(unsigned int fd, struct 
file *filp, long arg)
ret = fl;
spin_lock(inode-i_lock);
error = __vfs_setlease(filp, arg, ret);
-   if (error) {
-   spin_unlock(inode-i_lock);
-   locks_free_lock(fl);
-   goto out_free_fasync;
-   }
-   if (ret != fl)
-   locks_free_lock(fl);
+   if (error)
+   goto out_unlock;
+   if (ret == fl)
+   fl = NULL;
 
/*
 * fasync_insert_entry() returns the old entry if any.
@@ -1779,9 +1776,10 @@ static int do_fcntl_add_lease(unsigned int fd, struct 
file *filp, long arg)
new = NULL;
 
error = __f_setown(filp, task_pid(current), PIDTYPE_PID, 0);
+out_unlock:
spin_unlock(inode-i_lock);
-
-out_free_fasync:
+   if (fl)
+   locks_free_lock(fl);
if (new)
fasync_free(new);
return error;
-- 
1.9.3

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


<    5   6   7   8   9   10   11   12   13   14   >