Re: [Ecryptfs-devel] [PATCH 3/11] eCryptfs: read_write.c routines

2007-09-25 Thread Michael Halcrow
On Fri, Sep 21, 2007 at 03:05:40PM -0700, Andrew Morton wrote:
> btw, I'm not really a great admirer of the whole patchset: it does
> some pretty nasty-looking things: allocating dynamic memory,
> grabbing the underlying pageframes with virt_to_page(), passing them
> back into kernel APIs which are supposed to be called from
> userspace, etc.  It's all rather ugly and abusive-looking.

Functions higher up the execution stack should be the ones mucking
with the Uptodate flag. The patch below addresses some of these
issues. I also whipped up a post-patch partial call graph to help
illustrate what is going on with the page mapping and Uptodate status
in the various eCryptfs read/write paths:

http://ecryptfs.sourceforge.net/ecryptfs-pageuptodate-call-graph.png

---

The functions that eventually call down to ecryptfs_read_lower(),
ecryptfs_decrypt_page(), and ecryptfs_copy_up_encrypted_with_header()
should have the responsibility of managing the page Uptodate
status. This patch gets rid of some of the ugliness that resulted from
trying to push some of the page flag setting too far down the stack.
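
As a rough illustration of this division of responsibility, here is a
userspace sketch in plain C. The names toy_page, toy_decrypt_page() and
toy_readpage() are hypothetical stand-ins and not eCryptfs functions; the
lower_read_fails field is only a test knob. The low-level helper reports
success or failure and never touches the flag, while the caller higher up
the stack is the only place that changes it.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; only the flag matters here. */
struct toy_page {
	bool uptodate;
	bool lower_read_fails;	/* test knob: simulate a lower-level error */
};

/* Low-level helper: returns 0 or -errno and leaves the flag alone. */
static int toy_decrypt_page(const struct toy_page *page)
{
	assert(page != NULL);
	return page->lower_read_fails ? -EIO : 0;
}

/* Caller higher up the stack: the only place the flag is changed. */
static int toy_readpage(struct toy_page *page)
{
	int rc = toy_decrypt_page(page);

	page->uptodate = (rc == 0);
	return rc;
}
```

With this split, every error path in the helper can simply jump to a
single exit label, which is the shape the crypto.c hunk below ends up with.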

Signed-off-by: Michael Halcrow [EMAIL PROTECTED]
---
diff --git a/fs/ecryptfs/crypto.c b/fs/ecryptfs/crypto.c
index b3795f6..bbec711 100644
--- a/fs/ecryptfs/crypto.c
+++ b/fs/ecryptfs/crypto.c
@@ -605,14 +605,14 @@ int ecryptfs_decrypt_page(struct page *page)
printk(KERN_ERR "%s: Error attempting to copy "
   "page at index [%ld]\n", __FUNCTION__,
   page->index);
-   goto out_clear_uptodate;
+   goto out;
}
enc_extent_virt = kmalloc(PAGE_CACHE_SIZE, GFP_USER);
if (!enc_extent_virt) {
rc = -ENOMEM;
ecryptfs_printk(KERN_ERR, "Error allocating memory for "
"encrypted extent\n");
-   goto out_clear_uptodate;
+   goto out;
}
enc_extent_page = virt_to_page(enc_extent_virt);
for (extent_offset = 0;
@@ -631,21 +631,17 @@ int ecryptfs_decrypt_page(struct page *page)
ecryptfs_printk(KERN_ERR, "Error attempting "
"to read lower page; rc = [%d]"
"\n", rc);
-   goto out_clear_uptodate;
+   goto out;
}
rc = ecryptfs_decrypt_extent(page, crypt_stat, enc_extent_page,
 extent_offset);
if (rc) {
printk(KERN_ERR "%s: Error encrypting extent; "
   "rc = [%d]\n", __FUNCTION__, rc);
-   goto out_clear_uptodate;
+   goto out;
}
extent_offset++;
}
-   SetPageUptodate(page);
-   goto out;
-out_clear_uptodate:
-   ClearPageUptodate(page);
 out:
kfree(enc_extent_virt);
return rc;
diff --git a/fs/ecryptfs/ecryptfs_kernel.h b/fs/ecryptfs/ecryptfs_kernel.h
index bb92b74..ce7a5d4 100644
--- a/fs/ecryptfs/ecryptfs_kernel.h
+++ b/fs/ecryptfs/ecryptfs_kernel.h
@@ -648,6 +648,6 @@ int ecryptfs_read_lower_page_segment(struct page *page_for_ecryptfs,
 struct inode *ecryptfs_inode);
 int ecryptfs_read(char *data, loff_t offset, size_t size,
  struct file *ecryptfs_file);
-struct page *ecryptfs_get1page(struct file *file, loff_t index);
+struct page *ecryptfs_get_locked_page(struct file *file, loff_t index);
 
 #endif /* #ifndef ECRYPTFS_KERNEL_H */
diff --git a/fs/ecryptfs/mmap.c b/fs/ecryptfs/mmap.c
index 4eb09c1..16a7a55 100644
--- a/fs/ecryptfs/mmap.c
+++ b/fs/ecryptfs/mmap.c
@@ -37,23 +37,27 @@
 struct kmem_cache *ecryptfs_lower_page_cache;
 
 /**
- * ecryptfs_get1page
+ * ecryptfs_get_locked_page
  *
  * Get one page from cache or lower f/s, return error otherwise.
  *
- * Returns unlocked and up-to-date page (if ok), with increased
+ * Returns locked and up-to-date page (if ok), with increased
  * refcnt.
  */
-struct page *ecryptfs_get1page(struct file *file, loff_t index)
+struct page *ecryptfs_get_locked_page(struct file *file, loff_t index)
 {
struct dentry *dentry;
struct inode *inode;
struct address_space *mapping;
+   struct page *page;
 
dentry = file->f_path.dentry;
inode = dentry->d_inode;
mapping = inode->i_mapping;
-   return read_mapping_page(mapping, index, (void *)file);
+   page = read_mapping_page(mapping, index, (void *)file);
+   if (!IS_ERR(page))
+   lock_page(page);
+   return page;
 }
 
 /**
@@ -146,12 +150,10 @@ ecryptfs_copy_up_encrypted_with_header(struct page *page,
kunmap_atomic(page_virt, KM_USER0);
flush_dcache_page(page);
if (rc) {
-   ClearPageUptodate(page);
 

[patch 0/4] 64k pagesize/blocksize fixes

2007-09-25 Thread Christoph Lameter
Attached are the fixes necessary to support 64k pagesize/blocksize. I think
these are useful independent of the large blocksize patchset since there are
architectures that support 64k page size and that could use these large buffer
sizes without the large buffersize patchset.

Are these patches in the right shape to be merged? I rediffed these against
2.6.23-rc8-mm1.

I had to fix some things in the second patch (ext2) that may need some review
since the way that commits work changed.

-- 
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[patch 1/4] Increase limits for 64k page size support for Ext2/3/4

2007-09-25 Thread Christoph Lameter
[This patch allows architectures that use 64k blocksizes--like IA64 and
PPC64--to use 64k blocks on ext filesystems]

The patches to support blocksizes up to PAGESIZE, max 64KB for ext2/3/4,
were originally from Takashi Sato.
http://marc.info/?l=linux-ext4&m=115768873518400&w=2

It's quite simple to support large block size in ext2/3/4, mostly just
enlarge the block size limit.  But it is NOT possible to have 64kB
blocksize on ext2/3/4 without some changes to the directory handling
code.  The reason is that an empty 64kB directory block would have a
rec_len == (__u16)2^16 == 0, and this would cause an error to be hit in
the filesystem.  The proposed solution is to put 2 empty records in such
a directory, or to special-case an impossible value like rec_len =
0x to handle this.
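
The overflow described above can be reproduced in userspace. The sketch
below is only an illustration under stated assumptions: the toy_* names
are hypothetical, and the sentinel value 0xFFFF is chosen for this example
only (it can never be a real record length because directory record
lengths are multiples of 4). It is not necessarily the value the patches
use.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sentinel picked for this sketch only. */
#define TOY_REC_LEN_SENTINEL 0xFFFFu

/* Naive store into a 16-bit on-disk field: 65536 wraps around to 0. */
static uint16_t toy_store_naive(uint32_t len)
{
	return (uint16_t)len;
}

/* Encode: a record spanning a whole 64KB block is stored as the sentinel. */
static uint16_t toy_rec_len_to_disk(uint32_t len)
{
	assert(len <= 65536u);
	if (len == 65536u)
		return TOY_REC_LEN_SENTINEL;
	return (uint16_t)len;
}

/* Decode: map the sentinel back to the full 64KB length. */
static uint32_t toy_rec_len_from_disk(uint16_t dlen)
{
	if (dlen == TOY_REC_LEN_SENTINEL)
		return 65536u;
	return dlen;
}
```

The alternative mentioned above, two empty records per block, avoids the
special case because each record then stays below 2^16 bytes.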

Signed-off-by: Takashi Sato [EMAIL PROTECTED]
Signed-off-by: Mingming Cao [EMAIL PROTECTED]
Signed-off-by: Christoph Lameter [EMAIL PROTECTED]

---
 fs/ext2/super.c |2 +-
 fs/ext3/super.c |5 -
 fs/ext4/super.c |5 +
 include/linux/ext2_fs.h |4 ++--
 include/linux/ext3_fs.h |4 ++--
 include/linux/ext4_fs.h |4 ++--
 6 files changed, 16 insertions(+), 8 deletions(-)

Index: linux-2.6.23-rc8-mm1/fs/ext2/super.c
===
--- linux-2.6.23-rc8-mm1.orig/fs/ext2/super.c   2007-09-25 14:53:57.0 -0700
+++ linux-2.6.23-rc8-mm1/fs/ext2/super.c   2007-09-25 15:37:34.0 -0700
@@ -856,7 +856,7 @@ static int ext2_fill_super(struct super_
brelse(bh);
 
if (!sb_set_blocksize(sb, blocksize)) {
-   printk(KERN_ERR "EXT2-fs: blocksize too small for device.\n");
+   printk(KERN_ERR "EXT2-fs: bad blocksize %d.\n", blocksize);
goto failed_sbi;
}
 
Index: linux-2.6.23-rc8-mm1/fs/ext3/super.c
===
--- linux-2.6.23-rc8-mm1.orig/fs/ext3/super.c   2007-09-25 14:53:57.0 -0700
+++ linux-2.6.23-rc8-mm1/fs/ext3/super.c   2007-09-25 15:37:34.0 -0700
@@ -1625,7 +1625,10 @@ static int ext3_fill_super (struct super
}
 
brelse (bh);
-   sb_set_blocksize(sb, blocksize);
+   if (!sb_set_blocksize(sb, blocksize)) {
+   printk(KERN_ERR "EXT3-fs: bad blocksize %d.\n", blocksize);
+   goto out_fail;
+   }
logic_sb_block = (sb_block * EXT3_MIN_BLOCK_SIZE) / blocksize;
offset = (sb_block * EXT3_MIN_BLOCK_SIZE) % blocksize;
bh = sb_bread(sb, logic_sb_block);
Index: linux-2.6.23-rc8-mm1/fs/ext4/super.c
===
--- linux-2.6.23-rc8-mm1.orig/fs/ext4/super.c   2007-09-25 14:53:57.0 -0700
+++ linux-2.6.23-rc8-mm1/fs/ext4/super.c   2007-09-25 15:37:34.0 -0700
@@ -1652,6 +1652,11 @@ static int ext4_fill_super (struct super
goto out_fail;
}
 
+   if (!sb_set_blocksize(sb, blocksize)) {
+   printk(KERN_ERR "EXT4-fs: bad blocksize %d.\n", blocksize);
+   goto out_fail;
+   }
+
/*
 * The ext4 superblock will not be buffer aligned for other than 1kB
 * block sizes.  We need to calculate the offset from buffer start.
Index: linux-2.6.23-rc8-mm1/include/linux/ext2_fs.h
===
--- linux-2.6.23-rc8-mm1.orig/include/linux/ext2_fs.h   2007-09-25 14:53:58.0 -0700
+++ linux-2.6.23-rc8-mm1/include/linux/ext2_fs.h   2007-09-25 15:37:34.0 -0700
@@ -87,8 +87,8 @@ static inline struct ext2_sb_info *EXT2_
  * Macro-instructions used to manage several block sizes
  */
 #define EXT2_MIN_BLOCK_SIZE	1024
-#define EXT2_MAX_BLOCK_SIZE	4096
-#define EXT2_MIN_BLOCK_LOG_SIZE  10
+#define EXT2_MAX_BLOCK_SIZE	65536
+#define EXT2_MIN_BLOCK_LOG_SIZE	10
 #ifdef __KERNEL__
 # define EXT2_BLOCK_SIZE(s)	((s)->s_blocksize)
 #else
Index: linux-2.6.23-rc8-mm1/include/linux/ext3_fs.h
===
--- linux-2.6.23-rc8-mm1.orig/include/linux/ext3_fs.h   2007-09-24 17:33:10.0 -0700
+++ linux-2.6.23-rc8-mm1/include/linux/ext3_fs.h   2007-09-25 15:37:34.0 -0700
@@ -76,8 +76,8 @@
  * Macro-instructions used to manage several block sizes
  */
 #define EXT3_MIN_BLOCK_SIZE	1024
-#define EXT3_MAX_BLOCK_SIZE	4096
-#define EXT3_MIN_BLOCK_LOG_SIZE  10
+#define EXT3_MAX_BLOCK_SIZE	65536
+#define EXT3_MIN_BLOCK_LOG_SIZE	10
 #ifdef __KERNEL__
 # define EXT3_BLOCK_SIZE(s)	((s)->s_blocksize)
 #else
Index: linux-2.6.23-rc8-mm1/include/linux/ext4_fs.h

[patch 3/4] ext3: fix rec_len overflow with 64KB block size

2007-09-25 Thread Christoph Lameter
Prevent rec_len from overflow with 64KB blocksize

Signed-off-by: Takashi Sato [EMAIL PROTECTED]
Signed-off-by: Mingming Cao [EMAIL PROTECTED]
Signed-off-by: Christoph Lameter [EMAIL PROTECTED]

---
 fs/ext3/dir.c   |   13 ---
 fs/ext3/namei.c |   88 +++-
 include/linux/ext3_fs.h |9 
 3 files changed, 91 insertions(+), 19 deletions(-)

Index: linux-2.6.23-rc8-mm1/fs/ext3/dir.c
===
--- linux-2.6.23-rc8-mm1.orig/fs/ext3/dir.c 2007-09-25 14:53:57.0 -0700
+++ linux-2.6.23-rc8-mm1/fs/ext3/dir.c  2007-09-25 15:41:45.0 -0700
@@ -100,10 +100,11 @@ static int ext3_readdir(struct file * fi
unsigned long offset;
int i, stored;
struct ext3_dir_entry_2 *de;
-   struct super_block *sb;
int err;
struct inode *inode = filp->f_path.dentry->d_inode;
int ret = 0;
+   struct super_block *sb = inode->i_sb;
+   unsigned tail = sb->s_blocksize;
 
sb = inode->i_sb;
 
@@ -167,8 +168,11 @@ revalidate:
 * readdir(2), then we might be pointing to an invalid
 * dirent right now.  Scan from the start of the block
 * to make sure. */
-   if (filp->f_version != inode->i_version) {
-   for (i = 0; i < sb->s_blocksize && i < offset; ) {
+   if (tail > EXT3_DIR_MAX_REC_LEN)
+   tail = EXT3_DIR_MAX_REC_LEN;
+
+   if (filp->f_version != inode->i_version) {
+   for (i = 0; i < tail && i < offset; ) {
de = (struct ext3_dir_entry_2 *)
(bh->b_data + i);
/* It's too expensive to do a full
@@ -189,7 +193,7 @@ revalidate:
}
 
while (!error && filp->f_pos < inode->i_size
-   && offset < sb->s_blocksize) {
+   && offset < tail) {
de = (struct ext3_dir_entry_2 *) (bh->b_data + offset);
if (!ext3_check_dir_entry ("ext3_readdir", inode, de,
   bh, offset)) {
@@ -225,6 +229,7 @@ revalidate:
}
filp->f_pos += le16_to_cpu(de->rec_len);
}
+   filp->f_pos = EXT3_DIR_ADJUST_TAIL_OFFS(filp->f_pos, sb->s_blocksize);
offset = 0;
brelse (bh);
}
Index: linux-2.6.23-rc8-mm1/fs/ext3/namei.c
===
--- linux-2.6.23-rc8-mm1.orig/fs/ext3/namei.c   2007-09-24 17:33:10.0 -0700
+++ linux-2.6.23-rc8-mm1/fs/ext3/namei.c   2007-09-25 15:41:45.0 -0700
@@ -263,9 +263,13 @@ static struct stats dx_show_leaf(struct 
unsigned names = 0, space = 0;
char *base = (char *) de;
struct dx_hash_info h = *hinfo;
+   unsigned tail = size;
 
printk("names: ");
-   while ((char *) de < base + size)
+   if (tail > EXT3_DIR_MAX_REC_LEN)
+   tail = EXT3_DIR_MAX_REC_LEN;
+
+   while ((char *) de < base + tail)
{
if (de->inode)
{
@@ -708,8 +712,12 @@ static int dx_make_map (struct ext3_dir_
int count = 0;
char *base = (char *) de;
struct dx_hash_info h = *hinfo;
+   unsigned tail = size;
+
+   if (tail > EXT3_DIR_MAX_REC_LEN)
+   tail = EXT3_DIR_MAX_REC_LEN;
 
-   while ((char *) de < base + size)
+   while ((char *) de < base + tail)
{
if (de->name_len && de->inode) {
ext3fs_dirhash(de->name, de->name_len, &h);
@@ -808,9 +816,13 @@ static inline int search_dirblock(struct
int de_len;
const char *name = dentry->d_name.name;
int namelen = dentry->d_name.len;
+   unsigned tail = dir->i_sb->s_blocksize;
 
de = (struct ext3_dir_entry_2 *) bh->b_data;
-   dlimit = bh->b_data + dir->i_sb->s_blocksize;
+   if (tail > EXT3_DIR_MAX_REC_LEN)
+   tail = EXT3_DIR_MAX_REC_LEN;
+
+   dlimit = bh->b_data + tail;
while ((char *) de < dlimit) {
/* this code is executed quadratically often */
/* do minimal checking `by hand' */
@@ -1156,6 +1168,9 @@ static struct ext3_dir_entry_2* dx_pack_
unsigned rec_len = 0;
 
prev = to = de;
+   if (size > EXT3_DIR_MAX_REC_LEN)
+   size = EXT3_DIR_MAX_REC_LEN;
+
while ((char*)de < base + size) {
next = (struct ext3_dir_entry_2 *) ((char *) de +
le16_to_cpu(de->rec_len));
@@ -1237,8 +1252,15 @@ static struct ext3_dir_entry_2 *do_split
/* Fancy dance to stay within two buffers */
de2 = dx_move_dirents(data1, data2, map + split, count - split);
de = dx_pack_dirents(data1,blocksize);
-   de-rec_len 

[patch 2/4] ext2: fix rec_len overflow for 64KB block size

2007-09-25 Thread Christoph Lameter
[2/4]  ext2: fix rec_len overflow
 - prevent rec_len from overflow with 64KB blocksize

Signed-off-by: Takashi Sato [EMAIL PROTECTED]
Signed-off-by: Mingming Cao [EMAIL PROTECTED]
Signed-off-by: Christoph Lameter [EMAIL PROTECTED]

---
 fs/ext2/dir.c   |   46 --
 include/linux/ext2_fs.h |   13 +
 2 files changed, 49 insertions(+), 10 deletions(-)

Index: linux-2.6.23-rc8-mm1/fs/ext2/dir.c
===
--- linux-2.6.23-rc8-mm1.orig/fs/ext2/dir.c 2007-09-25 15:59:34.0 -0700
+++ linux-2.6.23-rc8-mm1/fs/ext2/dir.c  2007-09-25 16:02:51.0 -0700
@@ -105,9 +105,9 @@ static void ext2_check_page(struct page 
goto out;
}
for (offs = 0; offs <= limit - EXT2_DIR_REC_LEN(1); offs += rec_len) {
+   offs = EXT2_DIR_ADJUST_TAIL_OFFS(offs, chunk_size);
p = (ext2_dirent *)(kaddr + offs);
rec_len = le16_to_cpu(p->rec_len);
-
if (rec_len < EXT2_DIR_REC_LEN(1))
goto Eshort;
if (rec_len & 3)
@@ -119,6 +119,7 @@ static void ext2_check_page(struct page 
if (le32_to_cpu(p->inode) > max_inumber)
goto Einumber;
}
+   offs = EXT2_DIR_ADJUST_TAIL_OFFS(offs, chunk_size);
if (offs != limit)
goto Eend;
 out:
@@ -294,6 +295,7 @@ ext2_readdir (struct file * filp, void *
de = (ext2_dirent *)(kaddr+offset);
limit = kaddr + ext2_last_byte(inode, n) - EXT2_DIR_REC_LEN(1);
for ( ;(char*)de <= limit; de = ext2_next_entry(de)) {
+   de = EXT2_DIR_ADJUST_TAIL_ADDR(kaddr, de, sb->s_blocksize);
if (de->rec_len == 0) {
ext2_error(sb, __FUNCTION__,
"zero-length directory entry");
@@ -316,8 +318,10 @@ ext2_readdir (struct file * filp, void *
return 0;
}
}
+   filp->f_pos = EXT2_DIR_ADJUST_TAIL_OFFS(filp->f_pos, sb->s_blocksize);
filp->f_pos += le16_to_cpu(de->rec_len);
}
+   filp->f_pos = EXT2_DIR_ADJUST_TAIL_OFFS(filp->f_pos, sb->s_blocksize);
ext2_put_page(page);
}
return 0;
@@ -354,13 +358,14 @@ struct ext2_dir_entry_2 * ext2_find_entr
start = 0;
n = start;
do {
-   char *kaddr;
+   char *kaddr, *page_start;
page = ext2_get_page(dir, n);
if (!IS_ERR(page)) {
-   kaddr = page_address(page);
+   kaddr = page_start = page_address(page);
de = (ext2_dirent *) kaddr;
kaddr += ext2_last_byte(dir, n) - reclen;
while ((char *) de <= kaddr) {
+   de = EXT2_DIR_ADJUST_TAIL_ADDR(page_start, de, dir->i_sb->s_blocksize);
if (de->rec_len == 0) {
ext2_error(dir->i_sb, __FUNCTION__,
"zero-length directory entry");
@@ -428,6 +433,7 @@ void ext2_set_link(struct inode *dir, st
unsigned len = le16_to_cpu(de->rec_len);
int err;
 
+   len = EXT2_DIR_ADJUST_TAIL_OFFS(pos, len);
lock_page(page);
err = __ext2_write_begin(NULL, page->mapping, pos, len,
AOP_FLAG_UNINTERRUPTIBLE, page, NULL);
@@ -459,6 +465,7 @@ int ext2_add_link (struct dentry *dentry
char *kaddr;
loff_t pos;
int err;
+   char *page_start = NULL;
 
/*
 * We take care of directory expansion in the same loop.
@@ -473,16 +480,29 @@ int ext2_add_link (struct dentry *dentry
if (IS_ERR(page))
goto out;
lock_page(page);
-   kaddr = page_address(page);
+   kaddr = page_start = page_address(page);
dir_end = kaddr + ext2_last_byte(dir, n);
de = (ext2_dirent *)kaddr;
-   kaddr += PAGE_CACHE_SIZE - reclen;
+   if (chunk_size < EXT2_DIR_MAX_REC_LEN)
+   kaddr += PAGE_CACHE_SIZE - reclen;
+   else
+   kaddr += PAGE_CACHE_SIZE -
+   (chunk_size - EXT2_DIR_MAX_REC_LEN) - reclen;
+
while ((char *)de <= kaddr) {
+   de = EXT2_DIR_ADJUST_TAIL_ADDR(page_start, de, chunk_size);
if ((char *)de == dir_end) {
/* We hit i_size */
name_len = 0;
-   rec_len = chunk_size;
-   de-rec_len = 

[00/17] Virtual Compound Page Support V1

2007-09-25 Thread Christoph Lameter
RFC-V1
- Support for all compound functions for virtual compound pages
  (including the compound_nth_page() necessary for LBS mmap support)
- Fix various bugs
- Fix i386 build

Currently there is a strong tendency to avoid larger page allocations in
the kernel because of past fragmentation issues and because the current
defragmentation methods are still evolving. It is not clear to what extent
they can provide reliable allocations for higher order pages (plus the
definition of reliable seems to be in the eye of the beholder).

We use vmalloc allocations in many locations to provide a safe
way to allocate larger arrays. That is due to the danger of higher order
allocations failing. Virtual Compound pages allow the use of regular
page allocator allocations that will fall back only if there is an actual
problem with acquiring a higher order page.

This patch set provides a way for a higher page allocation to fall back.
Instead of a physically contiguous page a virtually contiguous page
is provided. The functionality of the vmalloc layer is used to provide
the necessary page tables and control structures to establish a virtually
contiguous area.

Advantages:

- If higher order allocations are failing then virtual compound pages
  consisting of a series of order-0 pages can stand in for those
  allocations.

- Reliability as long as the vmalloc layer can provide virtual mappings.

- Ability to reduce the use of vmalloc layer significantly by using
  physically contiguous memory instead of virtual contiguous memory.
  Most uses of vmalloc() can be converted to page allocator calls.

- The use of physically contiguous memory instead of vmalloc may allow the
  use of larger TLB entries, thus reducing TLB pressure. It also reduces the
  need for page table walks.

Disadvantages:

- In order to use fall back the logic accessing the memory must be
  aware that the memory could be backed by a virtual mapping and take
  precautions. virt_to_page() and page_address() may not work and
  vmalloc_to_page() and vmalloc_address() (introduced through this
  patch set) may have to be called.

- Virtual mappings are less efficient than physical mappings.
  Performance will drop once virtual fall back occurs.

- Virtual mappings have more memory overhead. vm_area control structures,
  page tables, page arrays, etc. need to be allocated and managed to provide
  virtual mappings.

The patchset provides this functionality in stages. Stage 1 introduces
the basic fall back mechanism necessary to replace vmalloc allocations
with

alloc_page(GFP_VFALLBACK, order, )

which signifies to the page allocator that a higher order is to be found
but a virtual mapping may stand in if there is an issue with fragmentation.
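
The fall back idea can be modelled in userspace roughly as follows. This
is a sketch under assumptions, not the patch set's implementation: all
toy_* names are hypothetical, malloc() stands in for both the page
allocator and the vmalloc layer, and the toy_fragmented variable is a
test knob that makes the "physically contiguous" attempt fail for any
order above 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_PAGE_SIZE 4096u

/* Hypothetical result of an allocation that may have fallen back. */
struct toy_alloc {
	void *addr;
	bool is_virtual;	/* true if the fall back path was taken */
};

/* Test knob standing in for memory fragmentation. */
static int toy_fragmented;

/* Models a higher order allocation that can fail under fragmentation. */
static void *toy_alloc_contiguous(unsigned int order)
{
	if (toy_fragmented && order > 0)
		return NULL;
	return malloc((size_t)TOY_PAGE_SIZE << order);
}

/* Try the contiguous path first; only fall back if it really fails. */
static struct toy_alloc toy_alloc_vfallback(unsigned int order)
{
	struct toy_alloc a = { NULL, false };
	size_t size = (size_t)TOY_PAGE_SIZE << order;

	assert(order <= 10);
	a.addr = toy_alloc_contiguous(order);
	if (a.addr == NULL) {
		/* Stand-in for order-0 pages plus a virtual mapping. */
		a.addr = malloc(size);
		a.is_virtual = (a.addr != NULL);
	}
	return a;
}
```

A caller that receives is_virtual == true is in the situation described
under Disadvantages: it has to use the vmalloc-aware accessors rather
than assuming a physically contiguous area.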

Stage 1 functionality does not allow allocation and freeing of virtual
mappings from interrupt contexts.

The stage 1 series ends with the conversion of a few key uses of vmalloc
in the VM to alloc_pages() for the allocation of sparsemem's memmap table
and the wait table in each zone. Other uses of vmalloc could be converted
in the same way.


Stage 2 functionality enhances the fallback further, allowing allocation
and freeing in interrupt context.

SLUB is then modified to use the virtual mappings for slab caches
that are marked with SLAB_VFALLBACK. If a slab cache is marked this way
then we drop all the restraints regarding page order and allocate
good large memory areas that fit lots of objects so that we rarely
have to use the slow paths.

Two slab caches--the dentry cache and the buffer_heads--are then flagged
that way. Others could be converted in the same way.

The patch set also provides a debugging aid through setting

CONFIG_VFALLBACK_ALWAYS

If set, then all GFP_VFALLBACK allocations fall back to the virtual
mappings. This is useful for verification tests. This patch set was
tested by enabling that option and compiling a kernel.


The patch set is also available via git from the largeblock git tree via

git pull
  git://git.kernel.org/pub/scm/linux/kernel/git/christoph/largeblocksize.git
vcompound



[03/17] i386: Resolve dependency of asm-i386/pgtable.h on highmem.h

2007-09-25 Thread Christoph Lameter
pgtable.h does not include highmem.h but uses various constants from
highmem.h. We cannot include highmem.h because highmem.h will in turn
include many other include files that also depend on pgtable.h.

So move the definitions from highmem.h into pgtable.h.

Signed-off-by: Christoph Lameter [EMAIL PROTECTED]

---
 include/asm-i386/highmem.h |6 --
 include/asm-i386/pgtable.h |8 
 2 files changed, 8 insertions(+), 6 deletions(-)

Index: linux-2.6/include/asm-i386/highmem.h
===
--- linux-2.6.orig/include/asm-i386/highmem.h   2007-09-20 23:54:57.0 -0700
+++ linux-2.6/include/asm-i386/highmem.h   2007-09-20 23:55:40.0 -0700
@@ -38,11 +38,6 @@ extern pte_t *pkmap_page_table;
  * easily, subsequent pte tables have to be allocated in one physical
  * chunk of RAM.
  */
-#ifdef CONFIG_X86_PAE
-#define LAST_PKMAP 512
-#else
-#define LAST_PKMAP 1024
-#endif
 /*
  * Ordering is:
  *
@@ -58,7 +53,6 @@ extern pte_t *pkmap_page_table;
  * VMALLOC_START
  * high_memory
  */
-#define PKMAP_BASE ( (FIXADDR_BOOT_START - PAGE_SIZE*(LAST_PKMAP + 1)) & PMD_MASK )
 #define LAST_PKMAP_MASK (LAST_PKMAP-1)
 #define PKMAP_NR(virt)  ((virt-PKMAP_BASE) >> PAGE_SHIFT)
 #define PKMAP_ADDR(nr)  (PKMAP_BASE + ((nr) << PAGE_SHIFT))
Index: linux-2.6/include/asm-i386/pgtable.h
===
--- linux-2.6.orig/include/asm-i386/pgtable.h   2007-09-20 23:55:16.0 -0700
+++ linux-2.6/include/asm-i386/pgtable.h   2007-09-20 23:56:21.0 -0700
@@ -81,6 +81,14 @@ void paging_init(void);
 #define VMALLOC_OFFSET (8*1024*1024)
 #define VMALLOC_START  (((unsigned long) high_memory + \
2*VMALLOC_OFFSET-1) & ~(VMALLOC_OFFSET-1))
+#ifdef CONFIG_X86_PAE
+#define LAST_PKMAP 512
+#else
+#define LAST_PKMAP 1024
+#endif
+
+#define PKMAP_BASE ( (FIXADDR_BOOT_START - PAGE_SIZE*(LAST_PKMAP + 1)) & PMD_MASK )
+
 #ifdef CONFIG_HIGHMEM
 # define VMALLOC_END   (PKMAP_BASE-2*PAGE_SIZE)
 #else



[01/17] Vmalloc: Move vmalloc_to_page to mm/vmalloc.

2007-09-25 Thread Christoph Lameter
We already have page table manipulation for vmalloc in vmalloc.c. Move the
vmalloc_to_page() function there as well.

Move the definitions for vmalloc related functions in mm.h to before the
functions dealing with compound pages because they will soon need to use
them.

Signed-off-by: Christoph Lameter [EMAIL PROTECTED]

---
 include/linux/mm.h |5 +++--
 mm/memory.c|   40 
 mm/vmalloc.c   |   38 ++
 3 files changed, 41 insertions(+), 42 deletions(-)

Index: linux-2.6/mm/memory.c
===
--- linux-2.6.orig/mm/memory.c  2007-09-24 16:55:28.0 -0700
+++ linux-2.6/mm/memory.c   2007-09-24 16:55:32.0 -0700
@@ -2727,46 +2727,6 @@ int make_pages_present(unsigned long add
return ret == len ? 0 : -1;
 }
 
-/* 
- * Map a vmalloc()-space virtual address to the physical page.
- */
-struct page * vmalloc_to_page(void * vmalloc_addr)
-{
-   unsigned long addr = (unsigned long) vmalloc_addr;
-   struct page *page = NULL;
-   pgd_t *pgd = pgd_offset_k(addr);
-   pud_t *pud;
-   pmd_t *pmd;
-   pte_t *ptep, pte;
-  
-   if (!pgd_none(*pgd)) {
-   pud = pud_offset(pgd, addr);
-   if (!pud_none(*pud)) {
-   pmd = pmd_offset(pud, addr);
-   if (!pmd_none(*pmd)) {
-   ptep = pte_offset_map(pmd, addr);
-   pte = *ptep;
-   if (pte_present(pte))
-   page = pte_page(pte);
-   pte_unmap(ptep);
-   }
-   }
-   }
-   return page;
-}
-
-EXPORT_SYMBOL(vmalloc_to_page);
-
-/*
- * Map a vmalloc()-space virtual address to the physical page frame number.
- */
-unsigned long vmalloc_to_pfn(void * vmalloc_addr)
-{
-   return page_to_pfn(vmalloc_to_page(vmalloc_addr));
-}
-
-EXPORT_SYMBOL(vmalloc_to_pfn);
-
 #if !defined(__HAVE_ARCH_GATE_AREA)
 
 #if defined(AT_SYSINFO_EHDR)
Index: linux-2.6/mm/vmalloc.c
===
--- linux-2.6.orig/mm/vmalloc.c 2007-09-24 16:55:28.0 -0700
+++ linux-2.6/mm/vmalloc.c  2007-09-24 16:55:32.0 -0700
@@ -166,6 +166,44 @@ int map_vm_area(struct vm_struct *area, 
 }
 EXPORT_SYMBOL_GPL(map_vm_area);
 
+/*
+ * Map a vmalloc()-space virtual address to the physical page.
+ */
+struct page *vmalloc_to_page(void *vmalloc_addr)
+{
+   unsigned long addr = (unsigned long) vmalloc_addr;
+   struct page *page = NULL;
+   pgd_t *pgd = pgd_offset_k(addr);
+   pud_t *pud;
+   pmd_t *pmd;
+   pte_t *ptep, pte;
+
+   if (!pgd_none(*pgd)) {
+   pud = pud_offset(pgd, addr);
+   if (!pud_none(*pud)) {
+   pmd = pmd_offset(pud, addr);
+   if (!pmd_none(*pmd)) {
+   ptep = pte_offset_map(pmd, addr);
+   pte = *ptep;
+   if (pte_present(pte))
+   page = pte_page(pte);
+   pte_unmap(ptep);
+   }
+   }
+   }
+   return page;
+}
+EXPORT_SYMBOL(vmalloc_to_page);
+
+/*
+ * Map a vmalloc()-space virtual address to the physical page frame number.
+ */
+unsigned long vmalloc_to_pfn(void *vmalloc_addr)
+{
+   return page_to_pfn(vmalloc_to_page(vmalloc_addr));
+}
+EXPORT_SYMBOL(vmalloc_to_pfn);
+
 static struct vm_struct *__get_vm_area_node(unsigned long size, unsigned long flags,
unsigned long start, unsigned long end,
int node, gfp_t gfp_mask)
Index: linux-2.6/include/linux/mm.h
===
--- linux-2.6.orig/include/linux/mm.h   2007-09-24 16:55:28.0 -0700
+++ linux-2.6/include/linux/mm.h2007-09-24 16:57:23.0 -0700
@@ -294,6 +294,9 @@ static inline int get_page_unless_zero(s
return atomic_inc_not_zero(&page->_count);
 }
 
+struct page *vmalloc_to_page(void *addr);
+unsigned long vmalloc_to_pfn(void *addr);
+
 static inline struct page *compound_head(struct page *page)
 {
if (unlikely(PageTail(page)))
@@ -1160,8 +1163,6 @@ static inline unsigned long vma_pages(st
 
 pgprot_t vm_get_page_prot(unsigned long vm_flags);
 struct vm_area_struct *find_extend_vma(struct mm_struct *, unsigned long addr);
-struct page *vmalloc_to_page(void *addr);
-unsigned long vmalloc_to_pfn(void *addr);
 int remap_pfn_range(struct vm_area_struct *, unsigned long addr,
unsigned long pfn, unsigned long size, pgprot_t);
 int vm_insert_page(struct vm_area_struct *, unsigned long addr, struct page *);


[05/17] vmalloc: clean up page array indexing

2007-09-25 Thread Christoph Lameter
The page array is repeatedly indexed both in vunmap and vmalloc_area_node().
Add a temporary variable to make it easier to read (and easier to patch
later).
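
The pattern can be seen in isolation in the userspace sketch below. It is
a model only: toy_alloc_pages() and toy_free_pages() are hypothetical
names, malloc() stands in for the page allocator, and fail_at is a test
knob that forces a failure at a given index. Each loop works on a local
variable and only stores into the array once the allocation is known to
have succeeded, so a partial failure leaves a clean count of valid entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Fill pages[0..nr-1]; returns how many entries were allocated.
 * fail_at is a test knob: allocation index forced to fail, or -1.
 */
static int toy_alloc_pages(void **pages, int nr, int fail_at)
{
	int i;

	assert(nr >= 0);
	for (i = 0; i < nr; i++) {
		void *page = (i == fail_at) ? NULL : malloc(64);

		if (page == NULL)
			return i;	/* i entries are valid, free those */
		pages[i] = page;
	}
	return nr;
}

/* Free the first nr entries, again going through a temporary. */
static void toy_free_pages(void **pages, int nr)
{
	for (int i = 0; i < nr; i++) {
		void *page = pages[i];

		assert(page != NULL);
		free(page);
		pages[i] = NULL;
	}
}
```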

Signed-off-by: Christoph Lameter [EMAIL PROTECTED]

---
 mm/vmalloc.c |   16 +++-
 1 file changed, 11 insertions(+), 5 deletions(-)

Index: linux-2.6/mm/vmalloc.c
===
--- linux-2.6.orig/mm/vmalloc.c 2007-09-18 13:22:16.0 -0700
+++ linux-2.6/mm/vmalloc.c  2007-09-18 13:22:17.0 -0700
@@ -383,8 +383,10 @@ static void __vunmap(const void *addr, i
int i;
 
for (i = 0; i < area->nr_pages; i++) {
-   BUG_ON(!area->pages[i]);
-   __free_page(area->pages[i]);
+   struct page *page = area->pages[i];
+
+   BUG_ON(!page);
+   __free_page(page);
}
 
if (area->flags & VM_VPAGES)
@@ -488,15 +490,19 @@ void *__vmalloc_area_node(struct vm_stru
}
 
for (i = 0; i < area->nr_pages; i++) {
+   struct page *page;
+
if (node < 0)
-   area->pages[i] = alloc_page(gfp_mask);
+   page = alloc_page(gfp_mask);
else
-   area->pages[i] = alloc_pages_node(node, gfp_mask, 0);
-   if (unlikely(!area->pages[i])) {
+   page = alloc_pages_node(node, gfp_mask, 0);
+
+   if (unlikely(!page)) {
/* Successfully allocated i pages, free them in __vunmap() */
area->nr_pages = i;
goto fail;
}
+   area->pages[i] = page;
}
 
if (map_vm_area(area, prot, pages))



[06/17] vunmap: return page array passed on vmap()

2007-09-25 Thread Christoph Lameter
Make vunmap return the page array that was used at vmap. This is useful
if one has no structures to track the page array but simply stores the
virtual address somewhere. The disposition of the page array can be
decided upon after vunmap. vfree() may now also be used instead of
vunmap which will release the page array after vunmap'ping it.
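
The ownership hand-off can be sketched in userspace like this. The
toy_area, toy_vmap() and toy_vunmap() names are hypothetical and only
model the bookkeeping: the map call remembers the caller's array, and the
unmap call hands the same pointer back so the caller can decide how to
dispose of it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the vm area bookkeeping. */
struct toy_area {
	void **pages;
	unsigned int nr_pages;
};

/* Remember the caller's page array in the area. */
static struct toy_area *toy_vmap(void **pages, unsigned int count)
{
	struct toy_area *area;

	assert(pages != NULL || count == 0);
	area = malloc(sizeof(*area));
	if (area == NULL)
		return NULL;
	area->pages = pages;
	area->nr_pages = count;
	return area;
}

/* Tear down the area and return the array that was passed at map time. */
static void **toy_vunmap(struct toy_area *area)
{
	void **pages;

	if (area == NULL)
		return NULL;
	pages = area->pages;
	free(area);
	return pages;
}
```

The caller never needs its own structure to track the array; it can
recover it from the unmap call and free it with whatever allocator
originally produced it.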

As noted by Kamezawa: The same subsystem that provides the page array
to vmap must use its own method to dispose of the page array.

If vfree() is called to free the page array then the page array must either
be

1. Allocated via the slab allocator

2. Allocated via vmalloc but then VM_VPAGES must have been passed at
   vmap to specify that a vfree is needed.

RFC-v1:
- Add comment explaining how to use vfree() to dispose of the page
  array passed on vmap().

Signed-off-by: Christoph Lameter [EMAIL PROTECTED]

---
 include/linux/vmalloc.h |2 +-
 mm/vmalloc.c|   33 +++--
 2 files changed, 24 insertions(+), 11 deletions(-)

Index: linux-2.6/include/linux/vmalloc.h
===
--- linux-2.6.orig/include/linux/vmalloc.h  2007-09-24 15:52:53.0 -0700
+++ linux-2.6/include/linux/vmalloc.h   2007-09-24 15:59:15.0 -0700
@@ -49,7 +49,7 @@ extern void vfree(const void *addr);
 
 extern void *vmap(struct page **pages, unsigned int count,
unsigned long flags, pgprot_t prot);
-extern void vunmap(const void *addr);
+extern struct page **vunmap(const void *addr);
 
 extern int remap_vmalloc_range(struct vm_area_struct *vma, void *addr,
unsigned long pgoff);
Index: linux-2.6/mm/vmalloc.c
===
--- linux-2.6.orig/mm/vmalloc.c 2007-09-24 15:56:49.0 -0700
+++ linux-2.6/mm/vmalloc.c  2007-09-24 16:02:10.0 -0700
@@ -356,17 +356,18 @@ struct vm_struct *remove_vm_area(const v
return v;
 }
 
-static void __vunmap(const void *addr, int deallocate_pages)
+static struct page **__vunmap(const void *addr, int deallocate_pages)
 {
struct vm_struct *area;
+   struct page **pages;
 
if (!addr)
-   return;
+   return NULL;
 
if ((PAGE_SIZE-1) & (unsigned long)addr) {
printk(KERN_ERR "Trying to vfree() bad address (%p)\n", addr);
WARN_ON(1);
-   return;
+   return NULL;
}
 
area = remove_vm_area(addr);
@@ -374,29 +375,30 @@ static void __vunmap(const void *addr, i
printk(KERN_ERR "Trying to vfree() nonexistent vm area (%p)\n",
addr);
WARN_ON(1);
-   return;
+   return NULL;
}
 
+   pages = area->pages;
debug_check_no_locks_freed(addr, area->size);
 
if (deallocate_pages) {
int i;
 
for (i = 0; i < area->nr_pages; i++) {
-   struct page *page = area->pages[i];
+   struct page *page = pages[i];
 
BUG_ON(!page);
__free_page(page);
}
 
if (area->flags & VM_VPAGES)
-   vfree(area->pages);
+   vfree(pages);
else
-   kfree(area->pages);
+   kfree(pages);
+   kfree(pages);
}
 
kfree(area);
-   return;
+   return pages;
 }
 
 /**
@@ -424,11 +426,13 @@ EXPORT_SYMBOL(vfree);
  * which was created from the page array passed to vmap().
  *
  * Must not be called in interrupt context.
+ *
+ * Returns a pointer to the array of pointers to page structs
  */
-void vunmap(const void *addr)
+struct page **vunmap(const void *addr)
 {
BUG_ON(in_interrupt());
-   __vunmap(addr, 0);
+   return __vunmap(addr, 0);
 }
 EXPORT_SYMBOL(vunmap);
 
@@ -441,6 +445,13 @@ EXPORT_SYMBOL(vunmap);
  *
  * Maps @count pages from @pages into contiguous kernel virtual
  * space.
+ *
+ * The page array may be freed via vfree() on the virtual address
+ * returned. In that case the page array must be allocated via
+ * the slab allocator. If the page array was allocated via
+ * vmalloc then VM_VPAGES must be specified in the flags. There is
+ * no support for vfree() to free a page array allocated via the
+ * page allocator.
  */
 void *vmap(struct page **pages, unsigned int count,
unsigned long flags, pgprot_t prot)
@@ -453,6 +464,8 @@ void *vmap(struct page **pages, unsigned
area = get_vm_area((count << PAGE_SHIFT), flags);
if (!area)
return NULL;
+   area->pages = pages;
+   area->nr_pages = count;
if (map_vm_area(area, prot, &pages)) {
vunmap(area->addr);
return NULL;

-- 
-
To unsubscribe from this list: send the line unsubscribe 

[04/17] is_vmalloc_addr(): Check if an address is within the vmalloc boundaries

2007-09-25 Thread Christoph Lameter
is_vmalloc_addr() is used in a couple of places. Add a version to vmalloc.h
and replace the other checks.
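
For readers following along, the check being consolidated is a half-open range test. The sketch below models it in plain userspace C. The bounds and the demo_ names are made up for illustration; the real VMALLOC_START and VMALLOC_END are architecture specific.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bounds, for illustration only. */
#define DEMO_VMALLOC_START 0x1000UL
#define DEMO_VMALLOC_END   0x2000UL

/* Half-open range test: START is inside the range, END is not. */
static int demo_is_vmalloc_addr(const void *x)
{
	uintptr_t addr = (uintptr_t)x;

	return addr >= DEMO_VMALLOC_START && addr < DEMO_VMALLOC_END;
}

/* Callers that may hold either kind of pointer pick the free path
 * based on the range test. */
enum demo_free_path { DEMO_USE_KFREE, DEMO_USE_VFREE };

static enum demo_free_path demo_pick_free(const void *p)
{
	return demo_is_vmalloc_addr(p) ? DEMO_USE_VFREE : DEMO_USE_KFREE;
}
```

Having one helper means every caller agrees on which end of the range is inclusive, which the open-coded variants had to get right individually.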

Signed-off-by: Christoph Lameter [EMAIL PROTECTED]

---
 drivers/net/cxgb3/cxgb3_offload.c |4 +---
 fs/ntfs/malloc.h  |3 +--
 fs/proc/kcore.c   |2 +-
 fs/xfs/linux-2.6/kmem.c   |3 +--
 fs/xfs/linux-2.6/xfs_buf.c|3 +--
 include/linux/mm.h|8 
 mm/sparse.c   |   10 +-
 7 files changed, 14 insertions(+), 19 deletions(-)

Index: linux-2.6/include/linux/mm.h
===
--- linux-2.6.orig/include/linux/mm.h   2007-09-24 18:32:35.0 -0700
+++ linux-2.6/include/linux/mm.h2007-09-24 18:33:03.0 -0700
@@ -297,6 +297,14 @@ static inline int get_page_unless_zero(s
 struct page *vmalloc_to_page(const void *addr);
 unsigned long vmalloc_to_pfn(const void *addr);
 
+/* Determine if an address is within the vmalloc range */
+static inline int is_vmalloc_addr(const void *x)
+{
+   unsigned long addr = (unsigned long)x;
+
+   return addr >= VMALLOC_START && addr < VMALLOC_END;
+}
+
 static inline struct page *compound_head(struct page *page)
 {
if (unlikely(PageTail(page)))
Index: linux-2.6/mm/sparse.c
===
--- linux-2.6.orig/mm/sparse.c  2007-09-24 18:30:46.0 -0700
+++ linux-2.6/mm/sparse.c   2007-09-24 18:33:03.0 -0700
@@ -289,17 +289,9 @@ got_map_ptr:
return ret;
 }
 
-static int vaddr_in_vmalloc_area(void *addr)
-{
-   if (addr >= (void *)VMALLOC_START &&
-   addr < (void *)VMALLOC_END)
-   return 1;
-   return 0;
-}
-
 static void __kfree_section_memmap(struct page *memmap, unsigned long nr_pages)
 {
-   if (vaddr_in_vmalloc_area(memmap))
+   if (is_vmalloc_addr(memmap))
vfree(memmap);
else
free_pages((unsigned long)memmap,
Index: linux-2.6/drivers/net/cxgb3/cxgb3_offload.c
===
--- linux-2.6.orig/drivers/net/cxgb3/cxgb3_offload.c2007-09-24 
18:30:46.0 -0700
+++ linux-2.6/drivers/net/cxgb3/cxgb3_offload.c 2007-09-24 18:33:03.0 
-0700
@@ -1035,9 +1035,7 @@ void *cxgb_alloc_mem(unsigned long size)
  */
 void cxgb_free_mem(void *addr)
 {
-   unsigned long p = (unsigned long)addr;
-
-   if (p >= VMALLOC_START && p < VMALLOC_END)
+   if (is_vmalloc_addr(addr))
vfree(addr);
else
kfree(addr);
Index: linux-2.6/fs/ntfs/malloc.h
===
--- linux-2.6.orig/fs/ntfs/malloc.h 2007-09-24 18:30:46.0 -0700
+++ linux-2.6/fs/ntfs/malloc.h  2007-09-24 18:33:03.0 -0700
@@ -85,8 +85,7 @@ static inline void *ntfs_malloc_nofs_nof
 
 static inline void ntfs_free(void *addr)
 {
-   if (likely(((unsigned long)addr < VMALLOC_START) ||
-   ((unsigned long)addr >= VMALLOC_END ))) {
+   if (!is_vmalloc_addr(addr)) {
kfree(addr);
/* free_page((unsigned long)addr); */
return;
Index: linux-2.6/fs/proc/kcore.c
===
--- linux-2.6.orig/fs/proc/kcore.c  2007-09-24 18:30:46.0 -0700
+++ linux-2.6/fs/proc/kcore.c   2007-09-24 18:33:03.0 -0700
@@ -325,7 +325,7 @@ read_kcore(struct file *file, char __use
if (m == NULL) {
if (clear_user(buffer, tsz))
return -EFAULT;
-   } else if ((start >= VMALLOC_START) && (start < VMALLOC_END)) {
+   } else if (is_vmalloc_addr((void *)start)) {
char * elf_buf;
struct vm_struct *m;
unsigned long curstart = start;
Index: linux-2.6/fs/xfs/linux-2.6/kmem.c
===
--- linux-2.6.orig/fs/xfs/linux-2.6/kmem.c  2007-09-24 18:30:46.0 
-0700
+++ linux-2.6/fs/xfs/linux-2.6/kmem.c   2007-09-24 18:33:03.0 -0700
@@ -92,8 +92,7 @@ kmem_zalloc_greedy(size_t *size, size_t 
 void
 kmem_free(void *ptr, size_t size)
 {
-   if (((unsigned long)ptr < VMALLOC_START) ||
-   ((unsigned long)ptr >= VMALLOC_END)) {
+   if (!is_vmalloc_addr(ptr)) {
kfree(ptr);
} else {
vfree(ptr);
Index: linux-2.6/fs/xfs/linux-2.6/xfs_buf.c
===
--- linux-2.6.orig/fs/xfs/linux-2.6/xfs_buf.c   2007-09-24 18:30:46.0 
-0700
+++ linux-2.6/fs/xfs/linux-2.6/xfs_buf.c2007-09-24 18:33:03.0 
-0700
@@ -696,8 +696,7 @@ static inline struct page *
 mem_to_page(
void*addr)
 {
-   if (((unsigned long)addr < VMALLOC_START) ||
-   

[07/17] vmalloc_address(): Determine vmalloc address from page struct

2007-09-25 Thread Christoph Lameter
Sometimes we need to figure out which vmalloc address is in use
for a certain page struct. There is no easy way to figure out
the vmalloc address from the page struct. So simply search through
the kernel page tables to find the address. This is a fairly expensive
process. Use sparingly (or provide a better implementation).
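
To illustrate why the search is expensive, here is a toy userspace model of the reverse lookup. It assumes a two-level table with made-up sizes and demo_ names; it is not the kernel's pgd/pud/pmd/pte layout, only the same idea of walking every entry until the pfn matches.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_PAGE_SIZE 4096UL
#define DEMO_PTES      4          /* entries per leaf table (assumed) */
#define DEMO_DIRS      2          /* entries in the top-level directory */
#define DEMO_BASE      0x100000UL /* start of the modelled range */

struct demo_pte { int present; unsigned long pfn; };
struct demo_dir { struct demo_pte *table; /* NULL if not populated */ };

/* Return the first virtual address in the range that maps pfn, or 0
 * if no present entry refers to it. Cost is linear in the range size. */
static unsigned long demo_vaddr_of_pfn(const struct demo_dir *dirs,
				       unsigned long pfn)
{
	unsigned long addr = DEMO_BASE;
	size_t d, i;

	for (d = 0; d < DEMO_DIRS; d++) {
		if (!dirs[d].table) {
			/* Skip the whole hole covered by an empty slot. */
			addr += DEMO_PTES * DEMO_PAGE_SIZE;
			continue;
		}
		for (i = 0; i < DEMO_PTES; i++, addr += DEMO_PAGE_SIZE) {
			const struct demo_pte *pte = &dirs[d].table[i];

			if (pte->present && pte->pfn == pfn)
				return addr;
		}
	}
	return 0;
}
```

Entries that are not present are ignored even if their pfn field happens to match, which mirrors the pte_present() check in the patch.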

Signed-off-by: Christoph Lameter [EMAIL PROTECTED]

---
 include/linux/mm.h |1 
 mm/vmalloc.c   |   77 +
 2 files changed, 78 insertions(+)

Index: linux-2.6/mm/vmalloc.c
===
--- linux-2.6.orig/mm/vmalloc.c 2007-09-24 16:59:54.0 -0700
+++ linux-2.6/mm/vmalloc.c  2007-09-24 17:00:07.0 -0700
@@ -196,6 +196,83 @@ struct page *vmalloc_to_page(const void 
 EXPORT_SYMBOL(vmalloc_to_page);
 
 /*
+ * Determine vmalloc address from a page struct.
+ *
+ * Linear search through all ptes of the vmalloc area.
+ */
+static unsigned long vaddr_pte_range(pmd_t *pmd, unsigned long addr,
+   unsigned long end, unsigned long pfn)
+{
+   pte_t *pte;
+
+   pte = pte_offset_kernel(pmd, addr);
+   do {
+   pte_t ptent = *pte;
+   if (pte_present(ptent) && pte_pfn(ptent) == pfn)
+   return addr;
+   } while (pte++, addr += PAGE_SIZE, addr != end);
+   return 0;
+}
+
+static inline unsigned long vaddr_pmd_range(pud_t *pud, unsigned long addr,
+   unsigned long end, unsigned long pfn)
+{
+   pmd_t *pmd;
+   unsigned long next;
+   unsigned long n;
+
+   pmd = pmd_offset(pud, addr);
+   do {
+   next = pmd_addr_end(addr, end);
+   if (pmd_none_or_clear_bad(pmd))
+   continue;
+   n = vaddr_pte_range(pmd, addr, next, pfn);
+   if (n)
+   return n;
+   } while (pmd++, addr = next, addr != end);
+   return 0;
+}
+
+static inline unsigned long vaddr_pud_range(pgd_t *pgd, unsigned long addr,
+   unsigned long end, unsigned long pfn)
+{
+   pud_t *pud;
+   unsigned long next;
+   unsigned long n;
+
+   pud = pud_offset(pgd, addr);
+   do {
+   next = pud_addr_end(addr, end);
+   if (pud_none_or_clear_bad(pud))
+   continue;
+   n = vaddr_pmd_range(pud, addr, next, pfn);
+   if (n)
+   return n;
+   } while (pud++, addr = next, addr != end);
+   return 0;
+}
+
+void *vmalloc_address(struct page *page)
+{
+   pgd_t *pgd;
+   unsigned long next, n;
+   unsigned long addr = VMALLOC_START;
+   unsigned long pfn = page_to_pfn(page);
+
+   pgd = pgd_offset_k(VMALLOC_START);
+   do {
+   next = pgd_addr_end(addr, VMALLOC_END);
+   if (pgd_none_or_clear_bad(pgd))
+   continue;
+   n = vaddr_pud_range(pgd, addr, next, pfn);
+   if (n)
+   return (void *)n;
+   } while (pgd++, addr = next, addr < VMALLOC_END);
+   return NULL;
+}
+EXPORT_SYMBOL(vmalloc_address);
+
+/*
  * Map a vmalloc()-space virtual address to the physical page frame number.
  */
 unsigned long vmalloc_to_pfn(const void *vmalloc_addr)
Index: linux-2.6/include/linux/mm.h
===
--- linux-2.6.orig/include/linux/mm.h   2007-09-24 17:00:33.0 -0700
+++ linux-2.6/include/linux/mm.h2007-09-24 17:00:42.0 -0700
@@ -296,6 +296,7 @@ static inline int get_page_unless_zero(s
 
 struct page *vmalloc_to_page(const void *addr);
 unsigned long vmalloc_to_pfn(const void *addr);
+void *vmalloc_address(struct page *);
 
 /* Determine if an address is within the vmalloc range */
 static inline int is_vmalloc_addr(const void *x)

-- 
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[11/17] GFP_VFALLBACK for zone wait table.

2007-09-25 Thread Christoph Lameter
Currently vmalloc is used for the zone wait table, which may require many
TLB entries to access the table. We can now use GFP_VFALLBACK to attempt the
use of a physically contiguous page that can then be covered by the large
kernel TLB entries.

Signed-off-by: Christoph Lameter [EMAIL PROTECTED]

---
 mm/page_alloc.c |4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Index: linux-2.6/mm/page_alloc.c
===
--- linux-2.6.orig/mm/page_alloc.c  2007-09-24 18:48:06.0 -0700
+++ linux-2.6/mm/page_alloc.c   2007-09-24 18:48:16.0 -0700
@@ -2550,7 +2550,9 @@ int zone_wait_table_init(struct zone *zo
 * To use this new node's memory, further consideration will be
 * necessary.
 */
-   zone->wait_table = (wait_queue_head_t *)vmalloc(alloc_size);
+   zone->wait_table = (wait_queue_head_t *)
+   __get_free_pages(GFP_VFALLBACK,
+   get_order(alloc_size));
}
if (!zone->wait_table)
return -ENOMEM;



[14/17] Allow bit_waitqueue to wait on a bit in a vmalloc area

2007-09-25 Thread Christoph Lameter
If bit_waitqueue() is passed a vmalloc address then it must use
vmalloc_to_page() instead of virt_to_page() to get to the page struct.

Signed-off-by: Christoph Lameter [EMAIL PROTECTED]

---
 kernel/wait.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6/kernel/wait.c
===
--- linux-2.6.orig/kernel/wait.c2007-09-20 19:03:42.0 -0700
+++ linux-2.6/kernel/wait.c 2007-09-20 19:07:42.0 -0700
@@ -245,7 +245,7 @@ EXPORT_SYMBOL(wake_up_bit);
 fastcall wait_queue_head_t *bit_waitqueue(void *word, int bit)
 {
const int shift = BITS_PER_LONG == 32 ? 5 : 6;
-   const struct zone *zone = page_zone(virt_to_page(word));
+   const struct zone *zone = page_zone(addr_to_page(word));
unsigned long val = (unsigned long)word << shift | bit;
 
return &zone->wait_table[hash_long(val, zone->wait_table_bits)];



[16/17] Allow virtual fallback for buffer_heads

2007-09-25 Thread Christoph Lameter
This is particularly useful for large I/Os because it will allow > 100
allocs from the SLUB fast path without having to go to the page allocator.

Signed-off-by: Christoph Lameter [EMAIL PROTECTED]

---
 fs/buffer.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Index: linux-2.6/fs/buffer.c
===
--- linux-2.6.orig/fs/buffer.c  2007-09-18 15:44:37.0 -0700
+++ linux-2.6/fs/buffer.c   2007-09-18 15:44:51.0 -0700
@@ -3008,7 +3008,8 @@ void __init buffer_init(void)
int nrpages;
 
bh_cachep = KMEM_CACHE(buffer_head,
-   SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|SLAB_MEM_SPREAD);
+   SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|SLAB_MEM_SPREAD|
+   SLAB_VFALLBACK);
 
/*
 * Limit the bh occupancy to 10% of ZONE_NORMAL



[17/17] Allow virtual fallback for dentries

2007-09-25 Thread Christoph Lameter
Signed-off-by: Christoph Lameter [EMAIL PROTECTED]

---
 fs/dcache.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Index: linux-2.6/fs/dcache.c
===
--- linux-2.6.orig/fs/dcache.c  2007-09-24 16:47:43.0 -0700
+++ linux-2.6/fs/dcache.c   2007-09-24 17:03:15.0 -0700
@@ -2118,7 +2118,8 @@ static void __init dcache_init(unsigned 
 * of the dcache. 
 */
dentry_cache = KMEM_CACHE(dentry,
-   SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|SLAB_MEM_SPREAD);
+   SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|SLAB_MEM_SPREAD|
+   SLAB_VFALLBACK);

register_shrinker(dcache_shrinker);
 



[15/17] SLUB: Support virtual fallback via SLAB_VFALLBACK

2007-09-25 Thread Christoph Lameter
SLAB_VFALLBACK can be specified for selected slab caches. If fallback is
available then the conservative settings for higher order allocations are
overridden. We then request an order that can accommodate at minimum
100 objects. The size of an individual slab allocation is allowed to reach
up to 256k (order 6 on i386, order 4 on IA64).

Implementing fallback requires special handling of virtual mappings in
the free path. However, the impact is minimal since we already check
whether the address is NULL or ZERO_SIZE_PTR. No additional cachelines are
touched if we do not fall back. However, if we need to handle a virtual
compound page then we walk the kernel page table in the free paths to
determine the page struct.
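
The order arithmetic described above can be sketched as follows. This is a simplified userspace model that assumes 4 KiB pages and ignores the waste-fraction logic in the real calculate_order(); demo_order_for() is a hypothetical name.

```c
#include <assert.h>

#define DEMO_PAGE_SIZE 4096UL	/* assumption: 4 KiB pages, as on i386 */

/* Smallest order whose slab holds at least min_objects objects of the
 * given size, or -1 if that would need more than max_order. */
static int demo_order_for(unsigned long size, unsigned long min_objects,
			  int max_order)
{
	int order;

	assert(size > 0);
	for (order = 0; order <= max_order; order++) {
		unsigned long slab_bytes = DEMO_PAGE_SIZE << order;

		if (slab_bytes / size >= min_objects)
			return order;
	}
	return -1;
}
```

With 4 KiB pages an order of 6 gives 4096 << 6 = 262144 bytes, which is the 256k ceiling quoted above; with 16 KiB pages order 4 reaches the same size.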

Signed-off-by: Christoph Lameter [EMAIL PROTECTED]

---
 include/linux/slab.h |1 
 include/linux/slub_def.h |1 
 mm/slub.c|   52 +++
 3 files changed, 32 insertions(+), 22 deletions(-)

Index: linux-2.6/include/linux/slab.h
===
--- linux-2.6.orig/include/linux/slab.h 2007-09-24 20:34:14.0 -0700
+++ linux-2.6/include/linux/slab.h  2007-09-24 20:35:09.0 -0700
@@ -19,6 +19,7 @@
  * The ones marked DEBUG are only valid if CONFIG_SLAB_DEBUG is set.
  */
 #define SLAB_DEBUG_FREE0x0100UL/* DEBUG: Perform 
(expensive) checks on free */
+#define SLAB_VFALLBACK 0x0200UL/* May fall back to vmalloc */
 #define SLAB_RED_ZONE  0x0400UL/* DEBUG: Red zone objs in a 
cache */
 #define SLAB_POISON0x0800UL/* DEBUG: Poison objects */
 #define SLAB_HWCACHE_ALIGN 0x2000UL/* Align objs on cache lines */
Index: linux-2.6/mm/slub.c
===
--- linux-2.6.orig/mm/slub.c2007-09-24 20:34:14.0 -0700
+++ linux-2.6/mm/slub.c 2007-09-24 20:35:09.0 -0700
@@ -285,7 +285,7 @@ static inline int check_valid_pointer(st
if (!object)
return 1;
 
-   base = page_address(page);
+   base = page_to_addr(page);
if (object < base || object >= base + s->objects * s->size ||
(object - base) % s->size) {
return 0;
@@ -470,7 +470,7 @@ static void slab_fix(struct kmem_cache *
 static void print_trailer(struct kmem_cache *s, struct page *page, u8 *p)
 {
unsigned int off;   /* Offset of last byte */
-   u8 *addr = page_address(page);
+   u8 *addr = page_to_addr(page);
 
print_tracking(s, p);
 
@@ -648,7 +648,7 @@ static int slab_pad_check(struct kmem_ca
if (!(s->flags & SLAB_POISON))
return 1;
 
-   start = page_address(page);
+   start = page_to_addr(page);
end = start + (PAGE_SIZE << s->order);
length = s->objects * s->size;
remainder = end - (start + length);
@@ -1049,11 +1049,7 @@ static struct page *allocate_slab(struct
struct page * page;
int pages = 1 << s->order;
 
-   if (s->order)
-   flags |= __GFP_COMP;
-
-   if (s->flags & SLAB_CACHE_DMA)
-   flags |= SLUB_DMA;
+   flags |= s->gfpflags;
 
if (node == -1)
page = alloc_pages(flags, s->order);
@@ -1107,7 +1103,7 @@ static struct page *new_slab(struct kmem
SLAB_STORE_USER | SLAB_TRACE))
SetSlabDebug(page);
 
-   start = page_address(page);
+   start = page_to_addr(page);
end = start + s->objects * s->size;
 
if (unlikely(s->flags & SLAB_POISON))
@@ -1139,7 +1135,7 @@ static void __free_slab(struct kmem_cach
void *p;
 
slab_pad_check(s, page);
-   for_each_object(p, s, page_address(page))
+   for_each_object(p, s, page_to_addr(page))
check_object(s, page, p, 0);
ClearSlabDebug(page);
}
@@ -1789,10 +1785,9 @@ static inline int slab_order(int size, i
return order;
 }
 
-static inline int calculate_order(int size)
+static inline int calculate_order(int size, int min_objects, int max_order)
 {
int order;
-   int min_objects;
int fraction;
 
/*
@@ -1803,13 +1798,12 @@ static inline int calculate_order(int si
 * First we reduce the acceptable waste in a slab. Then
 * we reduce the minimum objects required in a slab.
 */
-   min_objects = slub_min_objects;
while (min_objects > 1) {
fraction = 8;
while (fraction >= 4) {
order = slab_order(size, min_objects,
-   slub_max_order, fraction);
-   if (order <= slub_max_order)
+   max_order, fraction);
+   if (order <= max_order)
return order;
fraction /= 2;

[13/17] Virtual compound page freeing in interrupt context

2007-09-25 Thread Christoph Lameter
If we are in an interrupt context then simply defer the free via a workqueue.

Removing a virtual mapping *must* be done with interrupts enabled
since tlb_xx functions are called that rely on interrupts for
processor to processor communications.
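
The deferral trick is that the memory being freed is itself reused to hold the bookkeeping for the deferred free, so no allocation is needed in the restricted context. A minimal single-threaded userspace sketch of that idea follows. It assumes every block is at least pointer-sized; a plain list and a flag stand in for the workqueue and in_interrupt(), and all demo_ names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* The first bytes of a block being freed are reused as the list node. */
struct demo_deferred { struct demo_deferred *next; };

static struct demo_deferred *demo_pending;	/* blocks waiting to be freed */
static int demo_in_restricted_context;		/* stand-in for in_interrupt() */

static void demo_free(void *block)
{
	if (demo_in_restricted_context) {
		struct demo_deferred *d = block;

		/* Cannot free here: queue the block for later. */
		d->next = demo_pending;
		demo_pending = d;
	} else {
		free(block);
	}
}

/* Called later from a context where freeing is allowed. Returns the
 * number of blocks released. */
static int demo_drain(void)
{
	int n = 0;

	while (demo_pending) {
		struct demo_deferred *d = demo_pending;

		demo_pending = d->next;
		free(d);
		n++;
	}
	return n;
}
```

The sketch leaves out locking entirely; the kernel version relies on the workqueue infrastructure for that.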

Signed-off-by: Christoph Lameter [EMAIL PROTECTED]

---
 mm/page_alloc.c |   23 ++-
 1 file changed, 22 insertions(+), 1 deletion(-)

Index: linux-2.6/mm/page_alloc.c
===
--- linux-2.6.orig/mm/page_alloc.c  2007-09-25 00:20:56.0 -0700
+++ linux-2.6/mm/page_alloc.c   2007-09-25 00:20:57.0 -0700
@@ -1294,7 +1294,12 @@ abort:
return NULL;
 }
 
-static void vcompound_free(void *addr)
+/*
+ * Virtual Compound freeing functions. This is complicated by the vmalloc
+ * layer not being able to free virtual allocations when interrupts are
+ * disabled. So we defer the frees via a workqueue if necessary.
+ */
+static void __vcompound_free(void *addr)
 {
struct page **pages;
int i;
@@ -1319,6 +1324,22 @@ static void vcompound_free(void *addr)
kfree(pages);
 }
 
+static void vcompound_free_work(struct work_struct *w)
+{
+   __vcompound_free((void *)w);
+}
+
+static noinline void vcompound_free(void *addr)
+{
+   if (in_interrupt()) {
+   struct work_struct *w = addr;
+
+   INIT_WORK(w, vcompound_free_work);
+   schedule_work(w);
+   } else
+   __vcompound_free(addr);
+}
+
 /*
  * This is the 'heart' of the zoned buddy allocator.
  */



[10/17] Use GFP_VFALLBACK for sparsemem.

2007-09-25 Thread Christoph Lameter
Sparsemem currently attempts first to do a physically contiguous mapping
and then falls back to vmalloc. The same thing can now be accomplished
using GFP_VFALLBACK.

Signed-off-by: Christoph Lameter [EMAIL PROTECTED]

---
 mm/sparse.c |   23 +++
 1 file changed, 3 insertions(+), 20 deletions(-)

Index: linux-2.6/mm/sparse.c
===
--- linux-2.6.orig/mm/sparse.c  2007-09-19 18:05:34.0 -0700
+++ linux-2.6/mm/sparse.c   2007-09-19 18:27:25.0 -0700
@@ -269,32 +269,15 @@ void __init sparse_init(void)
 #ifdef CONFIG_MEMORY_HOTPLUG
 static struct page *__kmalloc_section_memmap(unsigned long nr_pages)
 {
-   struct page *page, *ret;
unsigned long memmap_size = sizeof(struct page) * nr_pages;
 
-   page = alloc_pages(GFP_KERNEL|__GFP_NOWARN, get_order(memmap_size));
-   if (page)
-   goto got_map_page;
-
-   ret = vmalloc(memmap_size);
-   if (ret)
-   goto got_map_ptr;
-
-   return NULL;
-got_map_page:
-   ret = (struct page *)pfn_to_kaddr(page_to_pfn(page));
-got_map_ptr:
-   memset(ret, 0, memmap_size);
-
-   return ret;
+   return (struct page *)__get_free_pages(GFP_VFALLBACK,
+   get_order(memmap_size));
 }
 
 static void __kfree_section_memmap(struct page *memmap, unsigned long nr_pages)
 {
-   if (is_vmalloc_addr(memmap))
-   vfree(memmap);
-   else
-   free_pages((unsigned long)memmap,
+   free_pages((unsigned long)memmap,
   get_order(sizeof(struct page) * nr_pages));
 }
 



[09/17] VFALLBACK: Debugging aid

2007-09-25 Thread Christoph Lameter
Virtual fallbacks are rare and thus subtle bugs may creep in if we do not
test the fallbacks. CONFIG_VFALLBACK_ALWAYS makes all GFP_VFALLBACK
allocations fall back to virtual mapping.

Signed-off-by: Christoph Lameter [EMAIL PROTECTED]

---
 lib/Kconfig.debug |   11 +++
 mm/page_alloc.c   |6 ++
 2 files changed, 17 insertions(+)

Index: linux-2.6/mm/page_alloc.c
===
--- linux-2.6.orig/mm/page_alloc.c  2007-09-24 18:48:03.0 -0700
+++ linux-2.6/mm/page_alloc.c   2007-09-24 18:58:52.0 -0700
@@ -1208,6 +1208,12 @@ zonelist_scan:
}
}
 
+#ifdef CONFIG_VFALLBACK_ALWAYS
+   if ((gfp_mask & __GFP_VFALLBACK) &&
+   system_state == SYSTEM_RUNNING)
+   return vcompound_alloc(gfp_mask, order,
+   zonelist, alloc_flags);
+#endif
page = buffered_rmqueue(zonelist, zone, order, gfp_mask);
if (page)
break;
Index: linux-2.6/lib/Kconfig.debug
===
--- linux-2.6.orig/lib/Kconfig.debug2007-09-24 18:30:45.0 -0700
+++ linux-2.6/lib/Kconfig.debug 2007-09-24 18:48:06.0 -0700
@@ -105,6 +105,17 @@ config DETECT_SOFTLOCKUP
   can be detected via the NMI-watchdog, on platforms that
   support it.)
 
+config VFALLBACK_ALWAYS
+   bool "Always fall back to Virtual Compound pages"
+   default y
+   help
+ Virtual compound pages are only allocated if there is no linear
+ memory available. They are a fallback and errors created by the
+ use of virtual mappings instead of linear ones may not surface
+ because of their infrequent use. This option makes every
+ allocation that allows a fallback to a virtual mapping use
+ the virtual mapping. May have a significant performance impact.
+
 config SCHED_DEBUG
bool "Collect scheduler debugging info"
depends on DEBUG_KERNEL && PROC_FS



[12/17] Virtual Compound page allocation from interrupt context.

2007-09-25 Thread Christoph Lameter
In an interrupt context we cannot wait for the vmlist_lock in
__get_vm_area_node(). So use a trylock instead. If the trylock fails
then the atomic allocation will fail and subsequently be retried.
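
The locking policy can be sketched in userspace with a C11 atomic flag. This is only a model of the idea: the kernel uses an rwlock, and the demo_ names plus the spin loop are assumptions made for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_flag demo_lock = ATOMIC_FLAG_INIT;

/* Returns true once the lock is held. A caller that may wait spins
 * until the lock is free; one that may not wait tries exactly once
 * and reports failure so the allocation can be failed and retried. */
static bool demo_lock_for_alloc(bool may_wait)
{
	if (may_wait) {
		while (atomic_flag_test_and_set_explicit(&demo_lock,
							 memory_order_acquire))
			;	/* spin until the holder releases */
		return true;
	}
	return !atomic_flag_test_and_set_explicit(&demo_lock,
						  memory_order_acquire);
}

static void demo_unlock(void)
{
	atomic_flag_clear_explicit(&demo_lock, memory_order_release);
}
```

The non-waiting branch corresponds to the case where __GFP_WAIT is not set in gfp_mask.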

This only works because flush_cache_vunmap(), which is used on the
allocation path, never performs any IPIs, in contrast to the flush_tlb_...
functions used for freeing.  flush_cache_vunmap() is only used on
architectures with a virtually mapped cache (xtensa, PA-RISC).

[Note: Nick Piggin is working on a scheme to make this simpler by
no longer requiring flushes]

Signed-off-by: Christoph Lameter [EMAIL PROTECTED]

---
 mm/vmalloc.c |   10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

Index: linux-2.6/mm/vmalloc.c
===
--- linux-2.6.orig/mm/vmalloc.c 2007-09-24 16:03:49.0 -0700
+++ linux-2.6/mm/vmalloc.c  2007-09-24 16:04:32.0 -0700
@@ -289,7 +289,6 @@ static struct vm_struct *__get_vm_area_n
unsigned long align = 1;
unsigned long addr;
 
-   BUG_ON(in_interrupt());
if (flags & VM_IOREMAP) {
int bit = fls(size);
 
@@ -314,7 +313,14 @@ static struct vm_struct *__get_vm_area_n
 */
size += PAGE_SIZE;
 
-   write_lock(&vmlist_lock);
+   if (gfp_mask & __GFP_WAIT)
+   write_lock(&vmlist_lock);
+   else {
+   if (!write_trylock(&vmlist_lock)) {
+   kfree(area);
+   return NULL;
+   }
+   }
for (p = &vmlist; (tmp = *p) != NULL ;p = &tmp->next) {
if ((unsigned long)tmp->addr < addr) {
if((unsigned long)tmp->addr + tmp->size >= addr)



[08/17] GFP_VFALLBACK: Allow fallback of compound pages to virtual mappings

2007-09-25 Thread Christoph Lameter
Add a new gfp flag

__GFP_VFALLBACK

If specified during a higher order allocation then the system will fall
back to vmap and attempt to create a virtually contiguous area instead of
a physically contiguous area. In many cases the virtually contiguous area
can stand in for the physically contiguous area (with some loss of
performance).

The pages used for VFALLBACK are marked with a new flag
PageVcompound(page). The mark is necessary since we have to know upon
free if we have to destroy a virtual mapping. No additional flag is
consumed through the use of PG_swapcache together with PG_compound
(similar to PageHead() and PageTail()).
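
The flag encoding can be illustrated like this. The bit positions and helper names below are hypothetical, and the sketch assumes the test is "both bits set", which is how the head/tail encoding it is compared to works.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit positions are made up; only the combination matters here. */
#define DEMO_PG_COMPOUND	(1UL << 0)
#define DEMO_PG_SWAPCACHE	(1UL << 1)
#define DEMO_VCOMPOUND_MASK	(DEMO_PG_COMPOUND | DEMO_PG_SWAPCACHE)

/* A page counts as virtual compound only if both bits are set. */
static bool demo_page_vcompound(unsigned long flags)
{
	return (flags & DEMO_VCOMPOUND_MASK) == DEMO_VCOMPOUND_MASK;
}

/* Setting the mark also sets the compound bit as a side effect. */
static unsigned long demo_set_vcompound(unsigned long flags)
{
	return flags | DEMO_VCOMPOUND_MASK;
}
```

Either bit on its own keeps its existing meaning, so no new flag bit is consumed.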

Also add a new function

compound_nth_page(page, n)

to find the nth page of a compound page. For real compound pages
this simply reduces to page + n. For virtual compound pages we need to consult
the page tables to figure out the nth page.
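
A rough userspace model of the two cases follows. The struct layout and names are invented for illustration, and an array of page pointers stands in for the page table lookup that the kernel version needs.

```c
#include <assert.h>
#include <stddef.h>

struct demo_page { int id; };

struct demo_compound {
	struct demo_page *head;		/* first page */
	struct demo_page **vpages;	/* per-page lookup, NULL if contiguous */
	unsigned int nr_pages;
};

static struct demo_page *demo_nth_page(const struct demo_compound *c,
				       unsigned int n)
{
	assert(n < c->nr_pages);
	if (c->vpages)
		return c->vpages[n];	/* stand-in for the page table walk */
	return c->head + n;		/* contiguous: pointer arithmetic */
}
```

Callers such as prep_compound_page() can then iterate over tail pages without knowing which kind of compound page they were handed.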

Add new functions to convert a page to an address and vice versa.

struct page *addr_to_page(const void *address);
void *page_to_addr(struct page *);

The new conversion functions allow the conversion of vmalloc areas to
the corresponding page structs that back them and vice versa. If the
address or the page struct is not part of a vmalloc area then they
fall back to virt_to_page() and page_address().

Signed-off-by: Christoph Lameter [EMAIL PROTECTED]

---
 include/linux/gfp.h|5 +
 include/linux/mm.h |   33 +--
 include/linux/page-flags.h |   18 ++
 mm/page_alloc.c|  131 -
 mm/vmalloc.c   |   10 +++
 5 files changed, 179 insertions(+), 18 deletions(-)

Index: linux-2.6/mm/page_alloc.c
===
--- linux-2.6.orig/mm/page_alloc.c  2007-09-25 10:22:16.0 -0700
+++ linux-2.6/mm/page_alloc.c   2007-09-25 10:22:36.0 -0700
@@ -60,6 +60,8 @@ long nr_swap_pages;
 int percpu_pagelist_fraction;
 
 static void __free_pages_ok(struct page *page, unsigned int order);
+static struct page *vcompound_alloc(gfp_t, int,
+   struct zonelist *, unsigned long);
 
 /*
  * results with 256, 32 in the lowmem_reserve sysctl:
@@ -251,7 +253,7 @@ static void prep_compound_page(struct pa
set_compound_order(page, order);
__SetPageHead(page);
for (i = 1; i < nr_pages; i++) {
-   struct page *p = page + i;
+   struct page *p = compound_nth_page(page, i);
 
__SetPageTail(p);
p->first_page = page;
@@ -266,17 +268,23 @@ static void destroy_compound_page(struct
if (unlikely(compound_order(page) != order))
bad_page(page);
 
-   if (unlikely(!PageHead(page)))
-   bad_page(page);
-   __ClearPageHead(page);
for (i = 1; i < nr_pages; i++) {
-   struct page *p = page + i;
+   struct page *p = compound_nth_page(page,  i);
 
if (unlikely(!PageTail(p) |
(p->first_page != page)))
bad_page(page);
__ClearPageTail(p);
}
+
+   /*
+* The PageHead is important since it determines how operations on
+* a compound page have to be performed. We can only tear the head
+* down after all the tail pages are done.
+*/
+   if (unlikely(!PageHead(page)))
+   bad_page(page);
+   __ClearPageHead(page);
 }
 
 static inline void prep_zero_page(struct page *page, int order, gfp_t gfp_flags)
@@ -1230,6 +1238,82 @@ try_next_zone:
 }
 
 /*
+ * Virtual Compound Page support.
+ *
+ * Virtual Compound Pages are used to fall back to order 0 allocations if large
+ * linear mappings are not available and __GFP_VFALLBACK is set. They are
+ * formatted according to compound page conventions. I.e. following
+ * page->first_page if PageTail(page) is set can be used to determine the
+ * head page.
+ */
+static noinline struct page *vcompound_alloc(gfp_t gfp_mask, int order,
+   struct zonelist *zonelist, unsigned long alloc_flags)
+{
+   void *addr;
+   struct page *page;
+   int i;
+   int nr_pages = 1 << order;
+   struct page **pages = kmalloc(nr_pages * sizeof(struct page *),
+   gfp_mask & GFP_LEVEL_MASK);
+
+   if (!pages)
+   return NULL;
+
+   for (i = 0; i < nr_pages; i++) {
+   page = get_page_from_freelist(gfp_mask & ~__GFP_VFALLBACK,
+   0, zonelist, alloc_flags);
+   if (!page)
+   goto abort;
+
+   /* Sets PageCompound which makes PageHead(page) true */
+   __SetPageVcompound(page);
+   pages[i] = page;
+   }
+   addr = vmap(pages, nr_pages, VM_MAP, PAGE_KERNEL);
+   if (!addr)
+   goto abort;
+
+   prep_compound_page(pages[0], order);
+  

Re: [patch 2/4] ext2: fix rec_len overflow for 64KB block size

2007-09-25 Thread Andreas Dilger
On Sep 25, 2007  16:30 -0700, Christoph Lameter wrote:
 [2/4]  ext2: fix rec_len overflow
  - prevent rec_len from overflow with 64KB blocksize
 
 Signed-off-by: Takashi Sato [EMAIL PROTECTED]
 Signed-off-by: Mingming Cao [EMAIL PROTECTED]
 Signed-off-by: Christoph Lameter [EMAIL PROTECTED]

Note that we just got a cleaner implementation of this code on the ext4
mailing list from Jan Kara yesterday.  Please use that one instead; it is in
the thread "Avoid rec_len overflow with 64KB block size".

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.



[GIT PULL -mm] 00/25 Unionfs updates/cleanups/fixes

2007-09-25 Thread Erez Zadok

The following is a series of patches related to Unionfs.  Aside from a few
minor cleanups/fixes, the two main changes are (1) lower nameidata support
so we can stack on nfsv4, and (2) un/likely optimizations.  These patches
were tested (where appropriate) on our 2.6.23-rc8 latest code, as well as
the backports to 2.6.{22,21,20,19,18,9} on ext2/3/4, xfs, reiserfs,
nfs2/3/4, jffs2, ramfs, tmpfs, cramfs, and squashfs (where available).  See
http://unionfs.filesystems.org/ to download backported unionfs code.

Please pull from the 'master' branch of
git://git.kernel.org/pub/scm/linux/kernel/git/ezk/unionfs.git

to receive the following:

Erez Zadok (22):
  Unionfs: display informational messages only if debug is on
  Unionfs: cast page->index loff_t before shifting
  Unionfs: minor coding style updates
  Unionfs: add lower nameidata debugging support
  Unionfs: lower nameidata support for nfsv4
  Unionfs: add un/likely conditionals on common fileops
  Unionfs: add un/likely conditionals on copyup ops
  Unionfs: add un/likely conditionals on debug ops
  Unionfs: add un/likely conditionals on dentry ops
  Unionfs: add un/likely conditionals on dir ops
  Unionfs: add un/likely conditionals on headers
  Unionfs: add un/likely conditionals on fileops
  Unionfs: add un/likely conditionals on inode ops
  Unionfs: add un/likely conditionals on lookup ops
  Unionfs: add un/likely conditionals on super ops
  Unionfs: add un/likely conditionals on mmap ops
  Unionfs: add un/likely conditionals on rename ops
  Unionfs: add un/likely conditionals on readdir ops
  Unionfs: add un/likely conditionals on common subr
  Unionfs: add un/likely conditionals on unlink ops
  Unionfs: add un/likely conditionals on xattr ops
  Unionfs: use poison.h for safe poison pointers

Josef 'Jeff' Sipek (2):
  Unionfs: Simplify unionfs_get_nlinks
  Unionfs: Remove unused #defines

Olivier Blin (1):
  Unionfs: cache-coherency fixes

 commonfops.c |   98 +++
 copyup.c |  102 
 debug.c  |  140 ++--
 dentry.c |   87 +++
 dirfops.c|   22 +++---
 dirhelper.c  |   30 -
 fanout.h |   13 ++--
 file.c   |   38 ++--
 inode.c  |  186 +++
 lookup.c |   60 +++
 main.c   |  102 
 mmap.c   |   33 +-
 rdstate.c|   15 ++--
 rename.c |   96 +++---
 sioq.c   |4 -
 subr.c   |   67 ++---
 super.c  |   90 ++--
 union.h  |   19 +++---
 unlink.c |   32 +-
 xattr.c  |   12 +--
 20 files changed, 647 insertions(+), 599 deletions(-)

---
Erez Zadok
[EMAIL PROTECTED]


[PATCH 05/25] Unionfs: cast page->index loff_t before shifting

2007-09-25 Thread Erez Zadok
Fixes bugs in number promotion/demotion computation, as per
http://lkml.org/lkml/2007/9/20/17
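
To make the promotion problem concrete, here is a small userspace sketch. It assumes a 32-bit page index (as on a 32-bit kernel) and 4 KiB pages; the demo_ names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Shift performed in 32 bits and widened afterwards: the high bits
 * are already gone by the time the value becomes 64-bit. */
static int64_t demo_offset_narrow(uint32_t index)
{
	return (int64_t)(uint32_t)(index << DEMO_PAGE_SHIFT);
}

/* Widen first, then shift: the full file offset is preserved. */
static int64_t demo_offset_wide(uint32_t index)
{
	return (int64_t)index << DEMO_PAGE_SHIFT;
}
```

For index 0x100000 (the page at the 4 GiB mark) the narrow version yields offset 0 while the wide version yields 0x100000000, so the bug only shows up on files past 4 GiB.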

Signed-off-by: Erez Zadok [EMAIL PROTECTED]
Acked-by: Josef 'Jeff' Sipek [EMAIL PROTECTED]
---
 fs/unionfs/mmap.c |5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/fs/unionfs/mmap.c b/fs/unionfs/mmap.c
index 88ef6a6..37af979 100644
--- a/fs/unionfs/mmap.c
+++ b/fs/unionfs/mmap.c
@@ -179,7 +179,8 @@ static int unionfs_do_readpage(struct file *file, struct page *page)
 * may be a little slower, but a lot safer, as the VFS does a lot of
 * the necessary magic for us.
 */
-   offset = lower_file->f_pos = (page->index << PAGE_CACHE_SHIFT);
+   offset = lower_file->f_pos =
+   ((loff_t) page->index << PAGE_CACHE_SHIFT);
old_fs = get_fs();
set_fs(KERNEL_DS);
err = vfs_read(lower_file, page_data, PAGE_CACHE_SIZE,
@@ -289,7 +290,7 @@ static int unionfs_commit_write(struct file *file, struct 
page *page,
BUG_ON(lower_file == NULL);
 
page_data = (char *)kmap(page);
-   lower_file->f_pos = (page->index << PAGE_CACHE_SHIFT) + from;
+   lower_file->f_pos = ((loff_t) page->index << PAGE_CACHE_SHIFT) + from;
 
/*
 * SP: I use vfs_write instead of copying page data and the
-- 
1.5.2.2
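
The promotion problem addressed by this patch can be reproduced in user
space. The following is only a sketch, assuming a 32-bit page index (as
pgoff_t would be on a 32-bit build) and 4 KiB pages; SKETCH_PAGE_SHIFT
and both function names are made up for illustration and are not part of
unionfs.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/*
 * Shift carried out in the index's own 32-bit type: any bits shifted
 * past bit 31 are discarded before the result is widened.
 */
static int64_t sketch_offset_without_cast(uint32_t index)
{
	uint32_t shifted = index << SKETCH_PAGE_SHIFT;

	return (int64_t)shifted;
}

/*
 * Widen first, then shift, which is what the (loff_t) cast in the
 * patch achieves: the full 64-bit file offset is preserved.
 */
static int64_t sketch_offset_with_cast(uint32_t index)
{
	return (int64_t)index << SKETCH_PAGE_SHIFT;
}
```

For small indices both helpers agree; they diverge once the byte offset
no longer fits in 32 bits, i.e. for pages at or beyond the 4 GiB mark.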



[PATCH 24/25] Unionfs: add un/likely conditionals on xattr ops

2007-09-25 Thread Erez Zadok
Signed-off-by: Erez Zadok [EMAIL PROTECTED]
---
 fs/unionfs/xattr.c |   12 ++--
 1 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/fs/unionfs/xattr.c b/fs/unionfs/xattr.c
index 7f77d7d..bd2de06 100644
--- a/fs/unionfs/xattr.c
+++ b/fs/unionfs/xattr.c
@@ -23,14 +23,14 @@ void *unionfs_xattr_alloc(size_t size, size_t limit)
 {
void *ptr;
 
-   if (size > limit)
+   if (unlikely(size > limit))
return ERR_PTR(-E2BIG);
 
if (!size)  /* size request, no buffer is needed */
return NULL;
 
ptr = kmalloc(size, GFP_KERNEL);
-   if (!ptr)
+   if (unlikely(!ptr))
return ERR_PTR(-ENOMEM);
return ptr;
 }
@@ -48,7 +48,7 @@ ssize_t unionfs_getxattr(struct dentry *dentry, const char *name, void *value,
unionfs_read_lock(dentry->d_sb);
unionfs_lock_dentry(dentry);
 
-   if (!__unionfs_d_revalidate_chain(dentry, NULL, false)) {
+   if (unlikely(!__unionfs_d_revalidate_chain(dentry, NULL, false))) {
err = -ESTALE;
goto out;
}
@@ -77,7 +77,7 @@ int unionfs_setxattr(struct dentry *dentry, const char *name,
unionfs_read_lock(dentry->d_sb);
unionfs_lock_dentry(dentry);
 
-   if (!__unionfs_d_revalidate_chain(dentry, NULL, false)) {
+   if (unlikely(!__unionfs_d_revalidate_chain(dentry, NULL, false))) {
err = -ESTALE;
goto out;
}
@@ -106,7 +106,7 @@ int unionfs_removexattr(struct dentry *dentry, const char *name)
unionfs_read_lock(dentry->d_sb);
unionfs_lock_dentry(dentry);
 
-   if (!__unionfs_d_revalidate_chain(dentry, NULL, false)) {
+   if (unlikely(!__unionfs_d_revalidate_chain(dentry, NULL, false))) {
err = -ESTALE;
goto out;
}
@@ -135,7 +135,7 @@ ssize_t unionfs_listxattr(struct dentry *dentry, char *list, size_t size)
unionfs_read_lock(dentry->d_sb);
unionfs_lock_dentry(dentry);
 
-   if (!__unionfs_d_revalidate_chain(dentry, NULL, false)) {
+   if (unlikely(!__unionfs_d_revalidate_chain(dentry, NULL, false))) {
err = -ESTALE;
goto out;
}
-- 
1.5.2.2
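
For readers unfamiliar with the annotations used throughout this series:
likely() and unlikely() are branch-prediction hints that, with GCC and
Clang, are typically built on __builtin_expect(). The sketch below is a
user-space approximation of the unionfs_xattr_alloc() pattern; the
sketch_* names and the integer return codes are illustrative assumptions,
not the kernel's definitions.

```c
#include <stddef.h>
#include <stdlib.h>

/* Assumed user-space stand-ins for the kernel's likely()/unlikely(). */
#define sketch_likely(x)	__builtin_expect(!!(x), 1)
#define sketch_unlikely(x)	__builtin_expect(!!(x), 0)

/*
 * Returns 0 on success, -1 if size exceeds limit, -2 if the allocation
 * fails.  *out is NULL when no buffer is needed or on error.
 */
static int sketch_alloc(size_t size, size_t limit, void **out)
{
	*out = NULL;

	if (sketch_unlikely(size > limit))
		return -1;	/* error path: hinted as rare */
	if (!size)
		return 0;	/* size request, no buffer is needed */

	*out = malloc(size);
	if (sketch_unlikely(!*out))
		return -2;	/* allocation failure: hinted as rare */
	return 0;
}
```

The hints never change what the code computes; they only influence how
the compiler lays out the two sides of the branch.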



[PATCH 14/25] Unionfs: add un/likely conditionals on headers

2007-09-25 Thread Erez Zadok
Signed-off-by: Erez Zadok [EMAIL PROTECTED]
---
 fs/unionfs/fanout.h |   13 -
 fs/unionfs/union.h  |4 ++--
 2 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/fs/unionfs/fanout.h b/fs/unionfs/fanout.h
index 51aa0de..6405399 100644
--- a/fs/unionfs/fanout.h
+++ b/fs/unionfs/fanout.h
@@ -308,17 +308,20 @@ static inline void unionfs_copy_attr_times(struct inode *upper)
int bindex;
struct inode *lower;
 
-   if (!upper || ibstart(upper) < 0)
+   if (unlikely(!upper || ibstart(upper) < 0))
return;
for (bindex=ibstart(upper); bindex <= ibend(upper); bindex++) {
lower = unionfs_lower_inode_idx(upper, bindex);
-   if (!lower)
+   if (unlikely(!lower))
continue; /* not all lower dir objects may exist */
-   if (timespec_compare(&upper->i_mtime, &lower->i_mtime) < 0)
+   if (unlikely(timespec_compare(&upper->i_mtime,
+ &lower->i_mtime) < 0))
upper->i_mtime = lower->i_mtime;
-   if (timespec_compare(&upper->i_ctime, &lower->i_ctime) < 0)
+   if (likely(timespec_compare(&upper->i_ctime,
+   &lower->i_ctime) < 0))
upper->i_ctime = lower->i_ctime;
-   if (timespec_compare(&upper->i_atime, &lower->i_atime) < 0)
+   if (likely(timespec_compare(&upper->i_atime,
+   &lower->i_atime) < 0))
upper->i_atime = lower->i_atime;
}
 }
diff --git a/fs/unionfs/union.h b/fs/unionfs/union.h
index d27844d..8df44a9 100644
--- a/fs/unionfs/union.h
+++ b/fs/unionfs/union.h
@@ -472,7 +472,7 @@ static inline struct vfsmount *unionfs_mntget(struct dentry *dentry,
 
mnt = mntget(unionfs_lower_mnt_idx(dentry, bindex));
 #ifdef CONFIG_UNION_FS_DEBUG
-   if (!mnt)
+   if (unlikely(!mnt))
printk(KERN_DEBUG "unionfs_mntget: mnt=%p bindex=%d\n",
   mnt, bindex);
 #endif /* CONFIG_UNION_FS_DEBUG */
@@ -484,7 +484,7 @@ static inline void unionfs_mntput(struct dentry *dentry, int bindex)
 {
struct vfsmount *mnt;
 
-   if (!dentry && bindex < 0)
+   if (unlikely(!dentry && bindex < 0))
return;
BUG_ON(!dentry || bindex < 0);
 
-- 
1.5.2.2
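
The loop in unionfs_copy_attr_times() amounts to "keep the newest
timestamp seen on any lower branch". A stand-alone sketch of that idea
follows, assuming the comparison helper behaves like the kernel's
timespec_compare() (negative, zero, or positive); the sketch_* names are
hypothetical and the real function additionally skips missing lower
inodes.

```c
#include <time.h>

/* Returns <0, 0 or >0, mirroring the assumed timespec_compare() contract. */
static int sketch_ts_cmp(const struct timespec *a, const struct timespec *b)
{
	if (a->tv_sec != b->tv_sec)
		return a->tv_sec < b->tv_sec ? -1 : 1;
	if (a->tv_nsec != b->tv_nsec)
		return a->tv_nsec < b->tv_nsec ? -1 : 1;
	return 0;
}

/* Raise *upper to the newest of the n lower timestamps, if any is newer. */
static void sketch_take_latest(struct timespec *upper,
			       const struct timespec *lower, int n)
{
	int i;

	for (i = 0; i < n; i++)
		if (sketch_ts_cmp(upper, &lower[i]) < 0)
			*upper = lower[i];
}
```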



[PATCH 03/25] Unionfs: display informational messages only if debug is on

2007-09-25 Thread Erez Zadok
This is to avoid filling the console/logs with messages that are primarily
of debugging use.

Signed-off-by: Erez Zadok [EMAIL PROTECTED]
Acked-by: Josef 'Jeff' Sipek [EMAIL PROTECTED]
---
 fs/unionfs/commonfops.c |4 ++--
 fs/unionfs/dentry.c |6 +++---
 fs/unionfs/union.h  |4 
 3 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/fs/unionfs/commonfops.c b/fs/unionfs/commonfops.c
index 87cbb09..e69ccf6 100644
--- a/fs/unionfs/commonfops.c
+++ b/fs/unionfs/commonfops.c
@@ -394,8 +394,8 @@ int unionfs_file_revalidate(struct file *file, bool willwrite)
if (willwrite && IS_WRITE_FLAG(file->f_flags) &&
!IS_WRITE_FLAG(unionfs_lower_file(file)->f_flags) &&
is_robranch(dentry)) {
-   printk(KERN_DEBUG "unionfs: do delay copyup of \"%s\"\n",
-  dentry->d_name.name);
+   dprintk(KERN_DEBUG "unionfs: do delay copyup of \"%s\"\n",
+   dentry->d_name.name);
err = do_delayed_copyup(file);
}
 
diff --git a/fs/unionfs/dentry.c b/fs/unionfs/dentry.c
index 9e0742d..08b5722 100644
--- a/fs/unionfs/dentry.c
+++ b/fs/unionfs/dentry.c
@@ -46,9 +46,9 @@ static bool __unionfs_d_revalidate_one(struct dentry *dentry,
 
/* if the dentry is unhashed, do NOT revalidate */
if (d_deleted(dentry)) {
-   printk(KERN_DEBUG "unionfs: unhashed dentry being "
-  "revalidated: %*s\n",
-  dentry->d_name.len, dentry->d_name.name);
+   dprintk(KERN_DEBUG "unionfs: unhashed dentry being "
+   "revalidated: %*s\n",
+   dentry->d_name.len, dentry->d_name.name);
goto out;
}
 
diff --git a/fs/unionfs/union.h b/fs/unionfs/union.h
index 140b8ae..5e9843b 100644
--- a/fs/unionfs/union.h
+++ b/fs/unionfs/union.h
@@ -507,6 +507,8 @@ static inline void unionfs_mntput(struct dentry *dentry, int bindex)
 
 #ifdef CONFIG_UNION_FS_DEBUG
 
+#define dprintk(args...)   printk(args)
+
 /* useful for tracking code reachability */
 #define UDBG printk("DBG:%s:%s:%d\n",__FILE__,__FUNCTION__,__LINE__)
 
@@ -543,6 +545,8 @@ extern void __show_inode_counts(const struct inode *inode,
 
 #else /* not CONFIG_UNION_FS_DEBUG */
 
+#define dprintk(args...)   do { } while (0)
+
 /* we leave useful hooks for these check functions throughout the code */
 #define unionfs_check_inode(i) do { } while(0)
 #define unionfs_check_dentry(d)do { } while(0)
-- 
1.5.2.2
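
The dprintk() definitions added here follow the usual "compile the
message out entirely" idiom. A user-space sketch is given below, assuming
a hypothetical SKETCH_DEBUG switch in place of CONFIG_UNION_FS_DEBUG and
fprintf() in place of printk(). One consequence worth keeping in mind:
when the macro expands to nothing, its arguments are never evaluated, so
they should not carry side effects.

```c
#include <stdio.h>

#ifdef SKETCH_DEBUG
#define sketch_dprintk(...)	fprintf(stderr, __VA_ARGS__)
#else
#define sketch_dprintk(...)	do { } while (0)
#endif

static int sketch_calls;	/* counts evaluations of the argument below */

static int sketch_count(void)
{
	return ++sketch_calls;
}

/*
 * Returns how many times the macro argument was evaluated: 1 when built
 * with -DSKETCH_DEBUG, 0 otherwise, because the call vanishes along
 * with the message.
 */
static int sketch_demo(void)
{
	sketch_calls = 0;
	sketch_dprintk("value=%d\n", sketch_count());
	return sketch_calls;
}
```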



[PATCH 21/25] Unionfs: add un/likely conditionals on readdir ops

2007-09-25 Thread Erez Zadok
Signed-off-by: Erez Zadok [EMAIL PROTECTED]
---
 fs/unionfs/rdstate.c |   15 ---
 1 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/fs/unionfs/rdstate.c b/fs/unionfs/rdstate.c
index 0a18d5c..7ec7f95 100644
--- a/fs/unionfs/rdstate.c
+++ b/fs/unionfs/rdstate.c
@@ -45,7 +45,7 @@ int unionfs_init_filldir_cache(void)
 
 void unionfs_destroy_filldir_cache(void)
 {
-   if (unionfs_filldir_cachep)
+   if (likely(unionfs_filldir_cachep))
kmem_cache_destroy(unionfs_filldir_cachep);
 }
 
@@ -72,7 +72,8 @@ static int guesstimate_hash_size(struct inode *inode)
return UNIONFS_I(inode)->hashsize;
 
for (bindex = ibstart(inode); bindex <= ibend(inode); bindex++) {
-   if (!(lower_inode = unionfs_lower_inode_idx(inode, bindex)))
+   lower_inode = unionfs_lower_inode_idx(inode, bindex);
+   if (unlikely(!lower_inode))
continue;
 
if (lower_inode->i_size == DENTPAGE)
@@ -136,7 +137,7 @@ struct unionfs_dir_state *alloc_rdstate(struct inode *inode, int bindex)
sizeof(struct list_head);
 
rdstate = kmalloc(mallocsize, GFP_KERNEL);
-   if (!rdstate)
+   if (unlikely(!rdstate))
return NULL;
 
spin_lock(&UNIONFS_I(inode)->rdlock);
@@ -217,7 +218,7 @@ struct filldir_node *find_filldir_node(struct unionfs_dir_state *rdstate,
 * if the duplicate is in this branch, then the file
 * system is corrupted.
 */
-   if (cursor->bindex == rdstate->bindex) {
+   if (unlikely(cursor->bindex == rdstate->bindex)) {
printk(KERN_DEBUG "unionfs: filldir: possible "
   "I/O error: a file is duplicated "
   "in the same branch %d: %s\n",
@@ -227,7 +228,7 @@ struct filldir_node *find_filldir_node(struct unionfs_dir_state *rdstate,
}
}
 
-   if (!found)
+   if (unlikely(!found))
cursor = NULL;
 
return cursor;
@@ -249,7 +250,7 @@ int add_filldir_node(struct unionfs_dir_state *rdstate, const char *name,
head = &(rdstate->list[index]);
 
new = kmem_cache_alloc(unionfs_filldir_cachep, GFP_KERNEL);
-   if (!new) {
+   if (unlikely(!new)) {
err = -ENOMEM;
goto out;
}
@@ -264,7 +265,7 @@ int add_filldir_node(struct unionfs_dir_state *rdstate, const char *name,
new->name = new->iname;
else {
new->name = kmalloc(namelen + 1, GFP_KERNEL);
-   if (!new->name) {
+   if (unlikely(!new->name)) {
kmem_cache_free(unionfs_filldir_cachep, new);
new = NULL;
goto out;
-- 
1.5.2.2



[PATCH 19/25] Unionfs: add un/likely conditionals on mmap ops

2007-09-25 Thread Erez Zadok
Signed-off-by: Erez Zadok [EMAIL PROTECTED]
---
 fs/unionfs/mmap.c |   28 ++--
 1 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/fs/unionfs/mmap.c b/fs/unionfs/mmap.c
index 37af979..1cea075 100644
--- a/fs/unionfs/mmap.c
+++ b/fs/unionfs/mmap.c
@@ -84,7 +84,7 @@ static int unionfs_writepage(struct page *page, struct writeback_control *wbc)
 * resort to RAIF's page pointer flipping trick.)
 */
lower_page = find_lock_page(lower_inode->i_mapping, page->index);
-   if (!lower_page) {
+   if (unlikely(!lower_page)) {
err = AOP_WRITEPAGE_ACTIVATE;
set_page_dirty(page);
goto out;
@@ -102,7 +102,7 @@ static int unionfs_writepage(struct page *page, struct writeback_control *wbc)
BUG_ON(!lower_inode->i_mapping->a_ops->writepage);
 
/* workaround for some lower file systems: see big comment on top */
-   if (wbc->for_writepages && !wbc->fs_private)
+   if (unlikely(wbc->for_writepages && !wbc->fs_private))
wbc->for_writepages = 0;
 
/* call lower writepage (expects locked page) */
@@ -111,12 +111,12 @@ static int unionfs_writepage(struct page *page, struct writeback_control *wbc)
wbc->for_writepages = saved_for_writepages; /* restore value */
 
/* b/c find_lock_page locked it and ->writepage unlocks on success */
-   if (err)
+   if (unlikely(err))
unlock_page(lower_page);
/* b/c grab_cache_page increased refcnt */
page_cache_release(lower_page);
 
-   if (err < 0) {
+   if (unlikely(err < 0)) {
ClearPageUptodate(page);
goto out;
}
@@ -160,7 +160,7 @@ static int unionfs_do_readpage(struct file *file, struct page *page)
char *page_data = NULL;
loff_t offset;
 
-   if (!UNIONFS_F(file)) {
+   if (unlikely(!UNIONFS_F(file))) {
err = -ENOENT;
goto out;
}
@@ -189,7 +189,7 @@ static int unionfs_do_readpage(struct file *file, struct page *page)
 
kunmap(page);
 
-   if (err < 0)
+   if (unlikely(err < 0))
goto out;
err = 0;
 
@@ -199,7 +199,7 @@ static int unionfs_do_readpage(struct file *file, struct page *page)
flush_dcache_page(page);
 
 out:
-   if (err == 0)
+   if (likely(err == 0))
SetPageUptodate(page);
else
ClearPageUptodate(page);
@@ -212,13 +212,13 @@ static int unionfs_readpage(struct file *file, struct page *page)
int err;
 
unionfs_read_lock(file->f_path.dentry->d_sb);
-   if ((err = unionfs_file_revalidate(file, false)))
+   if (unlikely((err = unionfs_file_revalidate(file, false))))
goto out;
unionfs_check_file(file);
 
err = unionfs_do_readpage(file, page);
 
-   if (!err) {
+   if (likely(!err)) {
touch_atime(unionfs_lower_mnt(file->f_path.dentry),
unionfs_lower_dentry(file->f_path.dentry));
unionfs_copy_attr_times(file->f_path.dentry->d_inode);
@@ -276,14 +276,14 @@ static int unionfs_commit_write(struct file *file, struct page *page,
BUG_ON(file == NULL);
 
unionfs_read_lock(file->f_path.dentry->d_sb);
-   if ((err = unionfs_file_revalidate(file, true)))
+   if (unlikely((err = unionfs_file_revalidate(file, true))))
goto out;
unionfs_check_file(file);
 
inode = page->mapping->host;
lower_inode = unionfs_lower_inode(inode);
 
-   if (UNIONFS_F(file) != NULL)
+   if (likely(UNIONFS_F(file) != NULL))
lower_file = unionfs_lower_file(file);
 
/* FIXME: is this assertion right here? */
@@ -307,7 +307,7 @@ static int unionfs_commit_write(struct file *file, struct page *page,
 
kunmap(page);
 
-   if (err < 0)
+   if (unlikely(err < 0))
goto out;
 
inode->i_blocks = lower_inode->i_blocks;
@@ -320,7 +320,7 @@ static int unionfs_commit_write(struct file *file, struct page *page,
mark_inode_dirty_sync(inode);
 
 out:
-   if (err < 0)
+   if (unlikely(err < 0))
ClearPageUptodate(page);
 
unionfs_read_unlock(file->f_path.dentry->d_sb);
@@ -347,7 +347,7 @@ static void unionfs_sync_page(struct page *page)
 * do is ensure that pending I/O gets done.
 */
lower_page = find_lock_page(lower_inode->i_mapping, page->index);
-   if (!lower_page) {
+   if (unlikely(!lower_page)) {
printk(KERN_DEBUG "unionfs: find_lock_page failed\n");
goto out;
}
-- 
1.5.2.2



[PATCH 08/25] Unionfs: lower nameidata support for nfsv4

2007-09-25 Thread Erez Zadok
Pass nameidata structures as needed to the lower file system, support
LOOKUP_ACCESS/OPEN intents.  This makes unionfs work on top of nfsv4.

Signed-off-by: Erez Zadok [EMAIL PROTECTED]
Acked-by: Josef 'Jeff' Sipek [EMAIL PROTECTED]
---
 fs/unionfs/dentry.c |   11 +--
 fs/unionfs/inode.c  |8 +++-
 fs/unionfs/lookup.c |   20 +---
 3 files changed, 33 insertions(+), 6 deletions(-)

diff --git a/fs/unionfs/dentry.c b/fs/unionfs/dentry.c
index b21f1e3..52bcb18 100644
--- a/fs/unionfs/dentry.c
+++ b/fs/unionfs/dentry.c
@@ -156,8 +156,15 @@ static bool __unionfs_d_revalidate_one(struct dentry *dentry,
if (!lower_dentry || !lower_dentry->d_op
|| !lower_dentry->d_op->d_revalidate)
continue;
-   if (!lower_dentry->d_op->d_revalidate(lower_dentry,
- &lowernd))
+   /*
+* Don't pass nameidata to lower file system, because we
+* don't want an arbitrary lower file being opened or
+* returned to us: it may be useless to us because of the
+* fanout nature of unionfs (cf. file/directory open-file
+* invariants).  We will open lower files as and when needed
+* later on.
+*/
+   if (!lower_dentry->d_op->d_revalidate(lower_dentry, NULL))
valid = false;
}
 
diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c
index f8b2c88..7ee4760 100644
--- a/fs/unionfs/inode.c
+++ b/fs/unionfs/inode.c
@@ -27,6 +27,7 @@ static int unionfs_create(struct inode *parent, struct dentry *dentry,
struct dentry *lower_parent_dentry = NULL;
char *name = NULL;
int valid = 0;
+   struct nameidata lower_nd;
 
unionfs_read_lock(dentry->d_sb);
unionfs_lock_dentry(dentry);
@@ -113,7 +114,12 @@ static int unionfs_create(struct inode *parent, struct dentry *dentry,
goto out;
}
 
-   err = vfs_create(lower_parent_dentry->d_inode, lower_dentry, mode, nd);
+   err = init_lower_nd(&lower_nd, LOOKUP_CREATE);
+   if (err < 0)
+   goto out;
+   err = vfs_create(lower_parent_dentry->d_inode, lower_dentry, mode,
+    &lower_nd);
+   release_lower_nd(&lower_nd, err);
 
if (!err) {
err = PTR_ERR(unionfs_interpose(dentry, parent->i_sb, 0));
diff --git a/fs/unionfs/lookup.c b/fs/unionfs/lookup.c
index 963d622..2109714 100644
--- a/fs/unionfs/lookup.c
+++ b/fs/unionfs/lookup.c
@@ -583,6 +583,11 @@ void update_bstart(struct dentry *dentry)
  * Inside that nd structure, this function may also return an allocated
  * struct file (for open intents).  The caller, when done with this nd, must
  * kfree the intent file (using release_lower_nd).
+ *
+ * XXX: this code, and the callers of this code, should be redone using
+ * vfs_path_lookup() when (1) the nameidata structure is refactored into a
+ * separate intent-structure, and (2) open_namei() is broken into a VFS-only
+ * function and a method that other file systems can call.
  */
 int init_lower_nd(struct nameidata *nd, unsigned int flags)
 {
@@ -597,11 +602,16 @@ int init_lower_nd(struct nameidata *nd, unsigned int flags)
 #endif /* ALLOC_LOWER_ND_FILE */
 
memset(nd, 0, sizeof(struct nameidata));
+   if (!flags)
+   return err;
 
switch (flags) {
case LOOKUP_CREATE:
-   nd->flags = LOOKUP_CREATE;
-   nd->intent.open.flags = FMODE_READ | FMODE_WRITE | O_CREAT;
+   nd->intent.open.flags |= O_CREAT;
+   /* fall through: shared code for create/open cases */
+   case LOOKUP_OPEN:
+   nd->flags = flags;
+   nd->intent.open.flags |= (FMODE_READ | FMODE_WRITE);
 #ifdef ALLOC_LOWER_ND_FILE
file = kzalloc(sizeof(struct file), GFP_KERNEL);
if (!file) {
@@ -611,11 +621,15 @@ int init_lower_nd(struct nameidata *nd, unsigned int 
flags)
nd->intent.open.file = file;
 #endif /* ALLOC_LOWER_ND_FILE */
break;
+   case LOOKUP_ACCESS:
+   nd->flags = flags;
+   break;
default:
/*
 * We should never get here, for now.
 * We can add new cases here later on.
 */
+   dprintk("unionfs: unknown nameidata flag 0x%x\n", flags);
BUG();
break;
}
@@ -627,7 +641,7 @@ void release_lower_nd(struct nameidata *nd, int err)
 {
if (!nd->intent.open.file)
return;
-   if (!err)
+   else if (!err)
release_open_intent(nd);
 #ifdef ALLOC_LOWER_ND_FILE
kfree(nd->intent.open.file);
-- 
1.5.2.2
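
The reworked switch in init_lower_nd() relies on a deliberate
fall-through so that the create case reuses the open case. The sketch
below models only that control flow in user space; the SK_* constants are
placeholder values (not the kernel's LOOKUP_*, FMODE_* or O_CREAT
values), struct sketch_nd is a hypothetical stand-in for struct
nameidata, and an error return replaces the BUG() in the default branch.

```c
#include <string.h>

/* Placeholder flag values, chosen only for this illustration. */
enum { SK_LOOKUP_OPEN = 0x100, SK_LOOKUP_CREATE = 0x200,
       SK_LOOKUP_ACCESS = 0x400 };
enum { SK_FMODE_READ = 0x1, SK_FMODE_WRITE = 0x2, SK_O_CREAT = 0x40 };

struct sketch_nd {
	unsigned int flags;		/* models nd->flags */
	unsigned int open_flags;	/* models nd->intent.open.flags */
};

static int sketch_init_nd(struct sketch_nd *nd, unsigned int flags)
{
	memset(nd, 0, sizeof(*nd));
	if (!flags)
		return 0;		/* nothing to set up */

	switch (flags) {
	case SK_LOOKUP_CREATE:
		nd->open_flags |= SK_O_CREAT;
		/* fall through: shared code for create/open cases */
	case SK_LOOKUP_OPEN:
		nd->flags = flags;
		nd->open_flags |= SK_FMODE_READ | SK_FMODE_WRITE;
		break;
	case SK_LOOKUP_ACCESS:
		nd->flags = flags;
		break;
	default:
		return -1;		/* the patch calls BUG() here */
	}
	return 0;
}
```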


[PATCH 07/25] Unionfs: add lower nameidata debugging support

2007-09-25 Thread Erez Zadok
Signed-off-by: Erez Zadok [EMAIL PROTECTED]
---
 fs/unionfs/debug.c  |   20 
 fs/unionfs/dentry.c |4 +++-
 fs/unionfs/inode.c  |8 +++-
 fs/unionfs/union.h  |4 
 4 files changed, 34 insertions(+), 2 deletions(-)

diff --git a/fs/unionfs/debug.c b/fs/unionfs/debug.c
index 2d15fb0..9546a41 100644
--- a/fs/unionfs/debug.c
+++ b/fs/unionfs/debug.c
@@ -415,6 +415,26 @@ void __unionfs_check_file(const struct file *file,
__unionfs_check_dentry(dentry,fname,fxn,line);
 }
 
+void __unionfs_check_nd(const struct nameidata *nd,
+   const char *fname, const char *fxn, int line)
+{
+   struct file *file;
+   int printed_caller = 0;
+
+   if (!nd)
+   return;
+   if (nd->flags & LOOKUP_OPEN) {
+   file = nd->intent.open.file;
+   if (file->f_path.dentry &&
+   strcmp(file->f_dentry->d_sb->s_type->name, "unionfs")) {
+   PRINT_CALLER(fname, fxn, line);
+   printk(" CND1: lower_file of type %s\n",
+  file->f_path.dentry->d_sb->s_type->name);
+   BUG();
+   }
+   }
+}
+
 /* useful to track vfsmount leaks that could cause EBUSY on unmount */
 void __show_branch_counts(const struct super_block *sb,
  const char *file, const char *fxn, int line)
diff --git a/fs/unionfs/dentry.c b/fs/unionfs/dentry.c
index d9bb199..b21f1e3 100644
--- a/fs/unionfs/dentry.c
+++ b/fs/unionfs/dentry.c
@@ -418,8 +418,10 @@ static int unionfs_d_revalidate(struct dentry *dentry, struct nameidata *nd)
unionfs_lock_dentry(dentry);
err = __unionfs_d_revalidate_chain(dentry, nd, false);
unionfs_unlock_dentry(dentry);
-   if (err > 0) /* true==1: dentry is valid */
+   if (err > 0) { /* true==1: dentry is valid */
unionfs_check_dentry(dentry);
+   unionfs_check_nd(nd);
+   }
 
unionfs_read_unlock(dentry->d_sb);
 
diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c
index de78e26..f8b2c88 100644
--- a/fs/unionfs/inode.c
+++ b/fs/unionfs/inode.c
@@ -138,8 +138,10 @@ out:
unionfs_read_unlock(dentry->d_sb);
 
unionfs_check_inode(parent);
-   if (!err)
+   if (!err) {
unionfs_check_dentry(dentry->d_parent);
+   unionfs_check_nd(nd);
+   }
unionfs_check_dentry(dentry);
return err;
 }
@@ -186,6 +188,7 @@ static struct dentry *unionfs_lookup(struct inode *parent,
unionfs_check_inode(parent);
unionfs_check_dentry(dentry);
unionfs_check_dentry(dentry->d_parent);
+   unionfs_check_nd(nd);
unionfs_read_unlock(dentry->d_sb);
 
return ret;
@@ -856,6 +859,7 @@ static void *unionfs_follow_link(struct dentry *dentry, struct nameidata *nd)
 
 out:
unionfs_check_dentry(dentry);
+   unionfs_check_nd(nd);
unionfs_read_unlock(dentry->d_sb);
return ERR_PTR(err);
 }
@@ -872,6 +876,7 @@ static void unionfs_put_link(struct dentry *dentry, struct nameidata *nd,
unionfs_unlock_dentry(dentry);
 
unionfs_check_dentry(dentry);
+   unionfs_check_nd(nd);
kfree(nd_get_link(nd));
unionfs_read_unlock(dentry->d_sb);
 }
@@ -1002,6 +1007,7 @@ static int unionfs_permission(struct inode *inode, int mask,
 
 out:
unionfs_check_inode(inode);
+   unionfs_check_nd(nd);
return err;
 }
 
diff --git a/fs/unionfs/union.h b/fs/unionfs/union.h
index 755bc25..d27844d 100644
--- a/fs/unionfs/union.h
+++ b/fs/unionfs/union.h
@@ -518,6 +518,8 @@ static inline void unionfs_mntput(struct dentry *dentry, int bindex)
__FILE__,__FUNCTION__,__LINE__)
 #define unionfs_check_file(f)  __unionfs_check_file((f),   \
__FILE__,__FUNCTION__,__LINE__)
+#define unionfs_check_nd(n)__unionfs_check_nd((n), \
+   __FILE__,__FUNCTION__,__LINE__)
 #define show_branch_counts(sb) __show_branch_counts((sb),  \
__FILE__,__FUNCTION__,__LINE__)
 #define show_inode_times(i)__show_inode_times((i), \
@@ -534,6 +536,8 @@ extern void __unionfs_check_dentry(const struct dentry *dentry,
   int line);
 extern void __unionfs_check_file(const struct file *file,
 const char *fname, const char *fxn, int line);
+extern void __unionfs_check_nd(const struct nameidata *nd,
+  const char *fname, const char *fxn, int line);
 extern void __show_branch_counts(const struct super_block *sb,
 const char *file, const char *fxn, int line);
 extern void __show_inode_times(const struct inode *inode,
-- 
1.5.2.2
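
The unionfs_check_nd() wrapper follows the same pattern as the other
check macros: a macro captures the call site and forwards it to a worker
function. A minimal user-space sketch of that pattern is shown below,
using the standard __func__ instead of __FUNCTION__; all sketch_* names
are hypothetical, and the worker merely records the call site rather
than validating anything.

```c
#include <stddef.h>

/* Where the most recent non-NULL check was issued from. */
struct sketch_site {
	const char *file;
	const char *func;
	int line;
};

static struct sketch_site sketch_last;

/* Worker: ignores NULL objects, as __unionfs_check_nd() does for a NULL nd. */
static void sketch_check_at(const void *obj, const char *file,
			    const char *func, int line)
{
	if (!obj)
		return;
	sketch_last.file = file;
	sketch_last.func = func;
	sketch_last.line = line;
}

/* Wrapper: expands at the call site, so the caller's location is recorded. */
#define sketch_check(o)	sketch_check_at((o), __FILE__, __func__, __LINE__)
```

Because the location macros expand where sketch_check() is written, a
failing check can report the caller rather than the helper itself.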



[PATCH 22/25] Unionfs: add un/likely conditionals on common subr

2007-09-25 Thread Erez Zadok
Signed-off-by: Erez Zadok [EMAIL PROTECTED]
---
 fs/unionfs/sioq.c |4 ++--
 fs/unionfs/subr.c |   26 +-
 2 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/fs/unionfs/sioq.c b/fs/unionfs/sioq.c
index 2a8c88e..35d9fc3 100644
--- a/fs/unionfs/sioq.c
+++ b/fs/unionfs/sioq.c
@@ -28,7 +28,7 @@ int __init init_sioq(void)
int err;
 
superio_workqueue = create_workqueue("unionfs_siod");
-   if (!IS_ERR(superio_workqueue))
+   if (unlikely(!IS_ERR(superio_workqueue)))
return 0;
 
err = PTR_ERR(superio_workqueue);
@@ -39,7 +39,7 @@ int __init init_sioq(void)
 
 void stop_sioq(void)
 {
-   if (superio_workqueue)
+   if (likely(superio_workqueue))
destroy_workqueue(superio_workqueue);
 }
 
diff --git a/fs/unionfs/subr.c b/fs/unionfs/subr.c
index 6b93b64..6067d65 100644
--- a/fs/unionfs/subr.c
+++ b/fs/unionfs/subr.c
@@ -40,7 +40,7 @@ int create_whiteout(struct dentry *dentry, int start)
 
/* create dentry's whiteout equivalent */
name = alloc_whname(dentry->d_name.name, dentry->d_name.len);
-   if (IS_ERR(name)) {
+   if (unlikely(IS_ERR(name))) {
err = PTR_ERR(name);
goto out;
}
@@ -60,7 +60,7 @@ int create_whiteout(struct dentry *dentry, int start)
  dentry,
  dentry->d_name.name,
  bindex);
-   if (!lower_dentry || IS_ERR(lower_dentry)) {
+   if (unlikely(!lower_dentry || IS_ERR(lower_dentry))) {
printk(KERN_DEBUG "unionfs: create_parents "
   "failed for bindex = %d\n", bindex);
continue;
@@ -70,7 +70,7 @@ int create_whiteout(struct dentry *dentry, int start)
lower_wh_dentry =
lookup_one_len(name, lower_dentry->d_parent,
   dentry->d_name.len + UNIONFS_WHLEN);
-   if (IS_ERR(lower_wh_dentry))
+   if (unlikely(IS_ERR(lower_wh_dentry)))
continue;
 
/*
@@ -84,7 +84,7 @@ int create_whiteout(struct dentry *dentry, int start)
}
 
err = init_lower_nd(&nd, LOOKUP_CREATE);
-   if (err < 0)
+   if (unlikely(err < 0))
goto out;
lower_dir_dentry = lock_parent(lower_wh_dentry);
if (!(err = is_robranch_super(dentry->d_sb, bindex)))
@@ -96,12 +96,12 @@ int create_whiteout(struct dentry *dentry, int start)
dput(lower_wh_dentry);
release_lower_nd(&nd, err);
 
-   if (!err || !IS_COPYUP_ERR(err))
+   if (unlikely(!err || !IS_COPYUP_ERR(err)))
break;
}
 
/* set dbopaque so that lookup will not proceed after this branch */
-   if (!err)
+   if (likely(!err))
set_dbopaque(dentry, bindex);
 
 out:
@@ -129,7 +129,7 @@ int unionfs_refresh_lower_dentry(struct dentry *dentry, int bindex)
 
lower_dentry = lookup_one_len(dentry->d_name.name, lower_parent,
  dentry->d_name.len);
-   if (IS_ERR(lower_dentry)) {
+   if (unlikely(IS_ERR(lower_dentry))) {
err = PTR_ERR(lower_dentry);
goto out;
}
@@ -138,7 +138,7 @@ int unionfs_refresh_lower_dentry(struct dentry *dentry, int bindex)
iput(unionfs_lower_inode_idx(dentry->d_inode, bindex));
unionfs_set_lower_inode_idx(dentry->d_inode, bindex, NULL);
 
-   if (!lower_dentry->d_inode) {
+   if (unlikely(!lower_dentry->d_inode)) {
dput(lower_dentry);
unionfs_set_lower_dentry_idx(dentry, bindex, NULL);
} else {
@@ -166,17 +166,17 @@ int make_dir_opaque(struct dentry *dentry, int bindex)
mutex_lock(&lower_dir->i_mutex);
diropq = lookup_one_len(UNIONFS_DIR_OPAQUE, lower_dentry,
sizeof(UNIONFS_DIR_OPAQUE) - 1);
-   if (IS_ERR(diropq)) {
+   if (unlikely(IS_ERR(diropq))) {
err = PTR_ERR(diropq);
goto out;
}
 
err = init_lower_nd(&nd, LOOKUP_CREATE);
-   if (err < 0)
+   if (unlikely(err < 0))
goto out;
if (!diropq->d_inode)
err = vfs_create(lower_dir, diropq, S_IRUGO, &nd);
-   if (!err)
+   if (likely(!err))
set_dbopaque(dentry, bindex);
release_lower_nd(&nd, err);
 
@@ -193,7 +193,7 @@ out:
 int unionfs_get_nlinks(const struct inode *inode)
 {
/* don't bother to do all the work since we're unlinked */
-   if (inode->i_nlink == 0)
+   if (unlikely(inode->i_nlink == 0))
return 0;
 
if (!S_ISDIR(inode->i_mode))
@@ -213,7 +213,7 @@ char 
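
Several hunks in this patch test IS_ERR() results and decode them with
PTR_ERR(). For readers new to the convention, the following user-space
sketch shows the general idea of carrying a small negative errno inside
a pointer value. It assumes the top 4095 addresses are never valid
pointers; the sketch_* helpers are illustrative stand-ins, not the
kernel's definitions.

```c
#include <stdint.h>

#define SKETCH_MAX_ERRNO 4095	/* assumption: errnos fit in 1..4095 */

/* Encode a negative errno as a pointer value near the top of memory. */
static inline void *sketch_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* Recover the errno from an encoded pointer. */
static inline long sketch_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True only for values in the reserved error range; NULL is not one. */
static inline int sketch_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}
```

Since NULL does not fall in the error range, callers that can also
receive NULL need a separate check, as create_whiteout() does with
"!lower_dentry || IS_ERR(lower_dentry)".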

[PATCH 06/25] Unionfs: minor coding style updates

2007-09-25 Thread Erez Zadok
Signed-off-by: Erez Zadok [EMAIL PROTECTED]
---
 fs/unionfs/debug.c  |6 --
 fs/unionfs/dentry.c |2 +-
 fs/unionfs/inode.c  |   14 --
 fs/unionfs/main.c   |4 ++--
 fs/unionfs/union.h  |2 +-
 5 files changed, 16 insertions(+), 12 deletions(-)

diff --git a/fs/unionfs/debug.c b/fs/unionfs/debug.c
index f678534..2d15fb0 100644
--- a/fs/unionfs/debug.c
+++ b/fs/unionfs/debug.c
@@ -467,7 +467,8 @@ void __show_dinode_times(const struct dentry *dentry,
lower_inode = unionfs_lower_inode_idx(inode, bindex);
if (!lower_inode)
continue;
-   printk("DT(%s:%lu:%d): ", dentry->d_name.name, inode->i_ino, bindex);
+   printk("DT(%s:%lu:%d): ", dentry->d_name.name, inode->i_ino,
+  bindex);
printk("%s:%s:%d ",file,fxn,line);
printk("um=%lu/%lu lm=%lu/%lu ",
   inode->i_mtime.tv_sec, inode->i_mtime.tv_nsec,
@@ -490,7 +491,8 @@ void __show_inode_counts(const struct inode *inode,
printk("SiC: Null inode\n");
return;
}
-   for (bindex=sbstart(inode->i_sb); bindex <= sbend(inode->i_sb); bindex++) {
+   for (bindex=sbstart(inode->i_sb); bindex <= sbend(inode->i_sb);
+bindex++) {
lower_inode = unionfs_lower_inode_idx(inode, bindex);
if (!lower_inode)
continue;
diff --git a/fs/unionfs/dentry.c b/fs/unionfs/dentry.c
index 08b5722..d9bb199 100644
--- a/fs/unionfs/dentry.c
+++ b/fs/unionfs/dentry.c
@@ -26,7 +26,7 @@
  * Returns true if valid, false otherwise.
  */
 static bool __unionfs_d_revalidate_one(struct dentry *dentry,
- struct nameidata *nd)
+  struct nameidata *nd)
 {
bool valid = true;  /* default is valid */
struct dentry *lower_dentry;
diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c
index 9638b64..de78e26 100644
--- a/fs/unionfs/inode.c
+++ b/fs/unionfs/inode.c
@@ -99,7 +99,8 @@ static int unionfs_create(struct inode *parent, struct dentry *dentry,
 * if lower_dentry is NULL, create the entire
 * dentry directory structure in branch 0.
 */
-   lower_dentry = create_parents(parent, dentry, dentry->d_name.name, 0);
+   lower_dentry = create_parents(parent, dentry,
+ dentry->d_name.name, 0);
if (IS_ERR(lower_dentry)) {
err = PTR_ERR(lower_dentry);
goto out;
@@ -447,9 +448,8 @@ static int unionfs_symlink(struct inode *dir, struct dentry *dentry,
 
if (!(err = is_robranch_super(dentry->d_sb, bindex))) {
mode = S_IALLUGO;
-   err =
-   vfs_symlink(lower_dir_dentry->d_inode,
-   lower_dentry, symname, mode);
+   err = vfs_symlink(lower_dir_dentry->d_inode,
+ lower_dentry, symname, mode);
}
unlock_dir(lower_dir_dentry);
 
@@ -884,9 +884,11 @@ static void unionfs_put_link(struct dentry *dentry, struct nameidata *nd,
  *   readonly, to allow copyup to work.
  *   (3) we do call security_inode_permission, and therefore security inside
  *   SELinux, etc. are performed.
+ *
+ * @inode: the lower inode we're checking permission on
  */
-static int inode_permission(struct super_block *sb, struct inode *inode, int mask,
-   struct nameidata *nd, int bindex)
+static int inode_permission(struct super_block *sb, struct inode *inode,
+   int mask, struct nameidata *nd, int bindex)
 {
int retval, submask;
 
diff --git a/fs/unionfs/main.c b/fs/unionfs/main.c
index 4faae44..8595750 100644
--- a/fs/unionfs/main.c
+++ b/fs/unionfs/main.c
@@ -275,14 +275,14 @@ int __parse_branch_mode(const char *name)
  */
 int parse_branch_mode(const char *name)
 {
-   int perms =  __parse_branch_mode(name);
+   int perms = __parse_branch_mode(name);
 
if (perms == 0)
perms = MAY_READ | MAY_WRITE;
return perms;
 }
 
-/* 
+/*
  * parse the dirs= mount argument
  *
  * We don't need to lock the superblock private data's rwsem, as we get
diff --git a/fs/unionfs/union.h b/fs/unionfs/union.h
index 5e9843b..755bc25 100644
--- a/fs/unionfs/union.h
+++ b/fs/unionfs/union.h
@@ -549,7 +549,7 @@ extern void __show_inode_counts(const struct inode *inode,
 
 /* we leave useful hooks for these check functions throughout the code */
 #define unionfs_check_inode(i) do { } while(0)
-#define unionfs_check_dentry(d)do { } while(0)
+#define unionfs_check_dentry(d)do { } while(0)
 #define unionfs_check_file(f)  do { } while(0)
 #define show_branch_counts(sb) do { } 

[PATCH 23/25] Unionfs: add un/likely conditionals on unlink ops

2007-09-25 Thread Erez Zadok
Signed-off-by: Erez Zadok [EMAIL PROTECTED]
---
 fs/unionfs/unlink.c |   32 
 1 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/fs/unionfs/unlink.c b/fs/unionfs/unlink.c
index 3924f7f..33d08d9 100644
--- a/fs/unionfs/unlink.c
+++ b/fs/unionfs/unlink.c
@@ -26,13 +26,13 @@ static int unionfs_unlink_whiteout(struct inode *dir, struct dentry *dentry)
int bindex;
int err = 0;
 
-   if ((err = unionfs_partial_lookup(dentry)))
+   if (unlikely((err = unionfs_partial_lookup(dentry))))
goto out;
 
bindex = dbstart(dentry);
 
lower_dentry = unionfs_lower_dentry_idx(dentry, bindex);
-   if (!lower_dentry)
+   if (unlikely(!lower_dentry))
goto out;
 
lower_dir_dentry = lock_parent(lower_dentry);
@@ -42,13 +42,13 @@ static int unionfs_unlink_whiteout(struct inode *dir, struct dentry *dentry)
if (!(err = is_robranch_super(dentry->d_sb, bindex)))
err = vfs_unlink(lower_dir_dentry->d_inode, lower_dentry);
/* if vfs_unlink succeeded, update our inode's times */
-   if (!err)
+   if (likely(!err))
unionfs_copy_attr_times(dentry->d_inode);
dput(lower_dentry);
fsstack_copy_attr_times(dir, lower_dir_dentry->d_inode);
unlock_dir(lower_dir_dentry);
 
-   if (err && !IS_COPYUP_ERR(err))
+   if (unlikely(err && !IS_COPYUP_ERR(err)))
goto out;
 
if (err) {
@@ -62,11 +62,11 @@ static int unionfs_unlink_whiteout(struct inode *dir, struct dentry *dentry)
err = create_whiteout(dentry, dbstart(dentry));
 
 out:
-   if (!err)
+   if (likely(!err))
dentry->d_inode->i_nlink--;
 
/* We don't want to leave negative leftover dentries for revalidate. */
-   if (!err && (dbopaque(dentry) != -1))
+   if (likely(!err && (dbopaque(dentry) != -1)))
update_bstart(dentry);
 
return err;
@@ -79,7 +79,7 @@ int unionfs_unlink(struct inode *dir, struct dentry *dentry)
unionfs_read_lock(dentry->d_sb);
unionfs_lock_dentry(dentry);
 
-   if (!__unionfs_d_revalidate_chain(dentry, NULL, false)) {
+   if (unlikely(!__unionfs_d_revalidate_chain(dentry, NULL, false))) {
err = -ESTALE;
goto out;
}
@@ -87,7 +87,7 @@ int unionfs_unlink(struct inode *dir, struct dentry *dentry)
 
err = unionfs_unlink_whiteout(dir, dentry);
/* call d_drop so the system forgets about us */
-   if (!err) {
+   if (likely(!err)) {
if (!S_ISDIR(dentry->d_inode->i_mode))
unionfs_postcopyup_release(dentry);
d_drop(dentry);
@@ -99,7 +99,7 @@ int unionfs_unlink(struct inode *dir, struct dentry *dentry)
}
 
 out:
-   if (!err) {
+   if (likely(!err)) {
unionfs_check_dentry(dentry);
unionfs_check_inode(dir);
}
@@ -117,7 +117,7 @@ static int unionfs_rmdir_first(struct inode *dir, struct 
dentry *dentry,
 
/* Here we need to remove whiteout entries. */
err = delete_whiteouts(dentry, dbstart(dentry), namelist);
-   if (err)
+   if (unlikely(err))
goto out;
 
lower_dentry = unionfs_lower_dentry(dentry);
@@ -135,7 +135,7 @@ static int unionfs_rmdir_first(struct inode *dir, struct 
dentry *dentry,
dentry->d_inode->i_nlink = unionfs_get_nlinks(dentry->d_inode);
 
 out:
-   if (lower_dir_dentry)
+   if (likely(lower_dir_dentry))
unlock_dir(lower_dir_dentry);
return err;
 }
@@ -148,7 +148,7 @@ int unionfs_rmdir(struct inode *dir, struct dentry *dentry)
unionfs_read_lock(dentry->d_sb);
unionfs_lock_dentry(dentry);
 
-   if (!__unionfs_d_revalidate_chain(dentry, NULL, false)) {
+   if (unlikely(!__unionfs_d_revalidate_chain(dentry, NULL, false))) {
err = -ESTALE;
goto out;
}
@@ -156,7 +156,7 @@ int unionfs_rmdir(struct inode *dir, struct dentry *dentry)
 
/* check if this unionfs directory is empty or not */
err = check_empty(dentry, namelist);
-   if (err)
+   if (unlikely(err))
goto out;
 
err = unionfs_rmdir_first(dir, dentry, namelist);
@@ -170,7 +170,7 @@ int unionfs_rmdir(struct inode *dir, struct dentry *dentry)
goto out;
 
/* exit if the error returned was NOT -EROFS */
-   if (!IS_COPYUP_ERR(err))
+   if (unlikely(!IS_COPYUP_ERR(err)))
goto out;
 
new_err = create_whiteout(dentry, dbstart(dentry) - 1);
@@ -180,10 +180,10 @@ int unionfs_rmdir(struct inode *dir, struct dentry 
*dentry)
 
 out:
/* call d_drop so the system forgets about us */
-   if (!err)
+   if (likely(!err))
d_drop(dentry);
 
-   if (namelist)
+   if (likely(namelist))

[PATCH 15/25] Unionfs: add un/likely conditionals on fileops

2007-09-25 Thread Erez Zadok
Signed-off-by: Erez Zadok [EMAIL PROTECTED]
---
 fs/unionfs/file.c |   38 +++---
 1 files changed, 19 insertions(+), 19 deletions(-)
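
Note for readers: one context line in the unionfs_mmap() hunk of this patch computes willwrite by OR-ing VM_SHARED and VM_WRITE into vm_flags and comparing the result with the original value. That is true exactly when both bits are already set. The sketch below checks this against the more familiar mask form; the DEMO_* flag values and all function names are illustrative assumptions, not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag values only; the real VM_* constants live in kernel headers. */
#define DEMO_VM_WRITE  0x2UL
#define DEMO_VM_SHARED 0x8UL

/* Form used in the hunk: OR-ing in bits that are already set changes nothing. */
static bool willwrite_or_form(unsigned long flags)
{
	return (flags | DEMO_VM_SHARED | DEMO_VM_WRITE) == flags;
}

/* Equivalent mask form: both bits must be present. */
static bool willwrite_mask_form(unsigned long flags)
{
	const unsigned long mask = DEMO_VM_SHARED | DEMO_VM_WRITE;

	return (flags & mask) == mask;
}

/* Asserts that both forms agree for every combination of the low four bits. */
static void check_forms_agree(void)
{
	for (unsigned long f = 0; f < 16; f++)
		assert(willwrite_or_form(f) == willwrite_mask_form(f));
}
```

Either form works; the mask form tends to be easier to read when more than one bit is involved.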

diff --git a/fs/unionfs/file.c b/fs/unionfs/file.c
index d8eaaa5..06ca1fa 100644
--- a/fs/unionfs/file.c
+++ b/fs/unionfs/file.c
@@ -24,13 +24,13 @@ static ssize_t unionfs_read(struct file *file, char __user 
*buf,
int err;
 
unionfs_read_lock(file->f_path.dentry->d_sb);
-   if ((err = unionfs_file_revalidate(file, false)))
+   if (unlikely((err = unionfs_file_revalidate(file, false))))
goto out;
unionfs_check_file(file);
 
err = do_sync_read(file, buf, count, ppos);
 
-   if (err >= 0)
+   if (likely(err >= 0))
touch_atime(unionfs_lower_mnt(file->f_path.dentry),
unionfs_lower_dentry(file->f_path.dentry));
 
@@ -47,16 +47,16 @@ static ssize_t unionfs_aio_read(struct kiocb *iocb, const 
struct iovec *iov,
struct file *file = iocb->ki_filp;
 
unionfs_read_lock(file->f_path.dentry->d_sb);
-   if ((err = unionfs_file_revalidate(file, false)))
+   if (unlikely((err = unionfs_file_revalidate(file, false))))
goto out;
unionfs_check_file(file);
 
err = generic_file_aio_read(iocb, iov, nr_segs, pos);
 
-   if (err == -EIOCBQUEUED)
+   if (unlikely(err == -EIOCBQUEUED))
err = wait_on_sync_kiocb(iocb);
 
-   if (err >= 0)
+   if (likely(err >= 0))
touch_atime(unionfs_lower_mnt(file->f_path.dentry),
unionfs_lower_dentry(file->f_path.dentry));
 
@@ -72,13 +72,13 @@ static ssize_t unionfs_write(struct file *file, const char 
__user *buf,
int err = 0;
 
unionfs_read_lock(file->f_path.dentry->d_sb);
-   if ((err = unionfs_file_revalidate(file, true)))
+   if (unlikely((err = unionfs_file_revalidate(file, true))))
goto out;
unionfs_check_file(file);
 
err = do_sync_write(file, buf, count, ppos);
/* update our inode times upon a successful lower write */
-   if (err >= 0) {
+   if (likely(err >= 0)) {
unionfs_copy_attr_times(file->f_path.dentry->d_inode);
unionfs_check_file(file);
}
@@ -104,7 +104,7 @@ static int unionfs_mmap(struct file *file, struct 
vm_area_struct *vma)
 
/* This might be deferred to mmap's writepage */
willwrite = ((vma->vm_flags | VM_SHARED | VM_WRITE) == vma->vm_flags);
-   if ((err = unionfs_file_revalidate(file, willwrite)))
+   if (unlikely((err = unionfs_file_revalidate(file, willwrite))))
goto out;
unionfs_check_file(file);
 
@@ -119,19 +119,19 @@ static int unionfs_mmap(struct file *file, struct 
vm_area_struct *vma)
 * generic_file_readonly_mmap returns in that case).
 */
lower_file = unionfs_lower_file(file);
-   if (willwrite && !lower_file->f_mapping->a_ops->writepage) {
+   if (unlikely(willwrite && !lower_file->f_mapping->a_ops->writepage)) {
err = -EINVAL;
printk("unionfs: branch %d file system does not support "
   "writeable mmap\n", fbstart(file));
} else {
err = generic_file_mmap(file, vma);
-   if (err)
+   if (unlikely(err))
printk("unionfs: generic_file_mmap failed %d\n", err);
}
 
 out:
unionfs_read_unlock(file->f_path.dentry->d_sb);
-   if (!err) {
+   if (likely(!err)) {
/* copyup could cause parent dir times to change */
unionfs_copy_attr_times(file->f_path.dentry->d_parent->d_inode);
unionfs_check_file(file);
@@ -149,7 +149,7 @@ int unionfs_fsync(struct file *file, struct dentry *dentry, 
int datasync)
int err = -EINVAL;
 
unionfs_read_lock(file->f_path.dentry->d_sb);
-   if ((err = unionfs_file_revalidate(file, true)))
+   if (unlikely((err = unionfs_file_revalidate(file, true))))
goto out;
unionfs_check_file(file);
 
@@ -159,14 +159,14 @@ int unionfs_fsync(struct file *file, struct dentry 
*dentry, int datasync)
goto out;
 
inode = dentry->d_inode;
-   if (!inode) {
+   if (unlikely(!inode)) {
printk(KERN_ERR
   "unionfs: null lower inode in unionfs_fsync\n");
goto out;
}
for (bindex = bstart; bindex <= bend; bindex++) {
lower_inode = unionfs_lower_inode_idx(inode, bindex);
-   if (!lower_inode || !lower_inode->i_fop->fsync)
+   if (unlikely(!lower_inode || !lower_inode->i_fop->fsync))
continue;
lower_file = unionfs_lower_file_idx(file, bindex);
lower_dentry = unionfs_lower_dentry_idx(dentry, bindex);
@@ -175,7 +175,7 @@ int unionfs_fsync(struct file *file, struct dentry *dentry, 
int datasync)
 

[PATCH 17/25] Unionfs: add un/likely conditionals on lookup ops

2007-09-25 Thread Erez Zadok
Signed-off-by: Erez Zadok [EMAIL PROTECTED]
---
 fs/unionfs/lookup.c |   44 ++--
 1 files changed, 22 insertions(+), 22 deletions(-)
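
Note for readers: most of the hunks in this patch wrap IS_ERR() checks. The kernel encodes small negative errno values in the top of the pointer range, so a single return value can carry either a valid pointer or an error. The following is a userspace sketch of that idiom, assuming a flat address space where the last 4095 addresses are never valid. ERR_PTR(), PTR_ERR() and IS_ERR() are re-implemented here for illustration; alloc_whname_stub() and lookup_stub() are hypothetical and only imitate the ".wh." whiteout prefix mentioned in the code comments.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_ERRNO 4095
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Userspace re-implementations of the kernel helpers, for illustration only. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	/* Error values occupy the last MAX_ERRNO addresses. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical helper: returns ".wh." + name, or an encoded error. */
static char *alloc_whname_stub(const char *name)
{
	static const char prefix[] = ".wh.";
	char *buf;

	if (!name)
		return ERR_PTR(-EINVAL);
	buf = malloc(sizeof(prefix) + strlen(name));	/* sizeof includes the NUL */
	if (!buf)
		return ERR_PTR(-ENOMEM);
	strcpy(buf, prefix);
	strcat(buf, name);
	return buf;
}

/* Caller side: test with IS_ERR(), decode with PTR_ERR(), as in the hunks. */
static long lookup_stub(const char *name, char **out)
{
	char *whname;

	assert(out != NULL);	/* precondition of this sketch */
	whname = alloc_whname_stub(name);
	if (unlikely(IS_ERR(whname)))
		return PTR_ERR(whname);
	*out = whname;	/* caller frees on success */
	return 0;
}
```

Note that a NULL pointer is not an error pointer under this scheme, so IS_ERR(NULL) is false; code that can see both has to test for each separately.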

diff --git a/fs/unionfs/lookup.c b/fs/unionfs/lookup.c
index 2109714..92b5e0a 100644
--- a/fs/unionfs/lookup.c
+++ b/fs/unionfs/lookup.c
@@ -59,7 +59,7 @@ static noinline int is_opaque_dir(struct dentry *dentry, int 
bindex)
 
mutex_unlock(&lower_inode->i_mutex);
 
-   if (IS_ERR(wh_lower_dentry)) {
+   if (unlikely(IS_ERR(wh_lower_dentry))) {
err = PTR_ERR(wh_lower_dentry);
goto out;
}
@@ -119,12 +119,12 @@ struct dentry *unionfs_lookup_backend(struct dentry 
*dentry,
case INTERPOSE_PARTIAL:
break;
case INTERPOSE_LOOKUP:
-   if ((err = new_dentry_private_data(dentry)))
+   if (unlikely((err = new_dentry_private_data(dentry))))
goto out;
break;
default:
/* default: can only be INTERPOSE_REVAL/REVAL_NEG */
-   if ((err = realloc_dentry_private_data(dentry)))
+   if (unlikely((err = realloc_dentry_private_data(dentry))))
goto out;
break;
}
@@ -147,7 +147,7 @@ struct dentry *unionfs_lookup_backend(struct dentry *dentry,
namelen = dentry->d_name.len;
 
/* No dentries should get created for possible whiteout names. */
-   if (!is_validname(name)) {
+   if (unlikely(!is_validname(name))) {
err = -EPERM;
goto out_free;
}
@@ -179,7 +179,7 @@ struct dentry *unionfs_lookup_backend(struct dentry *dentry,
unionfs_lower_dentry_idx(parent_dentry, bindex);
 
/* if the parent lower dentry does not exist skip this */
-   if (!(lower_dir_dentry && lower_dir_dentry->d_inode))
+   if (unlikely(!(lower_dir_dentry && lower_dir_dentry->d_inode)))
continue;
 
/* also skip it if the parent isn't a directory. */
@@ -189,7 +189,7 @@ struct dentry *unionfs_lookup_backend(struct dentry *dentry,
/* Reuse the whiteout name because its value doesn't change. */
if (!whname) {
whname = alloc_whname(name, namelen);
-   if (IS_ERR(whname)) {
+   if (unlikely(IS_ERR(whname))) {
err = PTR_ERR(whname);
goto out_free;
}
@@ -198,7 +198,7 @@ struct dentry *unionfs_lookup_backend(struct dentry *dentry,
/* check if whiteout exists in this branch: lookup .wh.foo */
wh_lower_dentry = lookup_one_len(whname, lower_dir_dentry,
 namelen + UNIONFS_WHLEN);
-   if (IS_ERR(wh_lower_dentry)) {
+   if (unlikely(IS_ERR(wh_lower_dentry))) {
dput(first_lower_dentry);
unionfs_mntput(first_dentry, first_dentry_offset);
err = PTR_ERR(wh_lower_dentry);
@@ -207,7 +207,7 @@ struct dentry *unionfs_lookup_backend(struct dentry *dentry,
 
if (wh_lower_dentry->d_inode) {
/* We found a whiteout so lets give up. */
-   if (S_ISREG(wh_lower_dentry->d_inode->i_mode)) {
+   if (likely(S_ISREG(wh_lower_dentry->d_inode->i_mode))) {
set_dbend(dentry, bindex);
set_dbopaque(dentry, bindex);
dput(wh_lower_dentry);
@@ -228,7 +228,7 @@ struct dentry *unionfs_lookup_backend(struct dentry *dentry,
 
/* Now do regular lookup; lookup foo */
lower_dentry = lookup_one_len(name, lower_dir_dentry, namelen);
-   if (IS_ERR(lower_dentry)) {
+   if (unlikely(IS_ERR(lower_dentry))) {
dput(first_lower_dentry);
unionfs_mntput(first_dentry, first_dentry_offset);
err = PTR_ERR(lower_dentry);
@@ -321,7 +321,7 @@ out_negative:
first_lower_dentry = lookup_one_len(name, lower_dir_dentry,
namelen);
first_dentry_offset = bindex;
-   if (IS_ERR(first_lower_dentry)) {
+   if (unlikely(IS_ERR(first_lower_dentry))) {
err = PTR_ERR(first_lower_dentry);
goto out;
}
@@ -381,12 +381,12 @@ out_positive:
 * dentry.
 */
d_interposed = unionfs_interpose(dentry, dentry->d_sb, lookupmode);
-   if (IS_ERR(d_interposed))
+   if (unlikely(IS_ERR(d_interposed)))
err = PTR_ERR(d_interposed);
else if (d_interposed)
dentry = d_interposed;
 
-   if (err)
+   if 

[PATCH 13/25] Unionfs: add un/likely conditionals on dir ops

2007-09-25 Thread Erez Zadok
Signed-off-by: Erez Zadok [EMAIL PROTECTED]
---
 fs/unionfs/dirfops.c   |   22 +++---
 fs/unionfs/dirhelper.c |   30 +++---
 2 files changed, 26 insertions(+), 26 deletions(-)

diff --git a/fs/unionfs/dirfops.c b/fs/unionfs/dirfops.c
index c923e58..fa2df88 100644
--- a/fs/unionfs/dirfops.c
+++ b/fs/unionfs/dirfops.c
@@ -63,7 +63,7 @@ static int unionfs_filldir(void *dirent, const char *name, 
int namelen,
off_t pos = rdstate2offset(buf->rdstate);
u64 unionfs_ino = ino;
 
-   if (!err) {
+   if (likely(!err)) {
err = buf->filldir(buf->dirent, name, namelen, pos,
   unionfs_ino, d_type);
buf->rdstate->offset++;
@@ -74,7 +74,7 @@ static int unionfs_filldir(void *dirent, const char *name, 
int namelen,
 * If we did fill it, stuff it in our hash, otherwise return an
 * error.
 */
-   if (err) {
+   if (unlikely(err)) {
buf->filldir_error = err;
goto out;
}
@@ -99,7 +99,7 @@ static int unionfs_readdir(struct file *file, void *dirent, 
filldir_t filldir)
 
unionfs_read_lock(file->f_path.dentry->d_sb);
 
-   if ((err = unionfs_file_revalidate(file, false)))
+   if (unlikely((err = unionfs_file_revalidate(file, false))))
goto out;
 
inode = file->f_path.dentry->d_inode;
@@ -110,7 +110,7 @@ static int unionfs_readdir(struct file *file, void *dirent, 
filldir_t filldir)
goto out;
} else if (file->f_pos > 0) {
uds = find_rdstate(inode, file->f_pos);
-   if (!uds) {
+   if (unlikely(!uds)) {
err = -ESTALE;
goto out;
}
@@ -124,7 +124,7 @@ static int unionfs_readdir(struct file *file, void *dirent, 
filldir_t filldir)
 
while (uds->bindex <= bend) {
lower_file = unionfs_lower_file_idx(file, uds->bindex);
-   if (!lower_file) {
+   if (unlikely(!lower_file)) {
uds->bindex++;
uds->dirpos = 0;
continue;
@@ -141,7 +141,7 @@ static int unionfs_readdir(struct file *file, void *dirent, 
filldir_t filldir)
 
/* Read starting from where we last left off. */
offset = vfs_llseek(lower_file, uds->dirpos, SEEK_SET);
-   if (offset < 0) {
+   if (unlikely(offset < 0)) {
err = offset;
goto out;
}
@@ -149,7 +149,7 @@ static int unionfs_readdir(struct file *file, void *dirent, 
filldir_t filldir)
 
/* Save the position for when we continue. */
offset = vfs_llseek(lower_file, 0, SEEK_CUR);
-   if (offset < 0) {
+   if (unlikely(offset < 0)) {
err = offset;
goto out;
}
@@ -158,10 +158,10 @@ static int unionfs_readdir(struct file *file, void 
*dirent, filldir_t filldir)
/* Copy the atime. */
fsstack_copy_attr_atime(inode, 
lower_file->f_path.dentry->d_inode);
 
-   if (err < 0)
+   if (unlikely(err < 0))
goto out;
 
-   if (buf.filldir_error)
+   if (unlikely(buf.filldir_error))
break;
 
if (!buf.entries_written) {
@@ -201,7 +201,7 @@ static loff_t unionfs_dir_llseek(struct file *file, loff_t 
offset, int origin)
 
unionfs_read_lock(file->f_path.dentry->d_sb);
 
-   if ((err = unionfs_file_revalidate(file, false)))
+   if (unlikely((err = unionfs_file_revalidate(file, false))))
goto out;
 
rdstate = UNIONFS_F(file)->rdstate;
@@ -241,7 +241,7 @@ static loff_t unionfs_dir_llseek(struct file *file, loff_t 
offset, int origin)
} else {
rdstate = 
find_rdstate(file->f_path.dentry->d_inode,
   offset);
-   if (rdstate) {
+   if (likely(rdstate)) {
UNIONFS_F(file)->rdstate = rdstate;
err = rdstate->offset;
} else
diff --git a/fs/unionfs/dirhelper.c b/fs/unionfs/dirhelper.c
index a72f711..d481ba4 100644
--- a/fs/unionfs/dirhelper.c
+++ b/fs/unionfs/dirhelper.c
@@ -43,7 +43,7 @@ int do_delete_whiteouts(struct dentry *dentry, int bindex,
 
err = -ENOMEM;
name = __getname();
-   if (!name)
+   if (unlikely(!name))
goto out;
strcpy(name, UNIONFS_WHPFX);
p = name + UNIONFS_WHLEN;
@@ -65,14 +65,14 @@ int do_delete_whiteouts(struct dentry *dentry, 

[PATCH 12/25] Unionfs: add un/likely conditionals on dentry ops

2007-09-25 Thread Erez Zadok
Signed-off-by: Erez Zadok [EMAIL PROTECTED]
---
 fs/unionfs/dentry.c |   68 ++
 1 files changed, 35 insertions(+), 33 deletions(-)

diff --git a/fs/unionfs/dentry.c b/fs/unionfs/dentry.c
index 52bcb18..3f3a18d 100644
--- a/fs/unionfs/dentry.c
+++ b/fs/unionfs/dentry.c
@@ -45,7 +45,7 @@ static bool __unionfs_d_revalidate_one(struct dentry *dentry,
verify_locked(dentry);
 
/* if the dentry is unhashed, do NOT revalidate */
-   if (d_deleted(dentry)) {
+   if (unlikely(d_deleted(dentry))) {
dprintk(KERN_DEBUG "unionfs: unhashed dentry being "
"revalidated: %*s\n",
dentry->d_name.len, dentry->d_name.name);
@@ -53,7 +53,7 @@ static bool __unionfs_d_revalidate_one(struct dentry *dentry,
}
 
BUG_ON(dbstart(dentry) == -1);
-   if (dentry->d_inode)
+   if (likely(dentry->d_inode))
positive = 1;
dgen = atomic_read(&UNIONFS_D(dentry)->generation);
sbgen = atomic_read(&UNIONFS_SB(dentry->d_sb)->generation);
@@ -62,7 +62,7 @@ static bool __unionfs_d_revalidate_one(struct dentry *dentry,
 * revalidation to be done, because this file does not exist within
 * the namespace, and Unionfs operates on the namespace, not data.
 */
-   if (sbgen != dgen) {
+   if (unlikely(sbgen != dgen)) {
struct dentry *result;
int pdgen;
 
@@ -76,7 +76,7 @@ static bool __unionfs_d_revalidate_one(struct dentry *dentry,
/* Free the pointers for our inodes and this dentry. */
bstart = dbstart(dentry);
bend = dbend(dentry);
-   if (bstart >= 0) {
+   if (likely(bstart >= 0)) {
struct dentry *lower_dentry;
for (bindex = bstart; bindex <= bend; bindex++) {
lower_dentry =
@@ -89,7 +89,7 @@ static bool __unionfs_d_revalidate_one(struct dentry *dentry,
set_dbend(dentry, -1);
 
interpose_flag = INTERPOSE_REVAL_NEG;
-   if (positive) {
+   if (likely(positive)) {
interpose_flag = INTERPOSE_REVAL;
/*
 * During BRM, the VFS could already hold a lock on
@@ -97,14 +97,14 @@ static bool __unionfs_d_revalidate_one(struct dentry 
*dentry,
 * (deadlock), but if you lock it in this function,
 * then release it here too.
 */
-   if (!mutex_is_locked(&dentry->d_inode->i_mutex)) {
+   if (unlikely(!mutex_is_locked(&dentry->d_inode->i_mutex))) {
mutex_lock(&dentry->d_inode->i_mutex);
locked = 1;
}
 
bstart = ibstart(dentry->d_inode);
bend = ibend(dentry->d_inode);
-   if (bstart >= 0) {
+   if (likely(bstart >= 0)) {
struct inode *lower_inode;
for (bindex = bstart; bindex <= bend;
 bindex++) {
@@ -119,14 +119,14 @@ static bool __unionfs_d_revalidate_one(struct dentry 
*dentry,
UNIONFS_I(dentry->d_inode)->lower_inodes = NULL;
ibstart(dentry->d_inode) = -1;
ibend(dentry->d_inode) = -1;
-   if (locked)
+   if (unlikely(locked))
mutex_unlock(&dentry->d_inode->i_mutex);
}
 
result = unionfs_lookup_backend(dentry, lowernd,
interpose_flag);
-   if (result) {
-   if (IS_ERR(result)) {
+   if (likely(result)) {
+   if (unlikely(IS_ERR(result))) {
valid = false;
goto out;
}
@@ -138,7 +138,7 @@ static bool __unionfs_d_revalidate_one(struct dentry 
*dentry,
dentry = result;
}
 
-   if (positive && UNIONFS_I(dentry->d_inode)->stale) {
+   if (unlikely(positive && UNIONFS_I(dentry->d_inode)->stale)) {
make_bad_inode(dentry->d_inode);
d_drop(dentry);
valid = false;
@@ -153,8 +153,8 @@ static bool __unionfs_d_revalidate_one(struct dentry 
*dentry,
BUG_ON(bstart == -1);
for (bindex = bstart; bindex <= bend; bindex++) {
lower_dentry = unionfs_lower_dentry_idx(dentry, bindex);
-   if (!lower_dentry || !lower_dentry->d_op
-   || !lower_dentry->d_op->d_revalidate)
+   if (unlikely(!lower_dentry || !lower_dentry->d_op
+|| 

[PATCH 20/25] Unionfs: add un/likely conditionals on rename ops

2007-09-25 Thread Erez Zadok
Signed-off-by: Erez Zadok [EMAIL PROTECTED]
---
 fs/unionfs/rename.c |   96 +-
 1 files changed, 48 insertions(+), 48 deletions(-)

diff --git a/fs/unionfs/rename.c b/fs/unionfs/rename.c
index 7b8fe39..92c4515 100644
--- a/fs/unionfs/rename.c
+++ b/fs/unionfs/rename.c
@@ -39,7 +39,7 @@ static int __unionfs_rename(struct inode *old_dir, struct 
dentry *old_dentry,
create_parents(new_dentry->d_parent->d_inode,
   new_dentry, new_dentry->d_name.name,
   bindex);
-   if (IS_ERR(lower_new_dentry)) {
+   if (unlikely(IS_ERR(lower_new_dentry))) {
printk(KERN_DEBUG "unionfs: error creating directory "
   "tree for rename, bindex = %d, err = %ld\n",
   bindex, PTR_ERR(lower_new_dentry));
@@ -50,7 +50,7 @@ static int __unionfs_rename(struct inode *old_dir, struct 
dentry *old_dentry,
 
wh_name = alloc_whname(new_dentry->d_name.name,
   new_dentry->d_name.len);
-   if (IS_ERR(wh_name)) {
+   if (unlikely(IS_ERR(wh_name))) {
err = PTR_ERR(wh_name);
goto out;
}
@@ -58,14 +58,14 @@ static int __unionfs_rename(struct inode *old_dir, struct 
dentry *old_dentry,
lower_wh_dentry = lookup_one_len(wh_name, lower_new_dentry->d_parent,
 new_dentry->d_name.len +
 UNIONFS_WHLEN);
-   if (IS_ERR(lower_wh_dentry)) {
+   if (unlikely(IS_ERR(lower_wh_dentry))) {
err = PTR_ERR(lower_wh_dentry);
goto out;
}
 
if (lower_wh_dentry->d_inode) {
/* get rid of the whiteout that is existing */
-   if (lower_new_dentry->d_inode) {
+   if (unlikely(lower_new_dentry->d_inode)) {
printk(KERN_WARNING "unionfs: both a whiteout and a "
   "dentry exist when doing a rename!\n");
err = -EIO;
@@ -81,7 +81,7 @@ static int __unionfs_rename(struct inode *old_dir, struct 
dentry *old_dentry,
 
dput(lower_wh_dentry);
unlock_dir(lower_wh_dir_dentry);
-   if (err)
+   if (unlikely(err))
goto out;
} else
dput(lower_wh_dentry);
@@ -93,7 +93,7 @@ static int __unionfs_rename(struct inode *old_dir, struct 
dentry *old_dentry,
lock_rename(lower_old_dir_dentry, lower_new_dir_dentry);
 
err = is_robranch_super(old_dentry->d_sb, bindex);
-   if (err)
+   if (unlikely(err))
goto out_unlock;
 
/*
@@ -105,14 +105,14 @@ static int __unionfs_rename(struct inode *old_dir, struct 
dentry *old_dentry,
whname = alloc_whname(old_dentry->d_name.name,
  old_dentry->d_name.len);
err = PTR_ERR(whname);
-   if (IS_ERR(whname))
+   if (unlikely(IS_ERR(whname)))
goto out_unlock;
*wh_old = lookup_one_len(whname, lower_old_dir_dentry,
 old_dentry->d_name.len +
 UNIONFS_WHLEN);
kfree(whname);
err = PTR_ERR(*wh_old);
-   if (IS_ERR(*wh_old)) {
+   if (unlikely(IS_ERR(*wh_old))) {
*wh_old = NULL;
goto out_unlock;
}
@@ -129,7 +129,7 @@ out_unlock:
dput(lower_old_dentry);
 
 out:
-   if (!err) {
+   if (likely(!err)) {
/* Fixup the new_dentry. */
if (bindex < dbstart(new_dentry))
set_dbstart(new_dentry, bindex);
@@ -174,8 +174,8 @@ static int do_unionfs_rename(struct inode *old_dir,
/* Rename source to destination. */
err = __unionfs_rename(old_dir, old_dentry, new_dir, new_dentry,
   old_bstart, wh_old);
-   if (err) {
-   if (!IS_COPYUP_ERR(err))
+   if (unlikely(err)) {
+   if (unlikely(!IS_COPYUP_ERR(err)))
goto out;
do_copyup = old_bstart - 1;
} else
@@ -190,7 +190,7 @@ static int do_unionfs_rename(struct inode *old_dir,
struct dentry *unlink_dir_dentry;
 
unlink_dentry = unionfs_lower_dentry_idx(new_dentry, bindex);
-   if (!unlink_dentry)
+   if (unlikely(!unlink_dentry))
continue;
 
unlink_dir_dentry = lock_parent(unlink_dentry);
@@ -205,15 +205,15 @@ static int do_unionfs_rename(struct inode *old_dir,
unionfs_get_nlinks(new_dentry->d_parent->d_inode);
 
unlock_dir(unlink_dir_dentry);
-   if (!err) {
+  

[PATCH 09/25] Unionfs: add un/likely conditionals on common fileops

2007-09-25 Thread Erez Zadok
Signed-off-by: Erez Zadok [EMAIL PROTECTED]
---
 fs/unionfs/commonfops.c |   94 +++---
 1 files changed, 47 insertions(+), 47 deletions(-)

diff --git a/fs/unionfs/commonfops.c b/fs/unionfs/commonfops.c
index e69ccf6..db8f064 100644
--- a/fs/unionfs/commonfops.c
+++ b/fs/unionfs/commonfops.c
@@ -64,7 +64,7 @@ retry:
 
tmp_dentry = lookup_one_len(name, lower_dentry->d_parent,
nlen);
-   if (IS_ERR(tmp_dentry)) {
+   if (unlikely(IS_ERR(tmp_dentry))) {
err = PTR_ERR(tmp_dentry);
goto out;
}
@@ -73,8 +73,8 @@ retry:
 
err = copyup_named_file(dentry->d_parent->d_inode, file, name, bstart,
bindex, file->f_path.dentry->d_inode->i_size);
-   if (err) {
-   if (err == -EEXIST)
+   if (unlikely(err)) {
+   if (unlikely(err == -EEXIST))
goto retry;
goto out;
}
@@ -91,7 +91,7 @@ retry:
unlock_dir(lower_dir_dentry);
 
 out:
-   if (!err)
+   if (likely(!err))
unionfs_check_dentry(dentry);
return err;
 }
@@ -126,7 +126,7 @@ static void cleanup_file(struct file *file)
 */
old_bid = UNIONFS_F(file)->saved_branch_ids[bindex];
i = branch_id_to_idx(sb, old_bid);
-   if (i < 0) {
+   if (unlikely(i < 0)) {
printk(KERN_ERR "unionfs: no superblock for "
   "file %p\n", file);
continue;
@@ -179,7 +179,7 @@ static int open_all_files(struct file *file)
dentry_open(lower_dentry,
unionfs_lower_mnt_idx(dentry, bindex),
file->f_flags);
-   if (IS_ERR(lower_file)) {
+   if (unlikely(IS_ERR(lower_file))) {
err = PTR_ERR(lower_file);
goto out;
} else
@@ -208,7 +208,7 @@ static int open_highest_file(struct file *file, bool 
willwrite)
for (bindex = bstart - 1; bindex >= 0; bindex--) {
err = copyup_file(parent_inode, file, bstart, bindex,
  inode_size);
-   if (!err)
+   if (likely(!err))
break;
}
atomic_set(&UNIONFS_F(file)->generation,
@@ -222,7 +222,7 @@ static int open_highest_file(struct file *file, bool 
willwrite)
lower_file = dentry_open(lower_dentry,
 unionfs_lower_mnt_idx(dentry, bstart),
file->f_flags);
-   if (IS_ERR(lower_file)) {
+   if (unlikely(IS_ERR(lower_file))) {
err = PTR_ERR(lower_file);
goto out;
}
@@ -252,17 +252,17 @@ static int do_delayed_copyup(struct file *file)
unionfs_check_file(file);
unionfs_check_dentry(dentry);
for (bindex = bstart - 1; bindex >= 0; bindex--) {
-   if (!d_deleted(dentry))
+   if (likely(!d_deleted(dentry)))
err = copyup_file(parent_inode, file, bstart,
  bindex, inode_size);
else
err = copyup_deleted_file(file, dentry, bstart,
  bindex);
 
-   if (!err)
+   if (likely(!err))
break;
}
-   if (err || (bstart <= fbstart(file)))
+   if (unlikely(err || (bstart <= fbstart(file))))
goto out;
bend = fbend(file);
for (bindex = bstart; bindex <= bend; bindex++) {
@@ -317,8 +317,8 @@ int unionfs_file_revalidate(struct file *file, bool 
willwrite)
 * First revalidate the dentry inside struct file,
 * but not unhashed dentries.
 */
-   if (!d_deleted(dentry) &&
-   !__unionfs_d_revalidate_chain(dentry, NULL, willwrite)) {
+   if (unlikely(!d_deleted(dentry) &&
+!__unionfs_d_revalidate_chain(dentry, NULL, willwrite))) {
err = -ESTALE;
goto out_nofree;
}
@@ -335,8 +335,8 @@ int unionfs_file_revalidate(struct file *file, bool 
willwrite)
 * someone has copied up this file from underneath us, we also need
 * to refresh things.
 */
-   if (!d_deleted(dentry) &&
-   (sbgen > fgen || dbstart(dentry) != fbstart(file))) {
+   if (unlikely(!d_deleted(dentry) &&
+(sbgen > fgen || dbstart(dentry) != fbstart(file)))) {
/* save orig branch ID */
int orig_brid = UNIONFS_F(file)->saved_branch_ids[fbstart(file)];
 
@@ -349,13 +349,13 @@ int unionfs_file_revalidate(struct file *file, bool 
willwrite)
 
  

[PATCH 10/25] Unionfs: add un/likely conditionals on copyup ops

2007-09-25 Thread Erez Zadok
Signed-off-by: Erez Zadok [EMAIL PROTECTED]
---
 fs/unionfs/copyup.c |  102 +-
 1 files changed, 51 insertions(+), 51 deletions(-)

diff --git a/fs/unionfs/copyup.c b/fs/unionfs/copyup.c
index 23ac4c8..e3c5f15 100644
--- a/fs/unionfs/copyup.c
+++ b/fs/unionfs/copyup.c
@@ -36,14 +36,14 @@ static int copyup_xattrs(struct dentry *old_lower_dentry,
 
/* query the actual size of the xattr list */
list_size = vfs_listxattr(old_lower_dentry, NULL, 0);
-   if (list_size <= 0) {
+   if (unlikely(list_size <= 0)) {
err = list_size;
goto out;
}
 
/* allocate space for the actual list */
name_list = unionfs_xattr_alloc(list_size + 1, XATTR_LIST_MAX);
-   if (!name_list || IS_ERR(name_list)) {
+   if (unlikely(!name_list || IS_ERR(name_list))) {
err = PTR_ERR(name_list);
goto out;
}
@@ -52,14 +52,14 @@ static int copyup_xattrs(struct dentry *old_lower_dentry,
 
/* now get the actual xattr list of the source file */
list_size = vfs_listxattr(old_lower_dentry, name_list, list_size);
-   if (list_size <= 0) {
+   if (unlikely(list_size <= 0)) {
err = list_size;
goto out;
}
 
/* allocate space to hold each xattr's value */
attr_value = unionfs_xattr_alloc(XATTR_SIZE_MAX, XATTR_SIZE_MAX);
-   if (!attr_value || IS_ERR(attr_value)) {
+   if (unlikely(!attr_value || IS_ERR(attr_value))) {
err = PTR_ERR(name_list);
goto out;
}
@@ -73,11 +73,11 @@ static int copyup_xattrs(struct dentry *old_lower_dentry,
size = vfs_getxattr(old_lower_dentry, name_list,
attr_value, XATTR_SIZE_MAX);
mutex_unlock(&old_lower_dentry->d_inode->i_mutex);
-   if (size < 0) {
+   if (unlikely(size < 0)) {
err = size;
goto out;
}
-   if (size > XATTR_SIZE_MAX) {
+   if (unlikely(size > XATTR_SIZE_MAX)) {
err = -E2BIG;
goto out;
}
@@ -91,13 +91,13 @@ static int copyup_xattrs(struct dentry *old_lower_dentry,
 * temporarily get FOWNER privileges.
 * XXX: move entire copyup code to SIOQ.
 */
-   if (err == -EPERM && !capable(CAP_FOWNER)) {
+   if (unlikely(err == -EPERM && !capable(CAP_FOWNER))) {
cap_raise(current->cap_effective, CAP_FOWNER);
err = vfs_setxattr(new_lower_dentry, name_list,
   attr_value, size, 0);
cap_lower(current->cap_effective, CAP_FOWNER);
}
-   if (err < 0)
+   if (unlikely(err < 0))
goto out;
name_list += strlen(name_list) + 1;
}
@@ -105,7 +105,7 @@ out:
unionfs_xattr_kfree(name_list_buf);
unionfs_xattr_kfree(attr_value);
/* Ignore if xattr isn't supported */
-   if (err == -ENOTSUPP || err == -EOPNOTSUPP)
+   if (unlikely(err == -ENOTSUPP || err == -EOPNOTSUPP))
err = 0;
return err;
 }
@@ -136,15 +136,15 @@ static int copyup_permissions(struct super_block *sb,
ATTR_ATIME_SET | ATTR_MTIME_SET | ATTR_FORCE |
ATTR_GID | ATTR_UID;
err = notify_change(new_lower_dentry, &newattrs);
-   if (err)
+   if (unlikely(err))
goto out;
 
/* now try to change the mode and ignore EOPNOTSUPP on symlinks */
newattrs.ia_mode = i->i_mode;
newattrs.ia_valid = ATTR_MODE | ATTR_FORCE;
err = notify_change(new_lower_dentry, &newattrs);
-   if (err == -EOPNOTSUPP &&
-   S_ISLNK(new_lower_dentry->d_inode->i_mode)) {
+   if (unlikely(err == -EOPNOTSUPP &&
+S_ISLNK(new_lower_dentry->d_inode->i_mode))) {
printk(KERN_WARNING
   "unionfs: changing \"%s\" symlink mode unsupported\n",
   new_lower_dentry->d_name.name);
@@ -178,7 +178,7 @@ static int __copyup_ndentry(struct dentry *old_lower_dentry,
 
run_sioq(__unionfs_mkdir, &args);
err = args.err;
-   } else if (S_ISLNK(old_mode)) {
+   } else if (unlikely(S_ISLNK(old_mode))) {
args.symlink.parent = new_lower_parent_dentry->d_inode;
args.symlink.dentry = new_lower_dentry;
args.symlink.symbuf = symbuf;
@@ -186,8 +186,8 @@ static int __copyup_ndentry(struct dentry *old_lower_dentry,
 
run_sioq(__unionfs_symlink, &args);
err = args.err;
-   } else if (S_ISBLK(old_mode) || S_ISCHR(old_mode) ||
-  S_ISFIFO(old_mode) || S_ISSOCK(old_mode)) {
+   } else if 

[PATCH 16/25] Unionfs: add un/likely conditionals on inode ops

2007-09-25 Thread Erez Zadok
Signed-off-by: Erez Zadok [EMAIL PROTECTED]
---
 fs/unionfs/inode.c |  160 ++--
 1 files changed, 80 insertions(+), 80 deletions(-)

diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c
index 7ee4760..7ae4a25 100644
--- a/fs/unionfs/inode.c
+++ b/fs/unionfs/inode.c
@@ -35,7 +35,7 @@ static int unionfs_create(struct inode *parent, struct dentry 
*dentry,
unionfs_lock_dentry(dentry->d_parent);
valid = __unionfs_d_revalidate_chain(dentry->d_parent, nd, false);
unionfs_unlock_dentry(dentry->d_parent);
-   if (!valid) {
+   if (unlikely(!valid)) {
err = -ESTALE;  /* same as what real_lookup does */
goto out;
}
@@ -60,26 +60,26 @@ static int unionfs_create(struct inode *parent, struct 
dentry *dentry,
 * We _always_ create on branch 0
 */
lower_dentry = unionfs_lower_dentry_idx(dentry, 0);
-   if (lower_dentry) {
+   if (likely(lower_dentry)) {
/*
 * check if whiteout exists in this branch, i.e. lookup .wh.foo
 * first.
 */
name = alloc_whname(dentry->d_name.name, dentry->d_name.len);
-   if (IS_ERR(name)) {
+   if (unlikely(IS_ERR(name))) {
err = PTR_ERR(name);
goto out;
}
 
wh_dentry = lookup_one_len(name, lower_dentry->d_parent,
   dentry->d_name.len + UNIONFS_WHLEN);
-   if (IS_ERR(wh_dentry)) {
+   if (unlikely(IS_ERR(wh_dentry))) {
err = PTR_ERR(wh_dentry);
wh_dentry = NULL;
goto out;
}
 
-   if (wh_dentry->d_inode) {
+   if (unlikely(wh_dentry->d_inode)) {
/*
 * .wh.foo has been found, so let's unlink it
 */
@@ -89,7 +89,7 @@ static int unionfs_create(struct inode *parent, struct dentry 
*dentry,
err = vfs_unlink(lower_dir_dentry->d_inode, wh_dentry);
unlock_dir(lower_dir_dentry);
 
-   if (err) {
+   if (unlikely(err)) {
printk("unionfs_create: could not unlink "
   "whiteout, err = %d\n", err);
goto out;
@@ -102,28 +102,28 @@ static int unionfs_create(struct inode *parent, struct 
dentry *dentry,
 */
lower_dentry = create_parents(parent, dentry,
  dentry->d_name.name, 0);
-   if (IS_ERR(lower_dentry)) {
+   if (unlikely(IS_ERR(lower_dentry))) {
err = PTR_ERR(lower_dentry);
goto out;
}
}
 
lower_parent_dentry = lock_parent(lower_dentry);
-   if (IS_ERR(lower_parent_dentry)) {
+   if (unlikely(IS_ERR(lower_parent_dentry))) {
err = PTR_ERR(lower_parent_dentry);
goto out;
}
 
err = init_lower_nd(&lower_nd, LOOKUP_CREATE);
-   if (err < 0)
+   if (unlikely(err < 0))
goto out;
err = vfs_create(lower_parent_dentry->d_inode, lower_dentry, mode,
 &lower_nd);
release_lower_nd(&lower_nd, err);
 
-   if (!err) {
+   if (likely(!err)) {
err = PTR_ERR(unionfs_interpose(dentry, parent->i_sb, 0));
-   if (!err) {
+   if (likely(!err)) {
unionfs_copy_attr_times(parent);
fsstack_copy_inode_size(parent,
lower_parent_dentry->d_inode);
@@ -138,13 +138,13 @@ out:
dput(wh_dentry);
kfree(name);
 
-   if (!err)
+   if (likely(!err))
unionfs_postcopyup_setmnt(dentry);
unionfs_unlock_dentry(dentry);
unionfs_read_unlock(dentry->d_sb);
 
unionfs_check_inode(parent);
-   if (!err) {
+   if (likely(!err)) {
unionfs_check_dentry(dentry->d_parent);
unionfs_check_nd(nd);
}
@@ -183,7 +183,7 @@ static struct dentry *unionfs_lookup(struct inode *parent,
nd->dentry = path_save.dentry;
nd->mnt = path_save.mnt;
}
-   if (!IS_ERR(ret)) {
+   if (likely(!IS_ERR(ret))) {
if (ret)
dentry = ret;
/* parent times may have changed */
@@ -213,12 +213,12 @@ static int unionfs_link(struct dentry *old_dentry, struct inode *dir,
unionfs_read_lock(old_dentry->d_sb);
unionfs_double_lock_dentry(new_dentry, old_dentry);
 
-   if (!__unionfs_d_revalidate_chain(old_dentry, NULL, false)) {
+   if (unlikely(!__unionfs_d_revalidate_chain(old_dentry, NULL, 

[PATCH 18/25] Unionfs: add un/likely conditionals on super ops

2007-09-25 Thread Erez Zadok
Signed-off-by: Erez Zadok [EMAIL PROTECTED]
---
 fs/unionfs/main.c  |   98 ++-
 fs/unionfs/super.c |   90 
 2 files changed, 95 insertions(+), 93 deletions(-)

diff --git a/fs/unionfs/main.c b/fs/unionfs/main.c
index 8595750..82cb35a 100644
--- a/fs/unionfs/main.c
+++ b/fs/unionfs/main.c
@@ -32,13 +32,13 @@ static void unionfs_fill_inode(struct dentry *dentry,
 
for (bindex = bstart; bindex <= bend; bindex++) {
lower_dentry = unionfs_lower_dentry_idx(dentry, bindex);
-   if (!lower_dentry) {
+   if (unlikely(!lower_dentry)) {
unionfs_set_lower_inode_idx(inode, bindex, NULL);
continue;
}
 
/* Initialize the lower inode to the new lower inode. */
-   if (!lower_dentry->d_inode)
+   if (unlikely(!lower_dentry->d_inode))
continue;
 
unionfs_set_lower_inode_idx(inode, bindex,
@@ -52,7 +52,7 @@ static void unionfs_fill_inode(struct dentry *dentry,
lower_inode = unionfs_lower_inode(inode);
 
/* Use different set of inode ops for symlinks & directories */
-   if (S_ISLNK(lower_inode->i_mode))
+   if (unlikely(S_ISLNK(lower_inode->i_mode)))
inode->i_op = &unionfs_symlink_iops;
else if (S_ISDIR(lower_inode->i_mode))
inode->i_op = &unionfs_dir_iops;
@@ -62,8 +62,10 @@ static void unionfs_fill_inode(struct dentry *dentry,
inode->i_fop = &unionfs_dir_fops;
 
/* properly initialize special inodes */
-   if (S_ISBLK(lower_inode->i_mode) || S_ISCHR(lower_inode->i_mode) ||
-   S_ISFIFO(lower_inode->i_mode) || S_ISSOCK(lower_inode->i_mode))
+   if (unlikely(S_ISBLK(lower_inode->i_mode) ||
+S_ISCHR(lower_inode->i_mode) ||
+S_ISFIFO(lower_inode->i_mode) ||
+S_ISSOCK(lower_inode->i_mode)))
init_special_inode(inode, lower_inode->i_mode,
   lower_inode->i_rdev);
 
@@ -122,14 +124,14 @@ struct dentry *unionfs_interpose(struct dentry *dentry, struct super_block *sb,
 
UNIONFS_I(inode)->lower_inodes =
kcalloc(sbmax(sb), sizeof(struct inode *), GFP_KERNEL);
-   if (!UNIONFS_I(inode)->lower_inodes) {
+   if (unlikely(!UNIONFS_I(inode)->lower_inodes)) {
err = -ENOMEM;
goto out;
}
} else {
/* get unique inode number for unionfs */
inode = iget(sb, iunique(sb, UNIONFS_ROOT_INO));
-   if (!inode) {
+   if (unlikely(!inode)) {
err = -EACCES;
goto out;
}
@@ -149,7 +151,7 @@ skip:
break;
case INTERPOSE_LOOKUP:
spliced = d_splice_alias(inode, dentry);
-   if (IS_ERR(spliced))
+   if (unlikely(IS_ERR(spliced)))
err = PTR_ERR(spliced);
else if (spliced && spliced != dentry) {
/*
@@ -181,7 +183,7 @@ skip:
goto out;
 
 out_spliced:
-   if (!err)
+   if (likely(!err))
return spliced;
 out:
return ERR_PTR(err);
@@ -203,12 +205,12 @@ void unionfs_reinterpose(struct dentry *dentry)
bend = dbend(dentry);
for (bindex = bstart; bindex <= bend; bindex++) {
lower_dentry = unionfs_lower_dentry_idx(dentry, bindex);
-   if (!lower_dentry)
+   if (unlikely(!lower_dentry))
continue;
 
-   if (!lower_dentry->d_inode)
+   if (unlikely(!lower_dentry->d_inode))
continue;
-   if (unionfs_lower_inode_idx(inode, bindex))
+   if (unlikely(unionfs_lower_inode_idx(inode, bindex)))
continue;
unionfs_set_lower_inode_idx(inode, bindex,
igrab(lower_dentry->d_inode));
@@ -227,11 +229,11 @@ void unionfs_reinterpose(struct dentry *dentry)
 int check_branch(struct nameidata *nd)
 {
/* XXX: remove in ODF code -- stacking unions allowed there */
-   if (!strcmp(nd->dentry->d_sb->s_type->name, "unionfs"))
+   if (unlikely(!strcmp(nd->dentry->d_sb->s_type->name, "unionfs")))
return -EINVAL;
-   if (!nd->dentry->d_inode)
+   if (unlikely(!nd->dentry->d_inode))
return -ENOENT;
-   if (!S_ISDIR(nd->dentry->d_inode->i_mode))
+   if (unlikely(!S_ISDIR(nd->dentry->d_inode->i_mode)))
return -ENOTDIR;
return 0;
 }
@@ -245,7 +247,7 @@ static int is_branch_overlap(struct dentry *dent1, struct dentry *dent2)
while ((dent != dent2) && (dent->d_parent != dent))
dent = dent->d_parent;
 
-   if 

[PATCH 02/25] Unionfs: Remove unused #defines

2007-09-25 Thread Erez Zadok
From: Josef 'Jeff' Sipek [EMAIL PROTECTED]

Signed-off-by: Josef 'Jeff' Sipek [EMAIL PROTECTED]
Signed-off-by: Erez Zadok [EMAIL PROTECTED]
---
 fs/unionfs/union.h |4 
 1 files changed, 0 insertions(+), 4 deletions(-)

diff --git a/fs/unionfs/union.h b/fs/unionfs/union.h
index 1cb2e1d..140b8ae 100644
--- a/fs/unionfs/union.h
+++ b/fs/unionfs/union.h
@@ -437,10 +437,6 @@ static inline int is_robranch(const struct dentry *dentry)
 #define UNIONFS_DIR_OPAQUE_NAME "__dir_opaque"
 #define UNIONFS_DIR_OPAQUE UNIONFS_WHPFX UNIONFS_DIR_OPAQUE_NAME
 
-#ifndef DEFAULT_POLLMASK
-#define DEFAULT_POLLMASK (POLLIN | POLLOUT | POLLRDNORM | POLLWRNORM)
-#endif /* not DEFAULT_POLLMASK */
-
 /*
  * EXTERNALS:
  */
-- 
1.5.2.2

-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 04/25] Unionfs: cache-coherency fixes

2007-09-25 Thread Erez Zadok
From: Olivier Blin [EMAIL PROTECTED]

Do not update mtime if there is no upper branch for the inode.  This
prevents calling unionfs_lower_inode_idx() with a negative index, which
triggers a bug.

Signed-off-by: Olivier Blin [EMAIL PROTECTED]
Signed-off-by: Erez Zadok [EMAIL PROTECTED]
---
 fs/unionfs/fanout.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/fs/unionfs/fanout.h b/fs/unionfs/fanout.h
index afeb9f6..51aa0de 100644
--- a/fs/unionfs/fanout.h
+++ b/fs/unionfs/fanout.h
@@ -308,7 +308,7 @@ static inline void unionfs_copy_attr_times(struct inode *upper)
int bindex;
struct inode *lower;
 
-   if (!upper)
+   if (!upper || ibstart(upper) < 0)
return;
for (bindex=ibstart(upper); bindex <= ibend(upper); bindex++) {
lower = unionfs_lower_inode_idx(upper, bindex);
-- 
1.5.2.2
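
The guard added in this patch can be illustrated outside the kernel. The following is only a minimal sketch: `struct mock_upper`, `MAX_BRANCHES`, and `mock_copy_attr_times()` are hypothetical stand-ins for the unionfs inode fan-out data, and the "keep the newest mtime" loop body is an assumption made for illustration. The point is the early return when the start index is negative, so the loop never indexes with -1.

```c
#include <stddef.h>

#define MAX_BRANCHES 4	/* hypothetical fan-out limit for this sketch */

/* Hypothetical stand-in for the unionfs inode fan-out data. */
struct mock_upper {
	int bstart;			/* first valid branch, or -1 if none */
	int bend;			/* last valid branch */
	long lower_mtime[MAX_BRANCHES];	/* mtime of each lower inode */
	long mtime;			/* mtime of the upper inode */
};

/* Copy the newest lower mtime up; returns 0 if nothing was done. */
static int mock_copy_attr_times(struct mock_upper *upper)
{
	int bindex;

	/* Mirrors the patched check: bail out before indexing with -1. */
	if (!upper || upper->bstart < 0)
		return 0;
	for (bindex = upper->bstart; bindex <= upper->bend; bindex++)
		if (upper->lower_mtime[bindex] > upper->mtime)
			upper->mtime = upper->lower_mtime[bindex];
	return 1;
}
```

With `bstart` set to -1 the function returns immediately and leaves `mtime` untouched, which is the case the commit message describes.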



[PATCH 01/25] Unionfs: Simplify unionfs_get_nlinks

2007-09-25 Thread Erez Zadok
From: Josef 'Jeff' Sipek [EMAIL PROTECTED]

Since we set the right value for d_type in readdir, there's really no point
in having to calculate the number of directory links.  Some on-disk
filesystems don't even store the number of links for directories.

Signed-off-by: Josef 'Jeff' Sipek [EMAIL PROTECTED]
Signed-off-by: Erez Zadok [EMAIL PROTECTED]
---
 fs/unionfs/subr.c |   41 +++--
 1 files changed, 7 insertions(+), 34 deletions(-)

diff --git a/fs/unionfs/subr.c b/fs/unionfs/subr.c
index b7e7904..6b93b64 100644
--- a/fs/unionfs/subr.c
+++ b/fs/unionfs/subr.c
@@ -188,16 +188,10 @@ out:
 }
 
 /*
- * returns the sum of the n_link values of all the underlying inodes of the
- * passed inode
+ * returns the right n_link value based on the inode type
  */
 int unionfs_get_nlinks(const struct inode *inode)
 {
-   int sum_nlinks = 0;
-   int dirs = 0;
-   int bindex;
-   struct inode *lower_inode;
-
/* don't bother to do all the work since we're unlinked */
if (inode->i_nlink == 0)
return 0;
@@ -205,33 +199,12 @@ int unionfs_get_nlinks(const struct inode *inode)
if (!S_ISDIR(inode->i_mode))
return unionfs_lower_inode(inode)->i_nlink;
 
-   for (bindex = ibstart(inode); bindex <= ibend(inode); bindex++) {
-   lower_inode = unionfs_lower_inode_idx(inode, bindex);
-
-   /* ignore files */
-   if (!lower_inode || !S_ISDIR(lower_inode->i_mode))
-   continue;
-
-   BUG_ON(lower_inode->i_nlink < 0);
-
-   /* A deleted directory. */
-   if (lower_inode->i_nlink == 0)
-   continue;
-   dirs++;
-
-   /*
-* A broken directory...
-*
-* Some filesystems don't properly set the number of links
-* on empty directories
-*/
-   if (lower_inode->i_nlink == 1)
-   sum_nlinks += 2;
-   else
-   sum_nlinks += (lower_inode->i_nlink - 2);
-   }
-
-   return (!dirs ? 0 : sum_nlinks + 2);
+   /*
+* For directories, we return 1. The only place that could cares
+* about links is readdir, and there's d_type there so even that
+* doesn't matter.
+*/
+   return 1;
 }
 
 /* construct whiteout filename */
-- 
1.5.2.2
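
For readers who want the resulting logic in one piece, here is a minimal userspace sketch of the simplified function after this patch. `struct mock_inode` and `mock_get_nlinks()` are hypothetical stand-ins for the kernel inode and `unionfs_get_nlinks()`; only the three-way decision is taken from the diff above.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the fields the function looks at. */
struct mock_inode {
	unsigned int nlink;		/* link count of the union inode */
	bool is_dir;			/* true if the inode is a directory */
	unsigned int lower_nlink;	/* link count of the lower inode */
};

static int mock_get_nlinks(const struct mock_inode *inode)
{
	/* unlinked: nothing to report */
	if (inode->nlink == 0)
		return 0;

	/* regular files: pass through the lower inode's count */
	if (!inode->is_dir)
		return (int)inode->lower_nlink;

	/* directories: a constant, since readdir supplies d_type */
	return 1;
}
```

The per-branch summation removed by the patch has no counterpart here, which is why the diffstat shows 34 deletions against 7 insertions.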



[PATCH 25/25] Unionfs: use poison.h for safe poison pointers

2007-09-25 Thread Erez Zadok
This also fixes a compile warning on 64-bit systems.

Signed-off-by: Josef 'Jeff' Sipek [EMAIL PROTECTED]
Signed-off-by: Erez Zadok [EMAIL PROTECTED]
---
 fs/unionfs/debug.c |   16 ++--
 fs/unionfs/union.h |1 +
 2 files changed, 7 insertions(+), 10 deletions(-)

diff --git a/fs/unionfs/debug.c b/fs/unionfs/debug.c
index 09b52ce..b103eb9 100644
--- a/fs/unionfs/debug.c
+++ b/fs/unionfs/debug.c
@@ -25,14 +25,6 @@
}   \
} while (0)
 
-#if BITS_PER_LONG == 32
-#define POISONED_PTR   ((void*) 0x5a5a5a5a)
-#elif BITS_PER_LONG == 64
-#define POISONED_PTR   ((void*) 0x5a5a5a5a5a5a5a5a)
-#else
-#error Unknown BITS_PER_LONG value
-#endif /* BITS_PER_LONG != known */
-
 /*
  * __unionfs_check_{inode,dentry,file} perform exhaustive sanity checking on
  * the fan-out of various Unionfs objects.  We check that no lower objects
@@ -50,6 +42,7 @@ void __unionfs_check_inode(const struct inode *inode,
struct inode *lower_inode;
struct super_block *sb;
int printed_caller = 0;
+   void *poison_ptr;
 
/* for inodes now */
BUG_ON(!inode);
@@ -88,12 +81,13 @@ void __unionfs_check_inode(const struct inode *inode,
}
lower_inode = unionfs_lower_inode_idx(inode, bindex);
if (lower_inode) {
+   memset(&poison_ptr, POISON_INUSE, sizeof(void *));
if (unlikely(bindex < istart || bindex > iend)) {
PRINT_CALLER(fname, fxn, line);
printk(" Ci5: inode/linode=%p:%p bindex=%d "
   "istart/end=%d:%d\n", inode,
   lower_inode, bindex, istart, iend);
-   } else if (unlikely(lower_inode == POISONED_PTR)) {
+   } else if (unlikely(lower_inode == poison_ptr)) {
/* freed inode! */
PRINT_CALLER(fname, fxn, line);
printk(" Ci6: inode/linode=%p:%p bindex=%d "
@@ -131,6 +125,7 @@ void __unionfs_check_dentry(const struct dentry *dentry,
struct super_block *sb;
struct vfsmount *lower_mnt;
int printed_caller = 0;
+   void *poison_ptr;
 
BUG_ON(!dentry);
sb = dentry->d_sb;
@@ -257,12 +252,13 @@ void __unionfs_check_dentry(const struct dentry *dentry,
for (bindex = sbstart(sb); bindex < sbmax(sb); bindex++) {
lower_inode = unionfs_lower_inode_idx(inode, bindex);
if (lower_inode) {
+   memset(&poison_ptr, POISON_INUSE, sizeof(void *));
if (unlikely(bindex < istart || bindex > iend)) {
PRINT_CALLER(fname, fxn, line);
printk(" CI5: dentry/linode=%p:%p bindex=%d "
   "istart/end=%d:%d\n", dentry,
   lower_inode, bindex, istart, iend);
-   } else if (unlikely(lower_inode == POISONED_PTR)) {
+   } else if (unlikely(lower_inode == poison_ptr)) {
/* freed inode! */
PRINT_CALLER(fname, fxn, line);
printk(" CI6: dentry/linode=%p:%p bindex=%d "
diff --git a/fs/unionfs/union.h b/fs/unionfs/union.h
index 8df44a9..510267f 100644
--- a/fs/unionfs/union.h
+++ b/fs/unionfs/union.h
@@ -43,6 +43,7 @@
 #include <linux/fs_stack.h>
 #include <linux/magic.h>
 #include <linux/log2.h>
+#include <linux/poison.h>
 
 #include <asm/mman.h>
 #include <asm/system.h>
-- 
1.5.2.2
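
The memset approach is what removes the need for the BITS_PER_LONG cases: filling every byte of a pointer with the poison byte yields the right constant at any pointer width. A minimal userspace sketch follows. `POISON_INUSE` is 0x5a in the kernel's linux/poison.h and is redefined locally here; `make_poison_ptr()` and `is_poisoned()` are hypothetical helper names, not part of the patch.

```c
#include <stdint.h>
#include <string.h>

#define POISON_INUSE 0x5a	/* same value as in the kernel's linux/poison.h */

/* Build a pointer whose every byte is the poison byte, whatever its width. */
static void *make_poison_ptr(void)
{
	void *p;

	memset(&p, POISON_INUSE, sizeof(p));
	return p;
}

/* Nonzero if the candidate looks like it came from poisoned memory. */
static int is_poisoned(const void *candidate)
{
	return candidate == make_poison_ptr();
}
```

Note that the address of the pointer variable is passed to memset, so the pointer's own bytes are filled; the result is only ever compared, never dereferenced.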



Re: [PATCH 10/25] Unionfs: add un/likely conditionals on copyup ops

2007-09-25 Thread Kok, Auke
Erez Zadok wrote:
> Signed-off-by: Erez Zadok [EMAIL PROTECTED]
> ---
>  fs/unionfs/copyup.c |  102 +-
>  1 files changed, 51 insertions(+), 51 deletions(-)
> 
> diff --git a/fs/unionfs/copyup.c b/fs/unionfs/copyup.c
> index 23ac4c8..e3c5f15 100644
> --- a/fs/unionfs/copyup.c
> +++ b/fs/unionfs/copyup.c
> @@ -36,14 +36,14 @@ static int copyup_xattrs(struct dentry *old_lower_dentry,
>  
>   /* query the actual size of the xattr list */
>   list_size = vfs_listxattr(old_lower_dentry, NULL, 0);
> - if (list_size <= 0) {
> + if (unlikely(list_size <= 0)) {


I've been told several times that adding these is almost always bogus - either
it messes up the CPU branch prediction or the compiler/CPU just does a lot
better at finding the right way without these hints.

Adding them as a blanket seems rather strange. Have you got any numbers that
this really improves performance?

Auke
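
For context on what these annotations do: the kernel defines both macros in include/linux/compiler.h in terms of GCC's `__builtin_expect`, which only tells the compiler which outcome to lay out as the fall-through path. They do not change the value of the condition. A minimal userspace sketch, assuming GCC or Clang; `sum_or_error()` is a hypothetical helper used only to show the annotation on an error path:

```c
#include <stddef.h>

/* Same shape as the kernel's definitions; requires GCC or Clang. */
#define likely(x)	__builtin_expect(!!(x), 1)
#define unlikely(x)	__builtin_expect(!!(x), 0)

/* Hypothetical helper: sum a buffer, treating NULL as the rare error path. */
static long sum_or_error(const int *buf, size_t n)
{
	long sum = 0;
	size_t i;

	if (unlikely(buf == NULL))
		return -1;
	for (i = 0; i < n; i++)
		sum += buf[i];
	return sum;
}
```

Whether a given hint helps can only be settled by measurement on the workload in question, which is the point of the question above.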


Re: Upgrading datastructures between different filesystem versions

2007-09-25 Thread Jim Cromie

kernel learner wrote:

> Hi,
>
> ext3 filesystem has 32-bit block address and ext4 filesystem has
> 48-bit block address. If a user installs ext4, how will the file
> system handle already existing block with 32 bit values?

Why should it? That's what ext3 is for. Your kernel can have both
filesystems supported, and would use the ext3 driver for the ext3
filesystems it's asked to mount.

I'd expect ext4 drivers handling ext3 filesystems to be a distant,
secondary goal to getting a fast, reliable, clean 48-bit filesystem working.

> Can anyone point me to the correct pointer for this backward
> compatibility stuff? I searched for it but couldn't find much info. Is
> the work still pending on this front?
>
> Thanks,
> KL
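
To put the two address widths from the question in perspective, the arithmetic can be sketched as below. The 4 KiB block size (a shift of 12) is an assumption chosen for illustration, and `capacity_tib()` is a hypothetical helper, not an ext3 or ext4 function.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Addressable capacity in TiB for a given block-number width and block
 * size, where the block size is 2^block_shift bytes.
 */
static uint64_t capacity_tib(unsigned addr_bits, unsigned block_shift)
{
	unsigned total_bits = addr_bits + block_shift;

	/* 1 TiB is 2^40 bytes; keep the shift within a 64-bit result. */
	assert(total_bits >= 40 && total_bits - 40 < 64);
	return (uint64_t)1 << (total_bits - 40);
}
```

Under that assumption, `capacity_tib(32, 12)` gives 16 (16 TiB) and `capacity_tib(48, 12)` gives 1048576 (1 EiB), which shows the scale of the difference between the two block-number widths.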

