Re: [Ecryptfs-devel] [PATCH 3/11] eCryptfs: read_write.c routines
On Fri, Sep 21, 2007 at 03:05:40PM -0700, Andrew Morton wrote:
> btw, I'm not really a great admirer of the whole patchset: it does some
> pretty nasty-looking things: allocating dynamic memory, grabbing the
> underlying pageframes with virt_to_page(), passing them back into
> kernel APIs which are supposed to be called from userspace, etc.  It's
> all rather ugly and abusive-looking.  Functions higher up the execution
> stack should be the ones mucking with the Uptodate flag.

The patch below addresses some of these issues. I also whipped up a
post-patch partial call graph to help illustrate what is going on with
the page mapping and Uptodate status in the various eCryptfs read/write
paths:

http://ecryptfs.sourceforge.net/ecryptfs-pageuptodate-call-graph.png

---

The functions that eventually call down to ecryptfs_read_lower(),
ecryptfs_decrypt_page(), and ecryptfs_copy_up_encrypted_with_header()
should have the responsibility of managing the page Uptodate status.
This patch gets rid of some of the ugliness that resulted from trying
to push some of the page flag setting too far down the stack.
Signed-off-by: Michael Halcrow [EMAIL PROTECTED]
---
diff --git a/fs/ecryptfs/crypto.c b/fs/ecryptfs/crypto.c
index b3795f6..bbec711 100644
--- a/fs/ecryptfs/crypto.c
+++ b/fs/ecryptfs/crypto.c
@@ -605,14 +605,14 @@ int ecryptfs_decrypt_page(struct page *page)
 		printk(KERN_ERR "%s: Error attempting to copy "
 		       "page at index [%ld]\n", __FUNCTION__,
 		       page->index);
-		goto out_clear_uptodate;
+		goto out;
 	}
 	enc_extent_virt = kmalloc(PAGE_CACHE_SIZE, GFP_USER);
 	if (!enc_extent_virt) {
 		rc = -ENOMEM;
 		ecryptfs_printk(KERN_ERR, "Error allocating memory for "
 				"encrypted extent\n");
-		goto out_clear_uptodate;
+		goto out;
 	}
 	enc_extent_page = virt_to_page(enc_extent_virt);
 	for (extent_offset = 0;
@@ -631,21 +631,17 @@ int ecryptfs_decrypt_page(struct page *page)
 			ecryptfs_printk(KERN_ERR, "Error attempting "
 					"to read lower page; rc = [%d]"
 					"\n", rc);
-			goto out_clear_uptodate;
+			goto out;
 		}
 		rc = ecryptfs_decrypt_extent(page, crypt_stat, enc_extent_page,
 					     extent_offset);
 		if (rc) {
 			printk(KERN_ERR "%s: Error encrypting extent; "
 			       "rc = [%d]\n", __FUNCTION__, rc);
-			goto out_clear_uptodate;
+			goto out;
 		}
 		extent_offset++;
 	}
-	SetPageUptodate(page);
-	goto out;
-out_clear_uptodate:
-	ClearPageUptodate(page);
 out:
 	kfree(enc_extent_virt);
 	return rc;
diff --git a/fs/ecryptfs/ecryptfs_kernel.h b/fs/ecryptfs/ecryptfs_kernel.h
index bb92b74..ce7a5d4 100644
--- a/fs/ecryptfs/ecryptfs_kernel.h
+++ b/fs/ecryptfs/ecryptfs_kernel.h
@@ -648,6 +648,6 @@ int ecryptfs_read_lower_page_segment(struct page *page_for_ecryptfs,
 				     struct inode *ecryptfs_inode);
 int ecryptfs_read(char *data, loff_t offset, size_t size,
 		  struct file *ecryptfs_file);
-struct page *ecryptfs_get1page(struct file *file, loff_t index);
+struct page *ecryptfs_get_locked_page(struct file *file, loff_t index);
 #endif /* #ifndef ECRYPTFS_KERNEL_H */
diff --git a/fs/ecryptfs/mmap.c b/fs/ecryptfs/mmap.c
index 4eb09c1..16a7a55 100644
--- a/fs/ecryptfs/mmap.c
+++ b/fs/ecryptfs/mmap.c
@@ -37,23 +37,27 @@ struct kmem_cache *ecryptfs_lower_page_cache;
 
 /**
- * ecryptfs_get1page
+ * ecryptfs_get_locked_page
  *
  * Get one page from cache or lower f/s, return error otherwise.
  *
- * Returns unlocked and up-to-date page (if ok), with increased
+ * Returns locked and up-to-date page (if ok), with increased
  * refcnt.
  */
-struct page *ecryptfs_get1page(struct file *file, loff_t index)
+struct page *ecryptfs_get_locked_page(struct file *file, loff_t index)
 {
 	struct dentry *dentry;
 	struct inode *inode;
 	struct address_space *mapping;
+	struct page *page;
 
 	dentry = file->f_path.dentry;
 	inode = dentry->d_inode;
 	mapping = inode->i_mapping;
-	return read_mapping_page(mapping, index, (void *)file);
+	page = read_mapping_page(mapping, index, (void *)file);
+	if (!IS_ERR(page))
+		lock_page(page);
+	return page;
 }
 
 /**
@@ -146,12 +150,10 @@ ecryptfs_copy_up_encrypted_with_header(struct page *page,
 			kunmap_atomic(page_virt, KM_USER0);
 			flush_dcache_page(page);
 			if (rc) {
-				ClearPageUptodate(page);
[patch 0/4] 64k pagesize/blocksize fixes
Attached are the fixes necessary to support a 64k pagesize/blocksize. I
think these are useful independent of the large blocksize patchset,
since there are architectures that support a 64k page size and could use
these large buffer sizes without the large blocksize patchset. Are these
patches in the right shape to be merged?

I rediffed these against 2.6.23-rc8-mm1. I had to fix some things in the
second patch (ext2) that may need some review, since the way that
commits work changed.

--
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel"
in the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
[patch 1/4] Increase limits for 64k page size support for Ext2/3/4
[This patch allows architectures that use 64k pages--like IA64 and
PPC64--to use 64k blocks on ext filesystems.]

The patches to support blocksizes up to the page size (max 64KB) for
ext2/3/4 were originally from Takashi Sato.

http://marc.info/?l=linux-ext4&m=115768873518400&w=2

It's quite simple to support large block sizes in ext2/3/4: mostly just
enlarge the block size limit. But it is NOT possible to have a 64KB
blocksize on ext2/3/4 without some changes to the directory handling
code. The reason is that an empty 64KB directory block would have a
rec_len == (__u16)2^16 == 0, and this would cause an error to be hit in
the filesystem. The proposed solution is to put 2 empty records in such
a directory, or to special-case an impossible value like rec_len =
0xffff to handle this.

Signed-off-by: Takashi Sato [EMAIL PROTECTED]
Signed-off-by: Mingming Cao [EMAIL PROTECTED]
Signed-off-by: Christoph Lameter [EMAIL PROTECTED]
---
 fs/ext2/super.c         |    2 +-
 fs/ext3/super.c         |    5 ++++-
 fs/ext4/super.c         |    5 +++++
 include/linux/ext2_fs.h |    4 ++--
 include/linux/ext3_fs.h |    4 ++--
 include/linux/ext4_fs.h |    4 ++--
 6 files changed, 16 insertions(+), 8 deletions(-)

Index: linux-2.6.23-rc8-mm1/fs/ext2/super.c
===================================================================
--- linux-2.6.23-rc8-mm1.orig/fs/ext2/super.c	2007-09-25 14:53:57.000000000 -0700
+++ linux-2.6.23-rc8-mm1/fs/ext2/super.c	2007-09-25 15:37:34.000000000 -0700
@@ -856,7 +856,7 @@ static int ext2_fill_super(struct super_
 		brelse(bh);
 
 		if (!sb_set_blocksize(sb, blocksize)) {
-			printk(KERN_ERR "EXT2-fs: blocksize too small for device.\n");
+			printk(KERN_ERR "EXT2-fs: bad blocksize %d.\n", blocksize);
 			goto failed_sbi;
 		}
 
Index: linux-2.6.23-rc8-mm1/fs/ext3/super.c
===================================================================
--- linux-2.6.23-rc8-mm1.orig/fs/ext3/super.c	2007-09-25 14:53:57.000000000 -0700
+++ linux-2.6.23-rc8-mm1/fs/ext3/super.c	2007-09-25 15:37:34.000000000 -0700
@@ -1625,7 +1625,10 @@ static int ext3_fill_super (struct super
 	}
 
 	brelse (bh);
-	sb_set_blocksize(sb, blocksize);
+	if (!sb_set_blocksize(sb, blocksize)) {
+		printk(KERN_ERR "EXT3-fs: bad blocksize %d.\n", blocksize);
+		goto out_fail;
+	}
 	logic_sb_block = (sb_block * EXT3_MIN_BLOCK_SIZE) / blocksize;
 	offset = (sb_block * EXT3_MIN_BLOCK_SIZE) % blocksize;
 	bh = sb_bread(sb, logic_sb_block);
Index: linux-2.6.23-rc8-mm1/fs/ext4/super.c
===================================================================
--- linux-2.6.23-rc8-mm1.orig/fs/ext4/super.c	2007-09-25 14:53:57.000000000 -0700
+++ linux-2.6.23-rc8-mm1/fs/ext4/super.c	2007-09-25 15:37:34.000000000 -0700
@@ -1652,6 +1652,11 @@ static int ext4_fill_super (struct super
 		goto out_fail;
 	}
 
+	if (!sb_set_blocksize(sb, blocksize)) {
+		printk(KERN_ERR "EXT4-fs: bad blocksize %d.\n", blocksize);
+		goto out_fail;
+	}
+
 	/*
 	 * The ext4 superblock will not be buffer aligned for other than 1kB
 	 * block sizes.  We need to calculate the offset from buffer start.
Index: linux-2.6.23-rc8-mm1/include/linux/ext2_fs.h
===================================================================
--- linux-2.6.23-rc8-mm1.orig/include/linux/ext2_fs.h	2007-09-25 14:53:58.000000000 -0700
+++ linux-2.6.23-rc8-mm1/include/linux/ext2_fs.h	2007-09-25 15:37:34.000000000 -0700
@@ -87,8 +87,8 @@ static inline struct ext2_sb_info *EXT2_
  * Macro-instructions used to manage several block sizes
  */
 #define EXT2_MIN_BLOCK_SIZE		1024
-#define	EXT2_MAX_BLOCK_SIZE		4096
-#define EXT2_MIN_BLOCK_LOG_SIZE		  10
+#define EXT2_MAX_BLOCK_SIZE		65536
+#define EXT2_MIN_BLOCK_LOG_SIZE		  10
 #ifdef __KERNEL__
 # define EXT2_BLOCK_SIZE(s)		((s)->s_blocksize)
 #else
Index: linux-2.6.23-rc8-mm1/include/linux/ext3_fs.h
===================================================================
--- linux-2.6.23-rc8-mm1.orig/include/linux/ext3_fs.h	2007-09-24 17:33:10.000000000 -0700
+++ linux-2.6.23-rc8-mm1/include/linux/ext3_fs.h	2007-09-25 15:37:34.000000000 -0700
@@ -76,8 +76,8 @@
  * Macro-instructions used to manage several block sizes
  */
 #define EXT3_MIN_BLOCK_SIZE		1024
-#define	EXT3_MAX_BLOCK_SIZE		4096
-#define EXT3_MIN_BLOCK_LOG_SIZE		  10
+#define	EXT3_MAX_BLOCK_SIZE		65536
+#define EXT3_MIN_BLOCK_LOG_SIZE		  10
 #ifdef __KERNEL__
 # define EXT3_BLOCK_SIZE(s)		((s)->s_blocksize)
 #else
Index: linux-2.6.23-rc8-mm1/include/linux/ext4_fs.h
[patch 3/4] ext3: fix rec_len overflow with 64KB block size
Prevent rec_len from overflowing with a 64KB blocksize.

Signed-off-by: Takashi Sato [EMAIL PROTECTED]
Signed-off-by: Mingming Cao [EMAIL PROTECTED]
Signed-off-by: Christoph Lameter [EMAIL PROTECTED]
---
 fs/ext3/dir.c           |   13 ++++---
 fs/ext3/namei.c         |   88 ++++++++++++++++++++++++++++++++----------
 include/linux/ext3_fs.h |    9 ++++
 3 files changed, 91 insertions(+), 19 deletions(-)

Index: linux-2.6.23-rc8-mm1/fs/ext3/dir.c
===================================================================
--- linux-2.6.23-rc8-mm1.orig/fs/ext3/dir.c	2007-09-25 14:53:57.000000000 -0700
+++ linux-2.6.23-rc8-mm1/fs/ext3/dir.c	2007-09-25 15:41:45.000000000 -0700
@@ -100,10 +100,11 @@ static int ext3_readdir(struct file * fi
 	unsigned long offset;
 	int i, stored;
 	struct ext3_dir_entry_2 *de;
-	struct super_block *sb;
 	int err;
 	struct inode *inode = filp->f_path.dentry->d_inode;
 	int ret = 0;
+	struct super_block *sb = inode->i_sb;
+	unsigned tail = sb->s_blocksize;
 
 	sb = inode->i_sb;
 
@@ -167,8 +168,11 @@ revalidate:
 				 * readdir(2), then we might be pointing to an invalid
 				 * dirent right now.  Scan from the start of the block
 				 * to make sure. */
-				if (filp->f_version != inode->i_version) {
-					for (i = 0; i < sb->s_blocksize && i < offset; ) {
+				if (tail > EXT3_DIR_MAX_REC_LEN)
+					tail = EXT3_DIR_MAX_REC_LEN;
+
+				if (filp->f_version != inode->i_version) {
+					for (i = 0; i < tail && i < offset; ) {
 					de = (struct ext3_dir_entry_2 *)
 						(bh->b_data + i);
 					/* It's too expensive to do a full
@@ -189,7 +193,7 @@ revalidate:
 		}
 		while (!error && filp->f_pos < inode->i_size
-		       && offset < sb->s_blocksize) {
+		       && offset < tail) {
 			de = (struct ext3_dir_entry_2 *) (bh->b_data + offset);
 			if (!ext3_check_dir_entry ("ext3_readdir", inode, de,
 						   bh, offset)) {
@@ -225,6 +229,7 @@ revalidate:
 			}
 			filp->f_pos += le16_to_cpu(de->rec_len);
 		}
+		filp->f_pos = EXT3_DIR_ADJUST_TAIL_OFFS(filp->f_pos, sb->s_blocksize);
 		offset = 0;
 		brelse (bh);
 	}
Index: linux-2.6.23-rc8-mm1/fs/ext3/namei.c
===================================================================
--- linux-2.6.23-rc8-mm1.orig/fs/ext3/namei.c	2007-09-24 17:33:10.000000000 -0700
+++ linux-2.6.23-rc8-mm1/fs/ext3/namei.c	2007-09-25 15:41:45.000000000 -0700
@@ -263,9 +263,13 @@ static struct stats dx_show_leaf(struct
 	unsigned names = 0, space = 0;
 	char *base = (char *) de;
 	struct dx_hash_info h = *hinfo;
+	unsigned tail = size;
 
 	printk("names: ");
-	while ((char *) de < base + size)
+	if (tail > EXT3_DIR_MAX_REC_LEN)
+		tail = EXT3_DIR_MAX_REC_LEN;
+
+	while ((char *) de < base + tail)
 	{
 		if (de->inode)
 		{
@@ -708,8 +712,12 @@ static int dx_make_map (struct ext3_dir_
 	int count = 0;
 	char *base = (char *) de;
 	struct dx_hash_info h = *hinfo;
+	unsigned tail = size;
+
+	if (tail > EXT3_DIR_MAX_REC_LEN)
+		tail = EXT3_DIR_MAX_REC_LEN;
 
-	while ((char *) de < base + size)
+	while ((char *) de < base + tail)
 	{
 		if (de->name_len && de->inode) {
 			ext3fs_dirhash(de->name, de->name_len, &h);
@@ -808,9 +816,13 @@ static inline int search_dirblock(struct
 	int de_len;
 	const char *name = dentry->d_name.name;
 	int namelen = dentry->d_name.len;
+	unsigned tail = dir->i_sb->s_blocksize;
 
 	de = (struct ext3_dir_entry_2 *) bh->b_data;
-	dlimit = bh->b_data + dir->i_sb->s_blocksize;
+	if (tail > EXT3_DIR_MAX_REC_LEN)
+		tail = EXT3_DIR_MAX_REC_LEN;
+
+	dlimit = bh->b_data + tail;
 	while ((char *) de < dlimit) {
 		/* this code is executed quadratically often */
 		/* do minimal checking `by hand' */
@@ -1156,6 +1168,9 @@ static struct ext3_dir_entry_2* dx_pack_
 	unsigned rec_len = 0;
 
 	prev = to = de;
+	if (size > EXT3_DIR_MAX_REC_LEN)
+		size = EXT3_DIR_MAX_REC_LEN;
+
 	while ((char*)de < base + size) {
 		next = (struct ext3_dir_entry_2 *) ((char *) de +
 						    le16_to_cpu(de->rec_len));
@@ -1237,8 +1252,15 @@ static struct ext3_dir_entry_2 *do_split
 	/* Fancy dance to stay within two buffers */
 	de2 = dx_move_dirents(data1, data2, map + split, count - split);
 	de = dx_pack_dirents(data1,blocksize);
-	de->rec_len
[patch 2/4] ext2: fix rec_len overflow for 64KB block size
[2/4] ext2: fix rec_len overflow

Prevent rec_len from overflowing with a 64KB blocksize.

Signed-off-by: Takashi Sato [EMAIL PROTECTED]
Signed-off-by: Mingming Cao [EMAIL PROTECTED]
Signed-off-by: Christoph Lameter [EMAIL PROTECTED]
---
 fs/ext2/dir.c           |   46 ++++++++++++++++++++++++++++++--------
 include/linux/ext2_fs.h |   13 +++++++++++
 2 files changed, 49 insertions(+), 10 deletions(-)

Index: linux-2.6.23-rc8-mm1/fs/ext2/dir.c
===================================================================
--- linux-2.6.23-rc8-mm1.orig/fs/ext2/dir.c	2007-09-25 15:59:34.000000000 -0700
+++ linux-2.6.23-rc8-mm1/fs/ext2/dir.c	2007-09-25 16:02:51.000000000 -0700
@@ -105,9 +105,9 @@ static void ext2_check_page(struct page
 		goto out;
 	}
 	for (offs = 0; offs <= limit - EXT2_DIR_REC_LEN(1); offs += rec_len) {
+		offs = EXT2_DIR_ADJUST_TAIL_OFFS(offs, chunk_size);
 		p = (ext2_dirent *)(kaddr + offs);
 		rec_len = le16_to_cpu(p->rec_len);
-
 		if (rec_len < EXT2_DIR_REC_LEN(1))
 			goto Eshort;
 		if (rec_len & 3)
@@ -119,6 +119,7 @@ static void ext2_check_page(struct page
 		if (le32_to_cpu(p->inode) > max_inumber)
 			goto Einumber;
 	}
+	offs = EXT2_DIR_ADJUST_TAIL_OFFS(offs, chunk_size);
 	if (offs != limit)
 		goto Eend;
 out:
@@ -294,6 +295,7 @@ ext2_readdir (struct file * filp, void *
 		de = (ext2_dirent *)(kaddr+offset);
 		limit = kaddr + ext2_last_byte(inode, n) - EXT2_DIR_REC_LEN(1);
 		for ( ;(char*)de <= limit; de = ext2_next_entry(de)) {
+			de = EXT2_DIR_ADJUST_TAIL_ADDR(kaddr, de, sb->s_blocksize);
 			if (de->rec_len == 0) {
 				ext2_error(sb, __FUNCTION__,
 					"zero-length directory entry");
@@ -316,8 +318,10 @@ ext2_readdir (struct file * filp, void *
 					return 0;
 				}
 			}
+			filp->f_pos = EXT2_DIR_ADJUST_TAIL_OFFS(filp->f_pos, sb->s_blocksize);
 			filp->f_pos += le16_to_cpu(de->rec_len);
 		}
+		filp->f_pos = EXT2_DIR_ADJUST_TAIL_OFFS(filp->f_pos, sb->s_blocksize);
 		ext2_put_page(page);
 	}
 	return 0;
@@ -354,13 +358,14 @@ struct ext2_dir_entry_2 * ext2_find_entr
 	start = 0;
 	n = start;
 	do {
-		char *kaddr;
+		char *kaddr, *page_start;
 		page = ext2_get_page(dir, n);
 		if (!IS_ERR(page)) {
-			kaddr = page_address(page);
+			kaddr = page_start = page_address(page);
 			de = (ext2_dirent *) kaddr;
 			kaddr += ext2_last_byte(dir, n) - reclen;
 			while ((char *) de <= kaddr) {
+				de = EXT2_DIR_ADJUST_TAIL_ADDR(page_start, de, dir->i_sb->s_blocksize);
 				if (de->rec_len == 0) {
 					ext2_error(dir->i_sb, __FUNCTION__,
 						"zero-length directory entry");
@@ -428,6 +433,7 @@ void ext2_set_link(struct inode *dir, st
 	unsigned len = le16_to_cpu(de->rec_len);
 	int err;
 
+	len = EXT2_DIR_ADJUST_TAIL_OFFS(pos, len);
 	lock_page(page);
 	err = __ext2_write_begin(NULL, page->mapping, pos, len,
 				AOP_FLAG_UNINTERRUPTIBLE, &page, NULL);
@@ -459,6 +465,7 @@ int ext2_add_link (struct dentry *dentry
 	char *kaddr;
 	loff_t pos;
 	int err;
+	char *page_start = NULL;
 
 	/*
 	 * We take care of directory expansion in the same loop.
@@ -473,16 +480,29 @@ int ext2_add_link (struct dentry *dentry
 		if (IS_ERR(page))
 			goto out;
 		lock_page(page);
-		kaddr = page_address(page);
+		kaddr = page_start = page_address(page);
 		dir_end = kaddr + ext2_last_byte(dir, n);
 		de = (ext2_dirent *)kaddr;
-		kaddr += PAGE_CACHE_SIZE - reclen;
+		if (chunk_size <= EXT2_DIR_MAX_REC_LEN)
+			kaddr += PAGE_CACHE_SIZE - reclen;
+		else
+			kaddr += PAGE_CACHE_SIZE -
+				(chunk_size - EXT2_DIR_MAX_REC_LEN) - reclen;
+
 		while ((char *)de <= kaddr) {
+			de = EXT2_DIR_ADJUST_TAIL_ADDR(page_start, de, chunk_size);
 			if ((char *)de == dir_end) {
 				/* We hit i_size */
 				name_len = 0;
-				rec_len = chunk_size;
-				de->rec_len =
[00/17] Virtual Compound Page Support V1
RFC-V1
- Support for all compound functions for virtual compound pages
  (including the compound_nth_page() necessary for LBS mmap support)
- Fix various bugs
- Fix i386 build

Currently there is a strong tendency to avoid larger page allocations
in the kernel because of past fragmentation issues, and the current
defragmentation methods are still evolving. It is not clear to what
extent they can provide reliable allocations for higher order pages
(plus the definition of "reliable" seems to be in the eye of the
beholder).

We use vmalloc allocations in many locations to provide a safe way to
allocate larger arrays. That is due to the danger of higher order
allocations failing. Virtual compound pages allow the use of regular
page allocator allocations that will fall back only if there is an
actual problem with acquiring a higher order page.

This patch set provides a way for a higher order page allocation to
fall back. Instead of a physically contiguous page, a virtually
contiguous page is provided. The functionality of the vmalloc layer is
used to provide the necessary page tables and control structures to
establish a virtually contiguous area.

Advantages:

- If higher order allocations are failing, then virtual compound pages
  consisting of a series of order-0 pages can stand in for those
  allocations.
- Reliability as long as the vmalloc layer can provide virtual
  mappings.
- Ability to reduce the use of the vmalloc layer significantly by using
  physically contiguous memory instead of virtually contiguous memory.
  Most uses of vmalloc() can be converted to page allocator calls.
- The use of physically contiguous memory instead of vmalloc may allow
  the use of larger TLB entries, thus reducing TLB pressure. It also
  reduces the need for page table walks.

Disadvantages:

- In order to use the fall back, the logic accessing the memory must be
  aware that the memory could be backed by a virtual mapping and take
  precautions: virt_to_page() and page_address() may not work, and
  vmalloc_to_page() and vmalloc_address() (introduced through this
  patch set) may have to be called.
- Virtual mappings are less efficient than physical mappings.
  Performance will drop once virtual fall back occurs.
- Virtual mappings have more memory overhead. vm_area control
  structures, page tables, page arrays, etc. need to be allocated and
  managed to provide virtual mappings.

The patchset provides this functionality in stages. Stage 1 introduces
the basic fall back mechanism necessary to replace vmalloc allocations
with alloc_pages(GFP_VFALLBACK, order), which signifies to the page
allocator that a higher order page is to be found but a virtual mapping
may stand in if there is an issue with fragmentation. Stage 1
functionality does not allow allocation and freeing of virtual mappings
from interrupt contexts.

The stage 1 series ends with the conversion of a few key uses of
vmalloc in the VM to alloc_pages() for the allocation of sparsemem's
memmap table and the wait table in each zone. Other uses of vmalloc
could be converted in the same way.

Stage 2 functionality enhances the fall back even more, allowing
allocation and frees in interrupt context. SLUB is then modified to use
the virtual mappings for slab caches that are marked with
SLAB_VFALLBACK. If a slab cache is marked this way, then we drop all
the constraints regarding page order and allocate good large memory
areas that fit lots of objects, so that we rarely have to use the slow
paths. Two slab caches--the dentry cache and the buffer_heads--are then
flagged that way. Others could be converted in the same way.

The patch set also provides a debugging aid through the
CONFIG_VFALLBACK_ALWAYS setting. If set, then all GFP_VFALLBACK
allocations fall back to the virtual mappings. This is useful for
verification tests. The test of this patch set was done by enabling
that option and compiling a kernel.
The patch set is also available via git from the largeblock git tree:

git pull git://git.kernel.org/pub/scm/linux/kernel/git/christoph/largeblocksize.git vcompound
[03/17] i386: Resolve dependency of asm-i386/pgtable.h on highmem.h
pgtable.h does not include highmem.h but uses various constants from
highmem.h. We cannot include highmem.h because highmem.h will in turn
include many other include files that also depend on pgtable.h. So move
the definitions from highmem.h into pgtable.h.

Signed-off-by: Christoph Lameter [EMAIL PROTECTED]
---
 include/asm-i386/highmem.h |    6 ------
 include/asm-i386/pgtable.h |    8 ++++++++
 2 files changed, 8 insertions(+), 6 deletions(-)

Index: linux-2.6/include/asm-i386/highmem.h
===================================================================
--- linux-2.6.orig/include/asm-i386/highmem.h	2007-09-20 23:54:57.000000000 -0700
+++ linux-2.6/include/asm-i386/highmem.h	2007-09-20 23:55:40.000000000 -0700
@@ -38,11 +38,6 @@ extern pte_t *pkmap_page_table;
  * easily, subsequent pte tables have to be allocated in one physical
  * chunk of RAM.
  */
-#ifdef CONFIG_X86_PAE
-#define LAST_PKMAP 512
-#else
-#define LAST_PKMAP 1024
-#endif
 /*
  * Ordering is:
  *
@@ -58,7 +53,6 @@ extern pte_t *pkmap_page_table;
  *	VMALLOC_START
  *	high_memory
  */
-#define PKMAP_BASE ( (FIXADDR_BOOT_START - PAGE_SIZE*(LAST_PKMAP + 1)) & PMD_MASK )
 #define LAST_PKMAP_MASK (LAST_PKMAP-1)
 #define PKMAP_NR(virt)  ((virt-PKMAP_BASE) >> PAGE_SHIFT)
 #define PKMAP_ADDR(nr)  (PKMAP_BASE + ((nr) << PAGE_SHIFT))
Index: linux-2.6/include/asm-i386/pgtable.h
===================================================================
--- linux-2.6.orig/include/asm-i386/pgtable.h	2007-09-20 23:55:16.000000000 -0700
+++ linux-2.6/include/asm-i386/pgtable.h	2007-09-20 23:56:21.000000000 -0700
@@ -81,6 +81,14 @@ void paging_init(void);
 #define VMALLOC_OFFSET	(8*1024*1024)
 #define VMALLOC_START	(((unsigned long) high_memory + \
 			2*VMALLOC_OFFSET-1) & ~(VMALLOC_OFFSET-1))
+#ifdef CONFIG_X86_PAE
+#define LAST_PKMAP 512
+#else
+#define LAST_PKMAP 1024
+#endif
+
+#define PKMAP_BASE ( (FIXADDR_BOOT_START - PAGE_SIZE*(LAST_PKMAP + 1)) & PMD_MASK )
+
 #ifdef CONFIG_HIGHMEM
 # define VMALLOC_END	(PKMAP_BASE-2*PAGE_SIZE)
 #else
[01/17] Vmalloc: Move vmalloc_to_page to mm/vmalloc.c
We already have page table manipulation for vmalloc in vmalloc.c. Move
the vmalloc_to_page() function there as well.

Move the definitions for vmalloc-related functions in mm.h to before
the functions dealing with compound pages because they will soon need
to use them.

Signed-off-by: Christoph Lameter [EMAIL PROTECTED]
---
 include/linux/mm.h |    5 +++--
 mm/memory.c        |   40 ----------------------------------------
 mm/vmalloc.c       |   38 ++++++++++++++++++++++++++++++++++++++
 3 files changed, 41 insertions(+), 42 deletions(-)

Index: linux-2.6/mm/memory.c
===================================================================
--- linux-2.6.orig/mm/memory.c	2007-09-24 16:55:28.000000000 -0700
+++ linux-2.6/mm/memory.c	2007-09-24 16:55:32.000000000 -0700
@@ -2727,46 +2727,6 @@ int make_pages_present(unsigned long add
 	return ret == len ? 0 : -1;
 }
 
-/*
- * Map a vmalloc()-space virtual address to the physical page.
- */
-struct page * vmalloc_to_page(void * vmalloc_addr)
-{
-	unsigned long addr = (unsigned long) vmalloc_addr;
-	struct page *page = NULL;
-	pgd_t *pgd = pgd_offset_k(addr);
-	pud_t *pud;
-	pmd_t *pmd;
-	pte_t *ptep, pte;
-
-	if (!pgd_none(*pgd)) {
-		pud = pud_offset(pgd, addr);
-		if (!pud_none(*pud)) {
-			pmd = pmd_offset(pud, addr);
-			if (!pmd_none(*pmd)) {
-				ptep = pte_offset_map(pmd, addr);
-				pte = *ptep;
-				if (pte_present(pte))
-					page = pte_page(pte);
-				pte_unmap(ptep);
-			}
-		}
-	}
-	return page;
-}
-
-EXPORT_SYMBOL(vmalloc_to_page);
-
-/*
- * Map a vmalloc()-space virtual address to the physical page frame number.
- */
-unsigned long vmalloc_to_pfn(void * vmalloc_addr)
-{
-	return page_to_pfn(vmalloc_to_page(vmalloc_addr));
-}
-
-EXPORT_SYMBOL(vmalloc_to_pfn);
-
 #if !defined(__HAVE_ARCH_GATE_AREA)
 
 #if defined(AT_SYSINFO_EHDR)
Index: linux-2.6/mm/vmalloc.c
===================================================================
--- linux-2.6.orig/mm/vmalloc.c	2007-09-24 16:55:28.000000000 -0700
+++ linux-2.6/mm/vmalloc.c	2007-09-24 16:55:32.000000000 -0700
@@ -166,6 +166,44 @@ int map_vm_area(struct vm_struct *area,
 }
 EXPORT_SYMBOL_GPL(map_vm_area);
 
+/*
+ * Map a vmalloc()-space virtual address to the physical page.
+ */
+struct page *vmalloc_to_page(void *vmalloc_addr)
+{
+	unsigned long addr = (unsigned long) vmalloc_addr;
+	struct page *page = NULL;
+	pgd_t *pgd = pgd_offset_k(addr);
+	pud_t *pud;
+	pmd_t *pmd;
+	pte_t *ptep, pte;
+
+	if (!pgd_none(*pgd)) {
+		pud = pud_offset(pgd, addr);
+		if (!pud_none(*pud)) {
+			pmd = pmd_offset(pud, addr);
+			if (!pmd_none(*pmd)) {
+				ptep = pte_offset_map(pmd, addr);
+				pte = *ptep;
+				if (pte_present(pte))
+					page = pte_page(pte);
+				pte_unmap(ptep);
+			}
+		}
+	}
+	return page;
+}
+EXPORT_SYMBOL(vmalloc_to_page);
+
+/*
+ * Map a vmalloc()-space virtual address to the physical page frame number.
+ */
+unsigned long vmalloc_to_pfn(void *vmalloc_addr)
+{
+	return page_to_pfn(vmalloc_to_page(vmalloc_addr));
+}
+EXPORT_SYMBOL(vmalloc_to_pfn);
+
 static struct vm_struct *__get_vm_area_node(unsigned long size, unsigned long flags,
 					    unsigned long start, unsigned long end,
 					    int node, gfp_t gfp_mask)
Index: linux-2.6/include/linux/mm.h
===================================================================
--- linux-2.6.orig/include/linux/mm.h	2007-09-24 16:55:28.000000000 -0700
+++ linux-2.6/include/linux/mm.h	2007-09-24 16:57:23.000000000 -0700
@@ -294,6 +294,9 @@ static inline int get_page_unless_zero(s
 	return atomic_inc_not_zero(&page->_count);
 }
 
+struct page *vmalloc_to_page(void *addr);
+unsigned long vmalloc_to_pfn(void *addr);
+
 static inline struct page *compound_head(struct page *page)
 {
 	if (unlikely(PageTail(page)))
@@ -1160,8 +1163,6 @@ static inline unsigned long vma_pages(st
 pgprot_t vm_get_page_prot(unsigned long vm_flags);
 struct vm_area_struct *find_extend_vma(struct mm_struct *, unsigned long addr);
-struct page *vmalloc_to_page(void *addr);
-unsigned long vmalloc_to_pfn(void *addr);
 int remap_pfn_range(struct vm_area_struct *, unsigned long addr,
 			unsigned long pfn, unsigned long size, pgprot_t);
 int vm_insert_page(struct vm_area_struct *, unsigned long addr, struct page *);
[05/17] vmalloc: clean up page array indexing
The page array is repeatedly indexed both in __vunmap() and
__vmalloc_area_node(). Add a temporary variable to make it easier to
read (and easier to patch later).

Signed-off-by: Christoph Lameter [EMAIL PROTECTED]
---
 mm/vmalloc.c |   16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

Index: linux-2.6/mm/vmalloc.c
===================================================================
--- linux-2.6.orig/mm/vmalloc.c	2007-09-18 13:22:16.000000000 -0700
+++ linux-2.6/mm/vmalloc.c	2007-09-18 13:22:17.000000000 -0700
@@ -383,8 +383,10 @@ static void __vunmap(const void *addr, i
 		int i;
 
 		for (i = 0; i < area->nr_pages; i++) {
-			BUG_ON(!area->pages[i]);
-			__free_page(area->pages[i]);
+			struct page *page = area->pages[i];
+
+			BUG_ON(!page);
+			__free_page(page);
 		}
 
 		if (area->flags & VM_VPAGES)
@@ -488,15 +490,19 @@ void *__vmalloc_area_node(struct vm_stru
 	}
 
 	for (i = 0; i < area->nr_pages; i++) {
+		struct page *page;
+
 		if (node < 0)
-			area->pages[i] = alloc_page(gfp_mask);
+			page = alloc_page(gfp_mask);
 		else
-			area->pages[i] = alloc_pages_node(node, gfp_mask, 0);
-		if (unlikely(!area->pages[i])) {
+			page = alloc_pages_node(node, gfp_mask, 0);
+
+		if (unlikely(!page)) {
 			/* Successfully allocated i pages, free them in __vunmap() */
 			area->nr_pages = i;
 			goto fail;
 		}
+		area->pages[i] = page;
 	}
 
 	if (map_vm_area(area, prot, &pages))
[06/17] vunmap: return page array passed on vmap()
Make vunmap return the page array that was used at vmap. This is useful
if one has no structures to track the page array but simply stores the
virtual address somewhere. The disposition of the page array can be
decided upon after vunmap. vfree() may now also be used instead of
vunmap, which will release the page array after vunmap'ping it.

As noted by Kamezawa: The same subsystem that provides the page array
to vmap must use its own method to dispose of the page array. If
vfree() is called to free the page array, then the page array must
either be

1. Allocated via the slab allocator, or

2. Allocated via vmalloc, but then VM_VPAGES must have been passed at
   vunmap to specify that a vfree is needed.

RFC-v1:
- Add comment explaining how to use vfree() to dispose of the page
  array passed on vmap().

Signed-off-by: Christoph Lameter [EMAIL PROTECTED]
---
 include/linux/vmalloc.h |    2 +-
 mm/vmalloc.c            |   33 +++++++++++++++++++++++----------
 2 files changed, 24 insertions(+), 11 deletions(-)

Index: linux-2.6/include/linux/vmalloc.h
===================================================================
--- linux-2.6.orig/include/linux/vmalloc.h	2007-09-24 15:52:53.000000000 -0700
+++ linux-2.6/include/linux/vmalloc.h	2007-09-24 15:59:15.000000000 -0700
@@ -49,7 +49,7 @@ extern void vfree(const void *addr);
 
 extern void *vmap(struct page **pages, unsigned int count,
 			unsigned long flags, pgprot_t prot);
-extern void vunmap(const void *addr);
+extern struct page **vunmap(const void *addr);
 
 extern int remap_vmalloc_range(struct vm_area_struct *vma, void *addr,
 							unsigned long pgoff);
Index: linux-2.6/mm/vmalloc.c
===================================================================
--- linux-2.6.orig/mm/vmalloc.c	2007-09-24 15:56:49.000000000 -0700
+++ linux-2.6/mm/vmalloc.c	2007-09-24 16:02:10.000000000 -0700
@@ -356,17 +356,18 @@ struct vm_struct *remove_vm_area(const v
 	return v;
 }
 
-static void __vunmap(const void *addr, int deallocate_pages)
+static struct page **__vunmap(const void *addr, int deallocate_pages)
 {
 	struct vm_struct *area;
+	struct page **pages;
 
 	if (!addr)
-		return;
+		return NULL;
 
 	if ((PAGE_SIZE-1) & (unsigned long)addr) {
 		printk(KERN_ERR "Trying to vfree() bad address (%p)\n", addr);
 		WARN_ON(1);
-		return;
+		return NULL;
 	}
 
 	area = remove_vm_area(addr);
@@ -374,29 +375,30 @@ static void __vunmap(const void *addr, i
 		printk(KERN_ERR "Trying to vfree() nonexistent vm area (%p)\n",
 				addr);
 		WARN_ON(1);
-		return;
+		return NULL;
 	}
 
+	pages = area->pages;
 	debug_check_no_locks_freed(addr, area->size);
 
 	if (deallocate_pages) {
 		int i;
 
 		for (i = 0; i < area->nr_pages; i++) {
-			struct page *page = area->pages[i];
+			struct page *page = pages[i];
 
 			BUG_ON(!page);
 			__free_page(page);
 		}
 
 		if (area->flags & VM_VPAGES)
-			vfree(area->pages);
+			vfree(pages);
 		else
-			kfree(area->pages);
+			kfree(pages);
 	}
 
 	kfree(area);
-	return;
+	return pages;
 }
 
 /**
@@ -424,11 +426,13 @@ EXPORT_SYMBOL(vfree);
  *	which was created from the page array passed to vmap().
  *
  *	Must not be called in interrupt context.
+ *
+ *	Returns a pointer to the array of pointers to page structs
  */
-void vunmap(const void *addr)
+struct page **vunmap(const void *addr)
 {
 	BUG_ON(in_interrupt());
-	__vunmap(addr, 0);
+	return __vunmap(addr, 0);
 }
 EXPORT_SYMBOL(vunmap);
 
@@ -441,6 +445,13 @@ EXPORT_SYMBOL(vunmap);
  *
  *	Maps @count pages from @pages into contiguous kernel virtual
  *	space.
+ *
+ *	The page array may be freed via vfree() on the virtual address
+ *	returned. In that case the page array must be allocated via
+ *	the slab allocator. If the page array was allocated via
+ *	vmalloc then VM_VPAGES must be specified in the flags. There is
+ *	no support for vfree() to free a page array allocated via the
+ *	page allocator.
  */
 void *vmap(struct page **pages, unsigned int count,
 		unsigned long flags, pgprot_t prot)
@@ -453,6 +464,8 @@ void *vmap(struct page **pages, unsigned
 	area = get_vm_area((count << PAGE_SHIFT), flags);
 	if (!area)
 		return NULL;
+	area->pages = pages;
+	area->nr_pages = count;
 	if (map_vm_area(area, prot, &pages)) {
 		vunmap(area->addr);
 		return NULL;
[04/17] is_vmalloc_addr(): Check if an address is within the vmalloc boundaries
is_vmalloc_addr() is used in a couple of places. Add a version to vmalloc.h and replace the other checks. Signed-off-by: Christoph Lameter [EMAIL PROTECTED] --- drivers/net/cxgb3/cxgb3_offload.c |4 +--- fs/ntfs/malloc.h |3 +-- fs/proc/kcore.c |2 +- fs/xfs/linux-2.6/kmem.c |3 +-- fs/xfs/linux-2.6/xfs_buf.c|3 +-- include/linux/mm.h|8 mm/sparse.c | 10 +- 7 files changed, 14 insertions(+), 19 deletions(-) Index: linux-2.6/include/linux/mm.h === --- linux-2.6.orig/include/linux/mm.h 2007-09-24 18:32:35.0 -0700 +++ linux-2.6/include/linux/mm.h2007-09-24 18:33:03.0 -0700 @@ -297,6 +297,14 @@ static inline int get_page_unless_zero(s struct page *vmalloc_to_page(const void *addr); unsigned long vmalloc_to_pfn(const void *addr); +/* Determine if an address is within the vmalloc range */ +static inline int is_vmalloc_addr(const void *x) +{ + unsigned long addr = (unsigned long)x; + + return addr = VMALLOC_START addr VMALLOC_END; +} + static inline struct page *compound_head(struct page *page) { if (unlikely(PageTail(page))) Index: linux-2.6/mm/sparse.c === --- linux-2.6.orig/mm/sparse.c 2007-09-24 18:30:46.0 -0700 +++ linux-2.6/mm/sparse.c 2007-09-24 18:33:03.0 -0700 @@ -289,17 +289,9 @@ got_map_ptr: return ret; } -static int vaddr_in_vmalloc_area(void *addr) -{ - if (addr = (void *)VMALLOC_START - addr (void *)VMALLOC_END) - return 1; - return 0; -} - static void __kfree_section_memmap(struct page *memmap, unsigned long nr_pages) { - if (vaddr_in_vmalloc_area(memmap)) + if (is_vmalloc_addr(memmap)) vfree(memmap); else free_pages((unsigned long)memmap, Index: linux-2.6/drivers/net/cxgb3/cxgb3_offload.c === --- linux-2.6.orig/drivers/net/cxgb3/cxgb3_offload.c2007-09-24 18:30:46.0 -0700 +++ linux-2.6/drivers/net/cxgb3/cxgb3_offload.c 2007-09-24 18:33:03.0 -0700 @@ -1035,9 +1035,7 @@ void *cxgb_alloc_mem(unsigned long size) */ void cxgb_free_mem(void *addr) { - unsigned long p = (unsigned long)addr; - - if (p = VMALLOC_START p VMALLOC_END) + if (is_vmalloc_addr(addr)) 
 		vfree(addr);
 	else
 		kfree(addr);

Index: linux-2.6/fs/ntfs/malloc.h
===
--- linux-2.6.orig/fs/ntfs/malloc.h	2007-09-24 18:30:46.0 -0700
+++ linux-2.6/fs/ntfs/malloc.h	2007-09-24 18:33:03.0 -0700
@@ -85,8 +85,7 @@ static inline void *ntfs_malloc_nofs_nof
 static inline void ntfs_free(void *addr)
 {
-	if (likely(((unsigned long)addr < VMALLOC_START) ||
-			((unsigned long)addr >= VMALLOC_END))) {
+	if (!is_vmalloc_addr(addr)) {
 		kfree(addr);
 		/* free_page((unsigned long)addr); */
 		return;

Index: linux-2.6/fs/proc/kcore.c
===
--- linux-2.6.orig/fs/proc/kcore.c	2007-09-24 18:30:46.0 -0700
+++ linux-2.6/fs/proc/kcore.c	2007-09-24 18:33:03.0 -0700
@@ -325,7 +325,7 @@ read_kcore(struct file *file, char __use
 		if (m == NULL) {
 			if (clear_user(buffer, tsz))
 				return -EFAULT;
-		} else if ((start >= VMALLOC_START) && (start < VMALLOC_END)) {
+		} else if (is_vmalloc_addr((void *)start)) {
 			char * elf_buf;
 			struct vm_struct *m;
 			unsigned long curstart = start;

Index: linux-2.6/fs/xfs/linux-2.6/kmem.c
===
--- linux-2.6.orig/fs/xfs/linux-2.6/kmem.c	2007-09-24 18:30:46.0 -0700
+++ linux-2.6/fs/xfs/linux-2.6/kmem.c	2007-09-24 18:33:03.0 -0700
@@ -92,8 +92,7 @@ kmem_zalloc_greedy(size_t *size, size_t
 void
 kmem_free(void *ptr, size_t size)
 {
-	if (((unsigned long)ptr < VMALLOC_START) ||
-	    ((unsigned long)ptr >= VMALLOC_END)) {
+	if (!is_vmalloc_addr(ptr)) {
 		kfree(ptr);
 	} else {
 		vfree(ptr);

Index: linux-2.6/fs/xfs/linux-2.6/xfs_buf.c
===
--- linux-2.6.orig/fs/xfs/linux-2.6/xfs_buf.c	2007-09-24 18:30:46.0 -0700
+++ linux-2.6/fs/xfs/linux-2.6/xfs_buf.c	2007-09-24 18:33:03.0 -0700
@@ -696,8 +696,7 @@ static inline struct page * mem_to_page(
 	void			*addr)
 {
-	if (((unsigned long)addr < VMALLOC_START) ||
-
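For reference, the helper this patch introduces boils down to a half-open range check. Below is a standalone userspace sketch; the VMALLOC_START/VMALLOC_END values are illustrative placeholders (in the kernel they are architecture-specific), not the real boundaries.

```c
#include <assert.h>

/* Illustrative placeholder boundaries; the kernel's are per-arch. */
#define VMALLOC_START 0xf0000000UL
#define VMALLOC_END   0xff000000UL

/* Userspace sketch of the is_vmalloc_addr() helper added by this patch. */
static int is_vmalloc_addr(const void *x)
{
	unsigned long addr = (unsigned long)x;

	/* Half-open range: start is inclusive, end is exclusive. */
	return addr >= VMALLOC_START && addr < VMALLOC_END;
}
```

Callers such as ntfs_free() and kmem_free() then reduce to `if (is_vmalloc_addr(p)) vfree(p); else kfree(p);`.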
[07/17] vmalloc_address(): Determine vmalloc address from page struct
Sometimes we need to figure out which vmalloc address is in use for a certain page struct. There is no easy way to recover the vmalloc address from the page struct, so simply search through the kernel page tables to find the address. This is a fairly expensive process. Use sparingly (or provide a better implementation).

Signed-off-by: Christoph Lameter [EMAIL PROTECTED]

---
 include/linux/mm.h |  1 +
 mm/vmalloc.c       | 77 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 78 insertions(+)

Index: linux-2.6/mm/vmalloc.c
===
--- linux-2.6.orig/mm/vmalloc.c	2007-09-24 16:59:54.0 -0700
+++ linux-2.6/mm/vmalloc.c	2007-09-24 17:00:07.0 -0700
@@ -196,6 +196,83 @@ struct page *vmalloc_to_page(const void
 EXPORT_SYMBOL(vmalloc_to_page);
 
 /*
+ * Determine vmalloc address from a page struct.
+ *
+ * Linear search through all ptes of the vmalloc area.
+ */
+static unsigned long vaddr_pte_range(pmd_t *pmd, unsigned long addr,
+		unsigned long end, unsigned long pfn)
+{
+	pte_t *pte;
+
+	pte = pte_offset_kernel(pmd, addr);
+	do {
+		pte_t ptent = *pte;
+		if (pte_present(ptent) && pte_pfn(ptent) == pfn)
+			return addr;
+	} while (pte++, addr += PAGE_SIZE, addr != end);
+	return 0;
+}
+
+static inline unsigned long vaddr_pmd_range(pud_t *pud, unsigned long addr,
+		unsigned long end, unsigned long pfn)
+{
+	pmd_t *pmd;
+	unsigned long next;
+	unsigned long n;
+
+	pmd = pmd_offset(pud, addr);
+	do {
+		next = pmd_addr_end(addr, end);
+		if (pmd_none_or_clear_bad(pmd))
+			continue;
+		n = vaddr_pte_range(pmd, addr, next, pfn);
+		if (n)
+			return n;
+	} while (pmd++, addr = next, addr != end);
+	return 0;
+}
+
+static inline unsigned long vaddr_pud_range(pgd_t *pgd, unsigned long addr,
+		unsigned long end, unsigned long pfn)
+{
+	pud_t *pud;
+	unsigned long next;
+	unsigned long n;
+
+	pud = pud_offset(pgd, addr);
+	do {
+		next = pud_addr_end(addr, end);
+		if (pud_none_or_clear_bad(pud))
+			continue;
+		n = vaddr_pmd_range(pud, addr, next, pfn);
+		if (n)
+			return n;
+	} while (pud++, addr = next, addr != end);
+	return 0;
+}
+
+void *vmalloc_address(struct page *page)
+{
+	pgd_t *pgd;
+	unsigned long next, n;
+	unsigned long addr = VMALLOC_START;
+	unsigned long pfn = page_to_pfn(page);
+
+	pgd = pgd_offset_k(VMALLOC_START);
+	do {
+		next = pgd_addr_end(addr, VMALLOC_END);
+		if (pgd_none_or_clear_bad(pgd))
+			continue;
+		n = vaddr_pud_range(pgd, addr, next, pfn);
+		if (n)
+			return (void *)n;
+	} while (pgd++, addr = next, addr < VMALLOC_END);
+	return NULL;
+}
+EXPORT_SYMBOL(vmalloc_address);
+
+/*
  * Map a vmalloc()-space virtual address to the physical page frame number.
  */
 unsigned long vmalloc_to_pfn(const void *vmalloc_addr)

Index: linux-2.6/include/linux/mm.h
===
--- linux-2.6.orig/include/linux/mm.h	2007-09-24 17:00:33.0 -0700
+++ linux-2.6/include/linux/mm.h	2007-09-24 17:00:42.0 -0700
@@ -296,6 +296,7 @@ static inline int get_page_unless_zero(s
 struct page *vmalloc_to_page(const void *addr);
 unsigned long vmalloc_to_pfn(const void *addr);
+void *vmalloc_address(struct page *);
 
 /* Determine if an address is within the vmalloc range */
 static inline int is_vmalloc_addr(const void *x)
--
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
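The essence of the patch above is a linear reverse lookup: walk every mapping in the vmalloc range until one maps the target pfn. A simplified userspace model (a flat array of pte-like entries stands in for the pgd/pud/pmd/pte walk; VSTART and the struct are hypothetical):

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096UL
#define VSTART    0xf0000000UL	/* illustrative start of the "vmalloc" range */

/* Toy pte: present bit plus the page frame number it maps. */
struct pte { int present; unsigned long pfn; };

/* Linear scan, mirroring the patch's vaddr_*_range() loops: return the
 * first virtual address whose entry maps `pfn`, or 0 if none does. */
static unsigned long vaddr_for_pfn(const struct pte *ptes, size_t n,
				   unsigned long pfn)
{
	unsigned long addr = VSTART;
	size_t i;

	for (i = 0; i < n; i++, addr += PAGE_SIZE)
		if (ptes[i].present && ptes[i].pfn == pfn)
			return addr;	/* first mapping wins */
	return 0;			/* pfn not mapped in this range */
}
```

As the changelog warns, this is O(size of the vmalloc area) per lookup, which is why the real function is marked "use sparingly".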
[11/17] GFP_VFALLBACK for zone wait table.
Currently vmalloc is used for the zone wait table, which may require many TLB entries to access the table. We can now use GFP_VFALLBACK to attempt the use of a physically contiguous page that can then be covered by the large kernel TLB entries.

Signed-off-by: Christoph Lameter [EMAIL PROTECTED]

---
 mm/page_alloc.c |  4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Index: linux-2.6/mm/page_alloc.c
===
--- linux-2.6.orig/mm/page_alloc.c	2007-09-24 18:48:06.0 -0700
+++ linux-2.6/mm/page_alloc.c	2007-09-24 18:48:16.0 -0700
@@ -2550,7 +2550,9 @@ int zone_wait_table_init(struct zone *zo
 		 * To use this new node's memory, further consideration will be
 		 * necessary.
 		 */
-		zone->wait_table = (wait_queue_head_t *)vmalloc(alloc_size);
+		zone->wait_table = (wait_queue_head_t *)
+			__get_free_pages(GFP_VFALLBACK,
+				get_order(alloc_size));
 	}
 	if (!zone->wait_table)
 		return -ENOMEM;
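The conversion relies on get_order(alloc_size) to turn a byte count into a buddy-allocator order. A userspace sketch of that helper, assuming 4 KB pages:

```c
#include <assert.h>

#define PAGE_SHIFT 12	/* assumes 4 KB pages */

/* Sketch of the kernel's get_order(): the smallest order such that
 * (PAGE_SIZE << order) >= size. Not defined for size == 0. */
static int get_order(unsigned long size)
{
	int order = 0;

	size = (size - 1) >> PAGE_SHIFT;	/* round up to whole pages - 1 */
	while (size) {
		order++;
		size >>= 1;
	}
	return order;
}
```

So a wait table of, say, 9 KB would be served from an order-2 (16 KB) physically contiguous block, reachable through a single large TLB entry.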
[14/17] Allow bit_waitqueue to wait on a bit in a vmalloc area
If bit_waitqueue() is passed a virtual address then it must use vmalloc_to_page instead of virt_to_page to get to the page struct.

Signed-off-by: Christoph Lameter [EMAIL PROTECTED]

---
 kernel/wait.c |  2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6/kernel/wait.c
===
--- linux-2.6.orig/kernel/wait.c	2007-09-20 19:03:42.0 -0700
+++ linux-2.6/kernel/wait.c	2007-09-20 19:07:42.0 -0700
@@ -245,7 +245,7 @@ EXPORT_SYMBOL(wake_up_bit);
 fastcall wait_queue_head_t *bit_waitqueue(void *word, int bit)
 {
 	const int shift = BITS_PER_LONG == 32 ? 5 : 6;
-	const struct zone *zone = page_zone(virt_to_page(word));
+	const struct zone *zone = page_zone(addr_to_page(word));
 	unsigned long val = (unsigned long)word << shift | bit;
 
 	return zone->wait_table[hash_long(val, zone->wait_table_bits)];
[16/17] Allow virtual fallback for buffer_heads
This is in particular useful for large I/Os because it will allow on the order of 100 allocations from the SLUB fast path without having to go to the page allocator.

Signed-off-by: Christoph Lameter [EMAIL PROTECTED]

---
 fs/buffer.c |  3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Index: linux-2.6/fs/buffer.c
===
--- linux-2.6.orig/fs/buffer.c	2007-09-18 15:44:37.0 -0700
+++ linux-2.6/fs/buffer.c	2007-09-18 15:44:51.0 -0700
@@ -3008,7 +3008,8 @@ void __init buffer_init(void)
 	int nrpages;
 
 	bh_cachep = KMEM_CACHE(buffer_head,
-			SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|SLAB_MEM_SPREAD);
+			SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|SLAB_MEM_SPREAD|
+			SLAB_VFALLBACK);
 
 	/*
 	 * Limit the bh occupancy to 10% of ZONE_NORMAL
[17/17] Allow virtual fallback for dentries
Signed-off-by: Christoph Lameter [EMAIL PROTECTED]

---
 fs/dcache.c |  3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Index: linux-2.6/fs/dcache.c
===
--- linux-2.6.orig/fs/dcache.c	2007-09-24 16:47:43.0 -0700
+++ linux-2.6/fs/dcache.c	2007-09-24 17:03:15.0 -0700
@@ -2118,7 +2118,8 @@ static void __init dcache_init(unsigned
 	 * of the dcache.
 	 */
 	dentry_cache = KMEM_CACHE(dentry,
-			SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|SLAB_MEM_SPREAD);
+			SLAB_RECLAIM_ACCOUNT|SLAB_PANIC|SLAB_MEM_SPREAD|
+			SLAB_VFALLBACK);
 
 	register_shrinker(&dcache_shrinker);
[15/17] SLUB: Support virtual fallback via SLAB_VFALLBACK
SLAB_VFALLBACK can be specified for selected slab caches. If fallback is available then the conservative settings for higher order allocations are overridden. We then request an order that can accommodate at minimum 100 objects. The size of an individual slab allocation is allowed to reach up to 256k (order 6 on i386, order 4 on IA64).

Implementing fallback requires special handling of virtual mappings in the free path. However, the impact is minimal since we already check whether the address is NULL or ZERO_SIZE_PTR. No additional cachelines are touched if we do not fall back. However, if we need to handle a virtual compound page then we walk the kernel page tables in the free paths to determine the page struct.

Signed-off-by: Christoph Lameter [EMAIL PROTECTED]

---
 include/linux/slab.h     |  1 +
 include/linux/slub_def.h |  1 +
 mm/slub.c                | 52 +++++++++++++++-----------
 3 files changed, 32 insertions(+), 22 deletions(-)

Index: linux-2.6/include/linux/slab.h
===
--- linux-2.6.orig/include/linux/slab.h	2007-09-24 20:34:14.0 -0700
+++ linux-2.6/include/linux/slab.h	2007-09-24 20:35:09.0 -0700
@@ -19,6 +19,7 @@
  * The ones marked DEBUG are only valid if CONFIG_SLAB_DEBUG is set.
  */
 #define SLAB_DEBUG_FREE		0x00000100UL	/* DEBUG: Perform (expensive) checks on free */
+#define SLAB_VFALLBACK		0x00000200UL	/* May fall back to vmalloc */
 #define SLAB_RED_ZONE		0x00000400UL	/* DEBUG: Red zone objs in a cache */
 #define SLAB_POISON		0x00000800UL	/* DEBUG: Poison objects */
 #define SLAB_HWCACHE_ALIGN	0x00002000UL	/* Align objs on cache lines */

Index: linux-2.6/mm/slub.c
===
--- linux-2.6.orig/mm/slub.c	2007-09-24 20:34:14.0 -0700
+++ linux-2.6/mm/slub.c	2007-09-24 20:35:09.0 -0700
@@ -285,7 +285,7 @@ static inline int check_valid_pointer(st
 	if (!object)
 		return 1;
 
-	base = page_address(page);
+	base = page_to_addr(page);
 	if (object < base || object >= base + s->objects * s->size ||
 		(object - base) % s->size) {
 		return 0;
@@ -470,7 +470,7 @@ static void slab_fix(struct kmem_cache *
 static void print_trailer(struct kmem_cache *s, struct page *page, u8 *p)
 {
 	unsigned int off;	/* Offset of last byte */
-	u8 *addr = page_address(page);
+	u8 *addr = page_to_addr(page);
 
 	print_tracking(s, p);
@@ -648,7 +648,7 @@ static int slab_pad_check(struct kmem_ca
 	if (!(s->flags & SLAB_POISON))
 		return 1;
 
-	start = page_address(page);
+	start = page_to_addr(page);
 	end = start + (PAGE_SIZE << s->order);
 	length = s->objects * s->size;
 	remainder = end - (start + length);
@@ -1049,11 +1049,7 @@ static struct page *allocate_slab(struct
 	struct page * page;
 	int pages = 1 << s->order;
 
-	if (s->order)
-		flags |= __GFP_COMP;
-
-	if (s->flags & SLAB_CACHE_DMA)
-		flags |= SLUB_DMA;
+	flags |= s->gfpflags;
 
 	if (node == -1)
 		page = alloc_pages(flags, s->order);
@@ -1107,7 +1103,7 @@ static struct page *new_slab(struct kmem
 			SLAB_STORE_USER | SLAB_TRACE))
 		SetSlabDebug(page);
 
-	start = page_address(page);
+	start = page_to_addr(page);
 	end = start + s->objects * s->size;
 
 	if (unlikely(s->flags & SLAB_POISON))
@@ -1139,7 +1135,7 @@ static void __free_slab(struct kmem_cach
 		void *p;
 
 		slab_pad_check(s, page);
-		for_each_object(p, s, page_address(page))
+		for_each_object(p, s, page_to_addr(page))
 			check_object(s, page, p, 0);
 		ClearSlabDebug(page);
 	}
@@ -1789,10 +1785,9 @@ static inline int slab_order(int size, i
 	return order;
 }
 
-static inline int calculate_order(int size)
+static inline int calculate_order(int size, int min_objects, int max_order)
 {
 	int order;
-	int min_objects;
 	int fraction;
 
 	/*
@@ -1803,13 +1798,12 @@ static inline int calculate_order(int si
 	 * First we reduce the acceptable waste in a slab. Then
 	 * we reduce the minimum objects required in a slab.
 	 */
-	min_objects = slub_min_objects;
 	while (min_objects > 1) {
 		fraction = 8;
 		while (fraction >= 4) {
 			order = slab_order(size, min_objects,
-				slub_max_order, fraction);
-			if (order <= slub_max_order)
+				max_order, fraction);
+			if (order <= max_order)
 				return order;
 			fraction /= 2;
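The reworked calculate_order(size, min_objects, max_order) searches for the smallest slab order that meets an object-count target. A simplified, hypothetical userspace model of that search (SLUB's waste-fraction refinement and the shrinking min_objects loop are omitted):

```c
#include <assert.h>

#define PAGE_SHIFT 12	/* assumes 4 KB pages */

/* Simplified stand-in for SLUB's calculate_order(): return the smallest
 * order whose slab (PAGE_SIZE << order bytes) fits at least min_objects
 * objects of `size` bytes, capped at max_order. */
static int calc_order(int size, int min_objects, int max_order)
{
	int order;

	for (order = 0; order <= max_order; order++) {
		int slab_size = 1 << (PAGE_SHIFT + order);

		if (slab_size / size >= min_objects)
			return order;
	}
	return max_order;	/* give up: the best we can do */
}
```

With SLAB_VFALLBACK, min_objects can be pushed to 100 and max_order toward order 6 because a failed contiguous allocation can still be satisfied virtually.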
[13/17] Virtual compound page freeing in interrupt context
If we are in an interrupt context then simply defer the free via a workqueue. Removing a virtual mapping *must* be done with interrupts enabled, since tlb_xx functions are called that rely on interrupts for processor-to-processor communication.

Signed-off-by: Christoph Lameter [EMAIL PROTECTED]

---
 mm/page_alloc.c | 23 ++++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)

Index: linux-2.6/mm/page_alloc.c
===
--- linux-2.6.orig/mm/page_alloc.c	2007-09-25 00:20:56.0 -0700
+++ linux-2.6/mm/page_alloc.c	2007-09-25 00:20:57.0 -0700
@@ -1294,7 +1294,12 @@ abort:
 	return NULL;
 }
 
-static void vcompound_free(void *addr)
+/*
+ * Virtual Compound freeing functions. This is complicated by the vmalloc
+ * layer not being able to free virtual allocations when interrupts are
+ * disabled. So we defer the frees via a workqueue if necessary.
+ */
+static void __vcompound_free(void *addr)
 {
 	struct page **pages;
 	int i;
@@ -1319,6 +1324,22 @@ static void vcompound_free(void *addr)
 	kfree(pages);
 }
 
+static void vcompound_free_work(struct work_struct *w)
+{
+	__vcompound_free((void *)w);
+}
+
+static noinline void vcompound_free(void *addr)
+{
+	if (in_interrupt()) {
+		struct work_struct *w = addr;
+
+		INIT_WORK(w, vcompound_free_work);
+		schedule_work(w);
+	} else
+		__vcompound_free(addr);
+}
+
 /*
  * This is the 'heart' of the zoned buddy allocator.
  */
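The control flow of the patch, stripped of kernel infrastructure, can be modeled in userspace: a flag stands in for in_interrupt() and a pending list stands in for the workqueue. The names and the fixed-size deferral list are illustrative only.

```c
#include <assert.h>
#include <stdlib.h>

/* Simulated context: stands in for the kernel's in_interrupt(). */
static int in_interrupt_sim;

/* Deferred-work list: stands in for INIT_WORK()/schedule_work(). */
static void *deferred[16];
static int ndeferred;

static int nfreed;
static void do_free(void *addr) { free(addr); nfreed++; }

/* The pattern: never tear down a mapping in a context that cannot
 * enable interrupts; queue it for later instead. */
static void vcompound_free(void *addr)
{
	if (in_interrupt_sim)
		deferred[ndeferred++] = addr;	/* schedule_work() analogue */
	else
		do_free(addr);
}

/* "Workqueue" runs later, in process context with interrupts on. */
static void run_deferred(void)
{
	while (ndeferred)
		do_free(deferred[--ndeferred]);
}
```

Note the kernel version's trick of reusing the freed area's own memory as the work_struct, which avoids allocating in a path that must not fail.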
[10/17] Use GFP_VFALLBACK for sparsemem.
Sparsemem currently attempts first to do a physically contiguous mapping and then falls back to vmalloc. The same thing can now be accomplished using GFP_VFALLBACK.

Signed-off-by: Christoph Lameter [EMAIL PROTECTED]

---
 mm/sparse.c | 23 +++--------------------
 1 file changed, 3 insertions(+), 20 deletions(-)

Index: linux-2.6/mm/sparse.c
===
--- linux-2.6.orig/mm/sparse.c	2007-09-19 18:05:34.0 -0700
+++ linux-2.6/mm/sparse.c	2007-09-19 18:27:25.0 -0700
@@ -269,32 +269,15 @@ void __init sparse_init(void)
 #ifdef CONFIG_MEMORY_HOTPLUG
 static struct page *__kmalloc_section_memmap(unsigned long nr_pages)
 {
-	struct page *page, *ret;
 	unsigned long memmap_size = sizeof(struct page) * nr_pages;
 
-	page = alloc_pages(GFP_KERNEL|__GFP_NOWARN, get_order(memmap_size));
-	if (page)
-		goto got_map_page;
-
-	ret = vmalloc(memmap_size);
-	if (ret)
-		goto got_map_ptr;
-
-	return NULL;
-got_map_page:
-	ret = (struct page *)pfn_to_kaddr(page_to_pfn(page));
-got_map_ptr:
-	memset(ret, 0, memmap_size);
-
-	return ret;
+	return (struct page *)__get_free_pages(GFP_VFALLBACK,
+		get_order(memmap_size));
 }
 
 static void __kfree_section_memmap(struct page *memmap, unsigned long nr_pages)
 {
-	if (is_vmalloc_addr(memmap))
-		vfree(memmap);
-	else
-		free_pages((unsigned long)memmap,
+	free_pages((unsigned long)memmap,
 		get_order(sizeof(struct page) * nr_pages));
 }
[09/17] VFALLBACK: Debugging aid
Virtual fallbacks are rare, and thus subtle bugs may creep in if we do not test the fallbacks. CONFIG_VFALLBACK_ALWAYS makes all GFP_VFALLBACK allocations fall back to virtual mapping.

Signed-off-by: Christoph Lameter [EMAIL PROTECTED]

---
 lib/Kconfig.debug | 11 +++++++++++
 mm/page_alloc.c   |  6 ++++++
 2 files changed, 17 insertions(+)

Index: linux-2.6/mm/page_alloc.c
===
--- linux-2.6.orig/mm/page_alloc.c	2007-09-24 18:48:03.0 -0700
+++ linux-2.6/mm/page_alloc.c	2007-09-24 18:58:52.0 -0700
@@ -1208,6 +1208,12 @@ zonelist_scan:
 		}
 	}
 
+#ifdef CONFIG_VFALLBACK_ALWAYS
+	if ((gfp_mask & __GFP_VFALLBACK) &&
+			system_state == SYSTEM_RUNNING)
+		return vcompound_alloc(gfp_mask, order,
+			zonelist, alloc_flags);
+#endif
 	page = buffered_rmqueue(zonelist, zone, order, gfp_mask);
 	if (page)
 		break;

Index: linux-2.6/lib/Kconfig.debug
===
--- linux-2.6.orig/lib/Kconfig.debug	2007-09-24 18:30:45.0 -0700
+++ linux-2.6/lib/Kconfig.debug	2007-09-24 18:48:06.0 -0700
@@ -105,6 +105,17 @@ config DETECT_SOFTLOCKUP
 	   can be detected via the NMI-watchdog, on platforms that
 	   support it.)
 
+config VFALLBACK_ALWAYS
+	bool "Always fall back to Virtual Compound pages"
+	default y
+	help
+	  Virtual compound pages are only allocated if there is no linear
+	  memory available. They are a fallback and errors created by the
+	  use of virtual mappings instead of linear ones may not surface
+	  because of their infrequent use. This option makes every
+	  allocation that allows a fallback to a virtual mapping use
+	  the virtual mapping. May have a significant performance impact.
+
 config SCHED_DEBUG
 	bool "Collect scheduler debugging info"
 	depends on DEBUG_KERNEL && PROC_FS
[12/17] Virtual Compound page allocation from interrupt context.
In an interrupt context we cannot wait for the vmlist_lock in __get_vm_area_node(). So use a trylock instead. If the trylock fails then the atomic allocation will fail and subsequently be retried.

This only works because the flush_cache_vunmap in use for allocation is never performing any IPIs, in contrast to the flush_tlb_... functions in use for freeing. flush_cache_vunmap is only used on architectures with a virtually mapped cache (xtensa, pa-risc).

[Note: Nick Piggin is working on a scheme to make this simpler by no longer requiring flushes]

Signed-off-by: Christoph Lameter [EMAIL PROTECTED]

---
 mm/vmalloc.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

Index: linux-2.6/mm/vmalloc.c
===
--- linux-2.6.orig/mm/vmalloc.c	2007-09-24 16:03:49.0 -0700
+++ linux-2.6/mm/vmalloc.c	2007-09-24 16:04:32.0 -0700
@@ -289,7 +289,6 @@ static struct vm_struct *__get_vm_area_n
 	unsigned long align = 1;
 	unsigned long addr;
 
-	BUG_ON(in_interrupt());
 	if (flags & VM_IOREMAP) {
 		int bit = fls(size);
 
@@ -314,7 +313,14 @@ static struct vm_struct *__get_vm_area_n
 	 */
 	size += PAGE_SIZE;
 
-	write_lock(&vmlist_lock);
+	if (gfp_mask & __GFP_WAIT)
+		write_lock(&vmlist_lock);
+	else {
+		if (!write_trylock(&vmlist_lock)) {
+			kfree(area);
+			return NULL;
+		}
+	}
 	for (p = &vmlist; (tmp = *p) != NULL; p = &tmp->next) {
 		if ((unsigned long)tmp->addr < addr) {
 			if ((unsigned long)tmp->addr + tmp->size >= addr)
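The lock-or-trylock split above is a general pattern: a caller that may sleep blocks on the lock, while an atomic caller must try once and report failure so the allocation is retried later. Modeled with pthreads (names are illustrative; the kernel uses an rwlock, not a mutex):

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;

/* may_wait mirrors gfp_mask & __GFP_WAIT: blocking callers take the
 * lock unconditionally; atomic callers must not sleep, so they use
 * trylock and fail the operation if the lock is contended. */
static int reserve_area(int may_wait)
{
	if (may_wait)
		pthread_mutex_lock(&list_lock);
	else if (pthread_mutex_trylock(&list_lock) != 0)
		return -1;	/* would have to sleep: fail, caller retries */

	/* ... insert into the area list here ... */
	pthread_mutex_unlock(&list_lock);
	return 0;
}
```

The kernel code additionally frees the partially set-up `area` before returning NULL, so the failed atomic path leaks nothing.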
[08/17] GFP_VFALLBACK: Allow fallback of compound pages to virtual mappings
Add a new gfp flag __GFP_VFALLBACK. If specified during a higher order allocation, the system will fall back to vmap and attempt to create a virtually contiguous area instead of a physically contiguous one. In many cases the virtually contiguous area can stand in for the physically contiguous area (with some loss of performance).

The pages used for VFALLBACK are marked with a new flag, PageVcompound(page). The mark is necessary since we have to know upon free whether we have to destroy a virtual mapping. No additional page flag is consumed: PG_swapcache is reused together with PG_compound (similar to PageHead() and PageTail()).

Also add a new function compound_nth_page(page, n) to find the nth page of a compound page. For real compound pages this simply reduces to page + n. For virtual compound pages we need to consult the page tables to figure out the nth page.

Add new page-to-address and address-to-page conversion functions:

	struct page *addr_to_page(const void *address);
	void *page_to_addr(struct page *);

These allow the conversion of vmalloc areas to the corresponding page structs that back them and vice versa. If the address or the page struct is not part of a vmalloc area, they fall back to virt_to_page() and page_address().
Signed-off-by: Christoph Lameter [EMAIL PROTECTED]

---
 include/linux/gfp.h        |   5 +
 include/linux/mm.h         |  33 ++++++--
 include/linux/page-flags.h |  18 +++++
 mm/page_alloc.c            | 131 ++++++++++++++++++++++++++++-
 mm/vmalloc.c               |  10 +++
 5 files changed, 179 insertions(+), 18 deletions(-)

Index: linux-2.6/mm/page_alloc.c
===
--- linux-2.6.orig/mm/page_alloc.c	2007-09-25 10:22:16.0 -0700
+++ linux-2.6/mm/page_alloc.c	2007-09-25 10:22:36.0 -0700
@@ -60,6 +60,8 @@
 long nr_swap_pages;
 int percpu_pagelist_fraction;
 
 static void __free_pages_ok(struct page *page, unsigned int order);
+static struct page *vcompound_alloc(gfp_t, int,
+		struct zonelist *, unsigned long);
 
 /*
  * results with 256, 32 in the lowmem_reserve sysctl:
@@ -251,7 +253,7 @@ static void prep_compound_page(struct pa
 	set_compound_order(page, order);
 	__SetPageHead(page);
 	for (i = 1; i < nr_pages; i++) {
-		struct page *p = page + i;
+		struct page *p = compound_nth_page(page, i);
 
 		__SetPageTail(p);
 		p->first_page = page;
@@ -266,17 +268,23 @@ static void destroy_compound_page(struct
 	if (unlikely(compound_order(page) != order))
 		bad_page(page);
 
-	if (unlikely(!PageHead(page)))
-		bad_page(page);
-	__ClearPageHead(page);
 	for (i = 1; i < nr_pages; i++) {
-		struct page *p = page + i;
+		struct page *p = compound_nth_page(page, i);
 
 		if (unlikely(!PageTail(p) || (p->first_page != page)))
 			bad_page(page);
 		__ClearPageTail(p);
 	}
+
+	/*
+	 * The PageHead is important since it determines how operations on
+	 * a compound page have to be performed. We can only tear the head
+	 * down after all the tail pages are done.
+	 */
+	if (unlikely(!PageHead(page)))
+		bad_page(page);
+	__ClearPageHead(page);
 }
 
 static inline void prep_zero_page(struct page *page, int order, gfp_t gfp_flags)
@@ -1230,6 +1238,82 @@ try_next_zone:
 }
 
 /*
+ * Virtual Compound Page support.
+ *
+ * Virtual Compound Pages are used to fall back to order 0 allocations if large
+ * linear mappings are not available and __GFP_VFALLBACK is set. They are
+ * formatted according to compound page conventions. I.e. following
+ * page->first_page if PageTail(page) is set can be used to determine the
+ * head page.
+ */
+static noinline struct page *vcompound_alloc(gfp_t gfp_mask, int order,
+		struct zonelist *zonelist, unsigned long alloc_flags)
+{
+	void *addr;
+	struct page *page;
+	int i;
+	int nr_pages = 1 << order;
+	struct page **pages = kmalloc(nr_pages * sizeof(struct page *),
+			gfp_mask & GFP_LEVEL_MASK);
+
+	if (!pages)
+		return NULL;
+
+	for (i = 0; i < nr_pages; i++) {
+		page = get_page_from_freelist(gfp_mask & ~__GFP_VFALLBACK,
+			0, zonelist, alloc_flags);
+		if (!page)
+			goto abort;
+
+		/* Sets PageCompound which makes PageHead(page) true */
+		__SetPageVcompound(page);
+		pages[i] = page;
+	}
+	addr = vmap(pages, nr_pages, VM_MAP, PAGE_KERNEL);
+	if (!addr)
+		goto abort;
+
+	prep_compound_page(pages[0], order);
Re: [patch 2/4] ext2: fix rec_len overflow for 64KB block size
On Sep 25, 2007 16:30 -0700, Christoph Lameter wrote:
> [2/4] ext2: fix rec_len overflow - prevent rec_len from overflow with 64KB blocksize
>
> Signed-off-by: Takashi Sato [EMAIL PROTECTED]
> Signed-off-by: Mingming Cao [EMAIL PROTECTED]
> Signed-off-by: Christoph Lameter [EMAIL PROTECTED]

Note that we just got a cleaner implementation of this code on the ext4 mailing list from Jan Kara yesterday. Please use that one instead, in the thread "Avoid rec_len overflow with 64KB block size".

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
[GIT PULL -mm] 00/25 Unionfs updates/cleanups/fixes
The following is a series of patches related to Unionfs. Aside from a few minor cleanups/fixes, the two main changes are (1) lower nameidata support so we can stack on nfsv4, and (2) un/likely optimizations. These patches were tested (where appropriate) on our 2.6.23-rc8 latest code, as well as the backports to 2.6.{22,21,20,19,18,9} on ext2/3/4, xfs, reiserfs, nfs2/3/4, jffs2, ramfs, tmpfs, cramfs, and squashfs (where available). See http://unionfs.filesystems.org/ to download backported unionfs code. Please pull from the 'master' branch of git://git.kernel.org/pub/scm/linux/kernel/git/ezk/unionfs.git to receive the following: Erez Zadok (22): Unionfs: display informational messages only if debug is on Unionfs: cast page-index loff_t before shifting Unionfs: minor coding style updates Unionfs: add lower nameidata debugging support Unionfs: lower nameidata support for nfsv4 Unionfs: add un/likely conditionals on common fileops Unionfs: add un/likely conditionals on copyup ops Unionfs: add un/likely conditionals on debug ops Unionfs: add un/likely conditionals on dentry ops Unionfs: add un/likely conditionals on dir ops Unionfs: add un/likely conditionals on headers Unionfs: add un/likely conditionals on fileops Unionfs: add un/likely conditionals on inode ops Unionfs: add un/likely conditionals on lookup ops Unionfs: add un/likely conditionals on super ops Unionfs: add un/likely conditionals on mmap ops Unionfs: add un/likely conditionals on rename ops Unionfs: add un/likely conditionals on readdir ops Unionfs: add un/likely conditionals on common subr Unionfs: add un/likely conditionals on unlink ops Unionfs: add un/likely conditionals on xattr ops Unionfs: use poison.h for safe poison pointers Josef 'Jeff' Sipek (2): Unionfs: Simplify unionfs_get_nlinks Unionfs: Remove unused #defines Olivier Blin (1): Unionfs: cache-coherency fixes commonfops.c | 98 +++ copyup.c | 102 debug.c | 140 ++-- dentry.c | 87 +++ dirfops.c| 22 +++--- dirhelper.c | 30 - fanout.h | 13 ++-- 
file.c | 38 ++-- inode.c | 186 +++ lookup.c | 60 +++ main.c | 102 mmap.c | 33 +- rdstate.c| 15 ++-- rename.c | 96 +++--- sioq.c |4 - subr.c | 67 ++--- super.c | 90 ++-- union.h | 19 +++--- unlink.c | 32 +- xattr.c | 12 +-- 20 files changed, 647 insertions(+), 599 deletions(-) --- Erez Zadok [EMAIL PROTECTED] - To unsubscribe from this list: send the line unsubscribe linux-fsdevel in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 05/25] Unionfs: cast page-index loff_t before shifting
Fixes bugs in number promotion/demotion computation, as per http://lkml.org/lkml/2007/9/20/17

Signed-off-by: Erez Zadok [EMAIL PROTECTED]
Acked-by: Josef 'Jeff' Sipek [EMAIL PROTECTED]

---
 fs/unionfs/mmap.c |  5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/fs/unionfs/mmap.c b/fs/unionfs/mmap.c
index 88ef6a6..37af979 100644
--- a/fs/unionfs/mmap.c
+++ b/fs/unionfs/mmap.c
@@ -179,7 +179,8 @@ static int unionfs_do_readpage(struct file *file, struct page *page)
 	 * may be a little slower, but a lot safer, as the VFS does a lot of
 	 * the necessary magic for us.
 	 */
-	offset = lower_file->f_pos = (page->index << PAGE_CACHE_SHIFT);
+	offset = lower_file->f_pos =
+		((loff_t) page->index << PAGE_CACHE_SHIFT);
 	old_fs = get_fs();
 	set_fs(KERNEL_DS);
 	err = vfs_read(lower_file, page_data, PAGE_CACHE_SIZE,
@@ -289,7 +290,7 @@ static int unionfs_commit_write(struct file *file, struct page *page,
 	BUG_ON(lower_file == NULL);
 
 	page_data = (char *)kmap(page);
-	lower_file->f_pos = (page->index << PAGE_CACHE_SHIFT) + from;
+	lower_file->f_pos = ((loff_t) page->index << PAGE_CACHE_SHIFT) + from;
 
 	/*
 	 * SP: I use vfs_write instead of copying page data and the
--
1.5.2.2
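The bug the cast fixes is a classic width-promotion trap: when page->index is a 32-bit unsigned long, the shift is performed in 32 bits and wraps *before* the result is widened to loff_t. Demonstrated with explicit types (the typedef and shift value are illustrative stand-ins for the kernel's):

```c
#include <assert.h>

typedef long long loff_t_demo;	/* stands in for the kernel's loff_t */
#define PAGE_CACHE_SHIFT 12

/* Buggy: the shift happens in 32-bit unsigned arithmetic, so for large
 * indices the high bits are lost before the widening conversion. */
static loff_t_demo offset_buggy(unsigned int index)
{
	return index << PAGE_CACHE_SHIFT;
}

/* Fixed: widen first, then shift, as the patch does with (loff_t). */
static loff_t_demo offset_fixed(unsigned int index)
{
	return (loff_t_demo)index << PAGE_CACHE_SHIFT;
}
```

With a 4 KB page size, any file offset at or beyond 4 GB (page index 2^20 on a 32-bit box with PAGE_CACHE_SHIFT 12... in general, index >= 2^(32 - PAGE_CACHE_SHIFT)) is silently wrapped by the buggy form.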
[PATCH 24/25] Unionfs: add un/likely conditionals on xattr ops
Signed-off-by: Erez Zadok [EMAIL PROTECTED]

---
 fs/unionfs/xattr.c | 12 ++--
 1 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/fs/unionfs/xattr.c b/fs/unionfs/xattr.c
index 7f77d7d..bd2de06 100644
--- a/fs/unionfs/xattr.c
+++ b/fs/unionfs/xattr.c
@@ -23,14 +23,14 @@ void *unionfs_xattr_alloc(size_t size, size_t limit)
 {
 	void *ptr;
 
-	if (size > limit)
+	if (unlikely(size > limit))
 		return ERR_PTR(-E2BIG);
 
 	if (!size)		/* size request, no buffer is needed */
 		return NULL;
 
 	ptr = kmalloc(size, GFP_KERNEL);
-	if (!ptr)
+	if (unlikely(!ptr))
 		return ERR_PTR(-ENOMEM);
 	return ptr;
 }
@@ -48,7 +48,7 @@ ssize_t unionfs_getxattr(struct dentry *dentry, const char *name, void *value,
 	unionfs_read_lock(dentry->d_sb);
 	unionfs_lock_dentry(dentry);
 
-	if (!__unionfs_d_revalidate_chain(dentry, NULL, false)) {
+	if (unlikely(!__unionfs_d_revalidate_chain(dentry, NULL, false))) {
 		err = -ESTALE;
 		goto out;
 	}
@@ -77,7 +77,7 @@ int unionfs_setxattr(struct dentry *dentry, const char *name,
 	unionfs_read_lock(dentry->d_sb);
 	unionfs_lock_dentry(dentry);
 
-	if (!__unionfs_d_revalidate_chain(dentry, NULL, false)) {
+	if (unlikely(!__unionfs_d_revalidate_chain(dentry, NULL, false))) {
 		err = -ESTALE;
 		goto out;
 	}
@@ -106,7 +106,7 @@ int unionfs_removexattr(struct dentry *dentry, const char *name)
 	unionfs_read_lock(dentry->d_sb);
 	unionfs_lock_dentry(dentry);
 
-	if (!__unionfs_d_revalidate_chain(dentry, NULL, false)) {
+	if (unlikely(!__unionfs_d_revalidate_chain(dentry, NULL, false))) {
 		err = -ESTALE;
 		goto out;
 	}
@@ -135,7 +135,7 @@ ssize_t unionfs_listxattr(struct dentry *dentry, char *list, size_t size)
 	unionfs_read_lock(dentry->d_sb);
 	unionfs_lock_dentry(dentry);
 
-	if (!__unionfs_d_revalidate_chain(dentry, NULL, false)) {
+	if (unlikely(!__unionfs_d_revalidate_chain(dentry, NULL, false))) {
 		err = -ESTALE;
 		goto out;
 	}
--
1.5.2.2
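The likely()/unlikely() annotations sprinkled through these patches are thin wrappers around GCC's __builtin_expect(): they steer block placement and branch prediction but never change the value of the condition. A minimal userspace sketch, with a simplified allocator in the spirit of unionfs_xattr_alloc (error returns collapsed to NULL here; the real code returns ERR_PTR() codes):

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* What the kernel macros expand to: hint which way the branch usually
 * goes. !! normalizes any truthy value to 0 or 1. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical simplified analogue of unionfs_xattr_alloc(). */
static void *xattr_alloc(size_t size, size_t limit)
{
	if (unlikely(size > limit))	/* error paths are marked unlikely */
		return NULL;
	if (!size)			/* size probe: no buffer needed */
		return NULL;
	return malloc(size);
}
```

Because the hints only affect code layout, sprinkling them on paths that are actually common (as some reviewers noted about such patches) can hurt rather than help; they are best reserved for genuine error paths.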
[PATCH 14/25] Unionfs: add un/likely conditionals on headers
Signed-off-by: Erez Zadok [EMAIL PROTECTED]
---
 fs/unionfs/fanout.h | 13 ++++++++-----
 fs/unionfs/union.h  |  4 ++--
 2 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/fs/unionfs/fanout.h b/fs/unionfs/fanout.h
index 51aa0de..6405399 100644
--- a/fs/unionfs/fanout.h
+++ b/fs/unionfs/fanout.h
@@ -308,17 +308,20 @@ static inline void unionfs_copy_attr_times(struct inode *upper)
 	int bindex;
 	struct inode *lower;
 
-	if (!upper || ibstart(upper) < 0)
+	if (unlikely(!upper || ibstart(upper) < 0))
 		return;
 	for (bindex = ibstart(upper); bindex <= ibend(upper); bindex++) {
 		lower = unionfs_lower_inode_idx(upper, bindex);
-		if (!lower)
+		if (unlikely(!lower))
 			continue; /* not all lower dir objects may exist */
-		if (timespec_compare(&upper->i_mtime, &lower->i_mtime) < 0)
+		if (unlikely(timespec_compare(&upper->i_mtime,
+					      &lower->i_mtime) < 0))
 			upper->i_mtime = lower->i_mtime;
-		if (timespec_compare(&upper->i_ctime, &lower->i_ctime) < 0)
+		if (likely(timespec_compare(&upper->i_ctime,
+					    &lower->i_ctime) < 0))
 			upper->i_ctime = lower->i_ctime;
-		if (timespec_compare(&upper->i_atime, &lower->i_atime) < 0)
+		if (likely(timespec_compare(&upper->i_atime,
+					    &lower->i_atime) < 0))
 			upper->i_atime = lower->i_atime;
 	}
 }
diff --git a/fs/unionfs/union.h b/fs/unionfs/union.h
index d27844d..8df44a9 100644
--- a/fs/unionfs/union.h
+++ b/fs/unionfs/union.h
@@ -472,7 +472,7 @@ static inline struct vfsmount *unionfs_mntget(struct dentry *dentry,
 	mnt = mntget(unionfs_lower_mnt_idx(dentry, bindex));
 
 #ifdef CONFIG_UNION_FS_DEBUG
-	if (!mnt)
+	if (unlikely(!mnt))
 		printk(KERN_DEBUG "unionfs_mntget: mnt=%p bindex=%d\n",
 		       mnt, bindex);
 #endif /* CONFIG_UNION_FS_DEBUG */
@@ -484,7 +484,7 @@ static inline void unionfs_mntput(struct dentry *dentry, int bindex)
 {
 	struct vfsmount *mnt;
 
-	if (!dentry && bindex < 0)
+	if (unlikely(!dentry && bindex < 0))
 		return;
 	BUG_ON(!dentry || bindex < 0);
--
1.5.2.2
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
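An aside for readers reviewing this series who don't hack on kernel internals: `likely()`/`unlikely()` expand to GCC's `__builtin_expect()`, which tells the compiler which way a branch usually goes so it can lay the hot path out as fall-through code. A minimal userspace sketch follows; the macro bodies mirror the kernel's include/linux/compiler.h, but `check_index()` is a made-up example, not unionfs code:

```c
#include <assert.h>

/* Same definitions the kernel uses in include/linux/compiler.h.
 * The double negation normalizes any truthy value to exactly 0 or 1. */
#define likely(x)	__builtin_expect(!!(x), 1)
#define unlikely(x)	__builtin_expect(!!(x), 0)

/* Made-up example: the error check is annotated as the cold path. */
static int check_index(int bindex)
{
	if (unlikely(bindex < 0))
		return -1;	/* rare error path, moved out of line */
	return 0;		/* common path, laid out fall-through */
}
```

The hints change only code layout and static branch prediction, never behavior, which is why a patch series like this one can be purely mechanical.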
[PATCH 03/25] Unionfs: display informational messages only if debug is on
This is to avoid filling the console/logs with messages that are primarily
of debugging use.

Signed-off-by: Erez Zadok [EMAIL PROTECTED]
Acked-by: Josef 'Jeff' Sipek [EMAIL PROTECTED]
---
 fs/unionfs/commonfops.c | 4 ++--
 fs/unionfs/dentry.c     | 6 +++---
 fs/unionfs/union.h      | 4 ++++
 3 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/fs/unionfs/commonfops.c b/fs/unionfs/commonfops.c
index 87cbb09..e69ccf6 100644
--- a/fs/unionfs/commonfops.c
+++ b/fs/unionfs/commonfops.c
@@ -394,8 +394,8 @@ int unionfs_file_revalidate(struct file *file, bool willwrite)
 	if (willwrite && IS_WRITE_FLAG(file->f_flags) &&
 	    !IS_WRITE_FLAG(unionfs_lower_file(file)->f_flags) &&
 	    is_robranch(dentry)) {
-		printk(KERN_DEBUG "unionfs: do delay copyup of \"%s\"\n",
-		       dentry->d_name.name);
+		dprintk(KERN_DEBUG "unionfs: do delay copyup of \"%s\"\n",
+			dentry->d_name.name);
 		err = do_delayed_copyup(file);
 	}
diff --git a/fs/unionfs/dentry.c b/fs/unionfs/dentry.c
index 9e0742d..08b5722 100644
--- a/fs/unionfs/dentry.c
+++ b/fs/unionfs/dentry.c
@@ -46,9 +46,9 @@ static bool __unionfs_d_revalidate_one(struct dentry *dentry,
 	/* if the dentry is unhashed, do NOT revalidate */
 	if (d_deleted(dentry)) {
-		printk(KERN_DEBUG "unionfs: unhashed dentry being "
-		       "revalidated: %*s\n",
-		       dentry->d_name.len, dentry->d_name.name);
+		dprintk(KERN_DEBUG "unionfs: unhashed dentry being "
+			"revalidated: %*s\n",
+			dentry->d_name.len, dentry->d_name.name);
 		goto out;
 	}
diff --git a/fs/unionfs/union.h b/fs/unionfs/union.h
index 140b8ae..5e9843b 100644
--- a/fs/unionfs/union.h
+++ b/fs/unionfs/union.h
@@ -507,6 +507,8 @@ static inline void unionfs_mntput(struct dentry *dentry, int bindex)
 
 #ifdef CONFIG_UNION_FS_DEBUG
 
+#define dprintk(args...) printk(args)
+
 /* useful for tracking code reachability */
 #define UDBG printk("DBG:%s:%s:%d\n", __FILE__, __FUNCTION__, __LINE__)
@@ -543,6 +545,8 @@ extern void __show_inode_counts(const struct inode *inode,
 
 #else /* not CONFIG_UNION_FS_DEBUG */
 
+#define dprintk(args...) do { } while (0)
+
 /* we leave useful hooks for these check functions throughout the code */
 #define unionfs_check_inode(i)	do { } while(0)
 #define unionfs_check_dentry(d)	do { } while(0)
--
1.5.2.2
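The `dprintk` added above is the classic compile-time-gated debug macro: when the config knob is off, the call and its arguments compile to nothing. A userspace sketch, where `UNIONFS_DEBUG` stands in for `CONFIG_UNION_FS_DEBUG` and `revalidate_one()` is an illustrative stand-in:

```c
#include <stdio.h>
#include <assert.h>

/* With the debug knob undefined, dprintk() expands to an empty
 * statement, so the format string and arguments cost nothing. */
#ifdef UNIONFS_DEBUG
#define dprintk(args...) printf(args)
#else
#define dprintk(args...) do { } while (0)
#endif

/* Illustrative function, not unionfs code. */
static int revalidate_one(const char *name)
{
	dprintk("unionfs: revalidating \"%s\"\n", name);
	return 0;	/* pretend the dentry is valid */
}
```

The `do { } while (0)` body matters: it swallows the trailing semicolon, so a bare `dprintk(...)` remains a single well-formed statement even inside an unbraced `if`/`else`.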
[PATCH 21/25] Unionfs: add un/likely conditionals on readdir ops
Signed-off-by: Erez Zadok [EMAIL PROTECTED]
---
 fs/unionfs/rdstate.c | 15 ++++++++-------
 1 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/fs/unionfs/rdstate.c b/fs/unionfs/rdstate.c
index 0a18d5c..7ec7f95 100644
--- a/fs/unionfs/rdstate.c
+++ b/fs/unionfs/rdstate.c
@@ -45,7 +45,7 @@ int unionfs_init_filldir_cache(void)
 void unionfs_destroy_filldir_cache(void)
 {
-	if (unionfs_filldir_cachep)
+	if (likely(unionfs_filldir_cachep))
 		kmem_cache_destroy(unionfs_filldir_cachep);
 }
@@ -72,7 +72,8 @@ static int guesstimate_hash_size(struct inode *inode)
 		return UNIONFS_I(inode)->hashsize;
 
 	for (bindex = ibstart(inode); bindex <= ibend(inode); bindex++) {
-		if (!(lower_inode = unionfs_lower_inode_idx(inode, bindex)))
+		lower_inode = unionfs_lower_inode_idx(inode, bindex);
+		if (unlikely(!lower_inode))
 			continue;
 
 		if (lower_inode->i_size == DENTPAGE)
@@ -136,7 +137,7 @@ struct unionfs_dir_state *alloc_rdstate(struct inode *inode, int bindex)
 		sizeof(struct list_head);
 
 	rdstate = kmalloc(mallocsize, GFP_KERNEL);
-	if (!rdstate)
+	if (unlikely(!rdstate))
 		return NULL;
 
 	spin_lock(&UNIONFS_I(inode)->rdlock);
@@ -217,7 +218,7 @@ struct filldir_node *find_filldir_node(struct unionfs_dir_state *rdstate,
 		 * if the duplicate is in this branch, then the file
		 * system is corrupted.
 		 */
-		if (cursor->bindex == rdstate->bindex) {
+		if (unlikely(cursor->bindex == rdstate->bindex)) {
 			printk(KERN_DEBUG "unionfs: filldir: possible "
 			       "I/O error: a file is duplicated "
 			       "in the same branch %d: %s\n",
@@ -227,7 +228,7 @@ struct filldir_node *find_filldir_node(struct unionfs_dir_state *rdstate,
 	}
 
-	if (!found)
+	if (unlikely(!found))
 		cursor = NULL;
 
 	return cursor;
@@ -249,7 +250,7 @@ int add_filldir_node(struct unionfs_dir_state *rdstate, const char *name,
 	head = &(rdstate->list[index]);
 
 	new = kmem_cache_alloc(unionfs_filldir_cachep, GFP_KERNEL);
-	if (!new) {
+	if (unlikely(!new)) {
 		err = -ENOMEM;
 		goto out;
 	}
@@ -264,7 +265,7 @@ int add_filldir_node(struct unionfs_dir_state *rdstate, const char *name,
 		new->name = new->iname;
 	else {
 		new->name = kmalloc(namelen + 1, GFP_KERNEL);
-		if (!new->name) {
+		if (unlikely(!new->name)) {
 			kmem_cache_free(unionfs_filldir_cachep, new);
 			new = NULL;
 			goto out;
--
1.5.2.2
[PATCH 19/25] Unionfs: add un/likely conditionals on mmap ops
Signed-off-by: Erez Zadok [EMAIL PROTECTED]
---
 fs/unionfs/mmap.c | 28 ++++++++++++++--------------
 1 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/fs/unionfs/mmap.c b/fs/unionfs/mmap.c
index 37af979..1cea075 100644
--- a/fs/unionfs/mmap.c
+++ b/fs/unionfs/mmap.c
@@ -84,7 +84,7 @@ static int unionfs_writepage(struct page *page, struct writeback_control *wbc)
 	 * resort to RAIF's page pointer flipping trick.) */
 	lower_page = find_lock_page(lower_inode->i_mapping, page->index);
-	if (!lower_page) {
+	if (unlikely(!lower_page)) {
 		err = AOP_WRITEPAGE_ACTIVATE;
 		set_page_dirty(page);
 		goto out;
@@ -102,7 +102,7 @@ static int unionfs_writepage(struct page *page, struct writeback_control *wbc)
 	BUG_ON(!lower_inode->i_mapping->a_ops->writepage);
 
 	/* workaround for some lower file systems: see big comment on top */
-	if (wbc->for_writepages && !wbc->fs_private)
+	if (unlikely(wbc->for_writepages && !wbc->fs_private))
 		wbc->for_writepages = 0;
 
 	/* call lower writepage (expects locked page) */
@@ -111,12 +111,12 @@ static int unionfs_writepage(struct page *page, struct writeback_control *wbc)
 	wbc->for_writepages = saved_for_writepages; /* restore value */
 
 	/* b/c find_lock_page locked it and ->writepage unlocks on success */
-	if (err)
+	if (unlikely(err))
 		unlock_page(lower_page);
 
 	/* b/c grab_cache_page increased refcnt */
 	page_cache_release(lower_page);
 
-	if (err < 0) {
+	if (unlikely(err < 0)) {
 		ClearPageUptodate(page);
 		goto out;
 	}
@@ -160,7 +160,7 @@ static int unionfs_do_readpage(struct file *file, struct page *page)
 	char *page_data = NULL;
 	loff_t offset;
 
-	if (!UNIONFS_F(file)) {
+	if (unlikely(!UNIONFS_F(file))) {
 		err = -ENOENT;
 		goto out;
 	}
@@ -189,7 +189,7 @@ static int unionfs_do_readpage(struct file *file, struct page *page)
 	kunmap(page);
 
-	if (err < 0)
+	if (unlikely(err < 0))
 		goto out;
 	err = 0;
@@ -199,7 +199,7 @@ static int unionfs_do_readpage(struct file *file, struct page *page)
 	flush_dcache_page(page);
 
 out:
-	if (err == 0)
+	if (likely(err == 0))
 		SetPageUptodate(page);
 	else
 		ClearPageUptodate(page);
@@ -212,13 +212,13 @@ static int unionfs_readpage(struct file *file, struct page *page)
 	int err;
 
 	unionfs_read_lock(file->f_path.dentry->d_sb);
-	if ((err = unionfs_file_revalidate(file, false)))
+	if (unlikely((err = unionfs_file_revalidate(file, false))))
 		goto out;
 	unionfs_check_file(file);
 
 	err = unionfs_do_readpage(file, page);
 
-	if (!err) {
+	if (likely(!err)) {
 		touch_atime(unionfs_lower_mnt(file->f_path.dentry),
 			    unionfs_lower_dentry(file->f_path.dentry));
 		unionfs_copy_attr_times(file->f_path.dentry->d_inode);
@@ -276,14 +276,14 @@ static int unionfs_commit_write(struct file *file, struct page *page,
 	BUG_ON(file == NULL);
 
 	unionfs_read_lock(file->f_path.dentry->d_sb);
-	if ((err = unionfs_file_revalidate(file, true)))
+	if (unlikely((err = unionfs_file_revalidate(file, true))))
 		goto out;
 	unionfs_check_file(file);
 
 	inode = page->mapping->host;
 	lower_inode = unionfs_lower_inode(inode);
 
-	if (UNIONFS_F(file) != NULL)
+	if (likely(UNIONFS_F(file) != NULL))
 		lower_file = unionfs_lower_file(file);
 
 	/* FIXME: is this assertion right here? */
@@ -307,7 +307,7 @@ static int unionfs_commit_write(struct file *file, struct page *page,
 	kunmap(page);
 
-	if (err < 0)
+	if (unlikely(err < 0))
 		goto out;
 
 	inode->i_blocks = lower_inode->i_blocks;
@@ -320,7 +320,7 @@ static int unionfs_commit_write(struct file *file, struct page *page,
 	mark_inode_dirty_sync(inode);
 
 out:
-	if (err < 0)
+	if (unlikely(err < 0))
 		ClearPageUptodate(page);
 
 	unionfs_read_unlock(file->f_path.dentry->d_sb);
@@ -347,7 +347,7 @@ static void unionfs_sync_page(struct page *page)
 	 * do is ensure that pending I/O gets done. */
 	lower_page = find_lock_page(lower_inode->i_mapping, page->index);
-	if (!lower_page) {
+	if (unlikely(!lower_page)) {
 		printk(KERN_DEBUG "unionfs: find_lock_page failed\n");
 		goto out;
 	}
--
1.5.2.2
[PATCH 08/25] Unionfs: lower nameidata support for nfsv4
Pass nameidata structures as needed to the lower file system, and support
LOOKUP_ACCESS/OPEN intents. This makes unionfs work on top of nfsv4.

Signed-off-by: Erez Zadok [EMAIL PROTECTED]
Acked-by: Josef 'Jeff' Sipek [EMAIL PROTECTED]
---
 fs/unionfs/dentry.c | 11 +++++++++--
 fs/unionfs/inode.c  |  8 +++++++-
 fs/unionfs/lookup.c | 20 +++++++++++++++++---
 3 files changed, 33 insertions(+), 6 deletions(-)

diff --git a/fs/unionfs/dentry.c b/fs/unionfs/dentry.c
index b21f1e3..52bcb18 100644
--- a/fs/unionfs/dentry.c
+++ b/fs/unionfs/dentry.c
@@ -156,8 +156,15 @@ static bool __unionfs_d_revalidate_one(struct dentry *dentry,
 		if (!lower_dentry || !lower_dentry->d_op
 		    || !lower_dentry->d_op->d_revalidate)
 			continue;
-		if (!lower_dentry->d_op->d_revalidate(lower_dentry,
-						      lowernd))
+		/*
+		 * Don't pass nameidata to lower file system, because we
+		 * don't want an arbitrary lower file being opened or
+		 * returned to us: it may be useless to us because of the
+		 * fanout nature of unionfs (cf. file/directory open-file
+		 * invariants).  We will open lower files as and when needed
+		 * later on.
+		 */
+		if (!lower_dentry->d_op->d_revalidate(lower_dentry, NULL))
 			valid = false;
 	}
diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c
index f8b2c88..7ee4760 100644
--- a/fs/unionfs/inode.c
+++ b/fs/unionfs/inode.c
@@ -27,6 +27,7 @@ static int unionfs_create(struct inode *parent, struct dentry *dentry,
 	struct dentry *lower_parent_dentry = NULL;
 	char *name = NULL;
 	int valid = 0;
+	struct nameidata lower_nd;
 
 	unionfs_read_lock(dentry->d_sb);
 	unionfs_lock_dentry(dentry);
@@ -113,7 +114,12 @@ static int unionfs_create(struct inode *parent, struct dentry *dentry,
 		goto out;
 	}
 
-	err = vfs_create(lower_parent_dentry->d_inode, lower_dentry, mode, nd);
+	err = init_lower_nd(&lower_nd, LOOKUP_CREATE);
+	if (err < 0)
+		goto out;
+	err = vfs_create(lower_parent_dentry->d_inode, lower_dentry, mode,
+			 &lower_nd);
+	release_lower_nd(&lower_nd, err);
 
 	if (!err) {
 		err = PTR_ERR(unionfs_interpose(dentry, parent->i_sb, 0));
diff --git a/fs/unionfs/lookup.c b/fs/unionfs/lookup.c
index 963d622..2109714 100644
--- a/fs/unionfs/lookup.c
+++ b/fs/unionfs/lookup.c
@@ -583,6 +583,11 @@ void update_bstart(struct dentry *dentry)
  * Inside that nd structure, this function may also return an allocated
  * struct file (for open intents).  The caller, when done with this nd, must
  * kfree the intent file (using release_lower_nd).
+ *
+ * XXX: this code, and the callers of this code, should be redone using
+ * vfs_path_lookup() when (1) the nameidata structure is refactored into a
+ * separate intent-structure, and (2) open_namei() is broken into a VFS-only
+ * function and a method that other file systems can call.
  */
 int init_lower_nd(struct nameidata *nd, unsigned int flags)
 {
@@ -597,11 +602,16 @@ int init_lower_nd(struct nameidata *nd, unsigned int flags)
 #endif /* ALLOC_LOWER_ND_FILE */
 
 	memset(nd, 0, sizeof(struct nameidata));
+	if (!flags)
+		return err;
+
 	switch (flags) {
 	case LOOKUP_CREATE:
-		nd->flags = LOOKUP_CREATE;
-		nd->intent.open.flags = FMODE_READ | FMODE_WRITE | O_CREAT;
+		nd->intent.open.flags |= O_CREAT;
+		/* fall through: shared code for create/open cases */
+	case LOOKUP_OPEN:
+		nd->flags = flags;
+		nd->intent.open.flags |= (FMODE_READ | FMODE_WRITE);
 #ifdef ALLOC_LOWER_ND_FILE
 		file = kzalloc(sizeof(struct file), GFP_KERNEL);
 		if (!file) {
@@ -611,11 +621,15 @@ int init_lower_nd(struct nameidata *nd, unsigned int flags)
 		nd->intent.open.file = file;
 #endif /* ALLOC_LOWER_ND_FILE */
 		break;
+	case LOOKUP_ACCESS:
+		nd->flags = flags;
+		break;
 	default:
 		/*
 		 * We should never get here, for now.
 		 * We can add new cases here later on.
 		 */
+		dprintk("unionfs: unknown nameidata flag 0x%x\n", flags);
 		BUG();
 		break;
 	}
@@ -627,7 +641,7 @@ void release_lower_nd(struct nameidata *nd, int err)
 {
 	if (!nd->intent.open.file)
 		return;
-	if (!err)
+	else if (!err)
 		release_open_intent(nd);
 #ifdef ALLOC_LOWER_ND_FILE
 	kfree(nd->intent.open.file);
--
1.5.2.2
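The `init_lower_nd()` change above relies on a deliberate switch fall-through: the CREATE case sets its extra flag, then drops into the OPEN case to pick up the shared read/write intent flags. A userspace sketch of just that control flow; the flag values here are made up for illustration and are not the kernel's:

```c
#include <assert.h>

/* Illustrative stand-in values, NOT the kernel's real flag bits. */
#define LOOKUP_OPEN	0x01
#define LOOKUP_CREATE	0x02
#define FMODE_READ	0x10
#define FMODE_WRITE	0x20
#define O_CREAT_BIT	0x40	/* stand-in for O_CREAT */

/* Compute the open-intent flags the way init_lower_nd() does:
 * CREATE adds O_CREAT, then falls through to the shared OPEN code. */
static unsigned int intent_flags(unsigned int flags)
{
	unsigned int open_flags = 0;

	switch (flags) {
	case LOOKUP_CREATE:
		open_flags |= O_CREAT_BIT;
		/* fall through: shared code for create/open cases */
	case LOOKUP_OPEN:
		open_flags |= (FMODE_READ | FMODE_WRITE);
		break;
	default:
		break;
	}
	return open_flags;
}
```

The fall-through keeps the read/write intent setup in one place instead of duplicating it per case; the comment at the fall-through point is what tells both reviewers and compilers that the missing `break` is intentional.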
[PATCH 07/25] Unionfs: add lower nameidata debugging support
Signed-off-by: Erez Zadok [EMAIL PROTECTED]
---
 fs/unionfs/debug.c  | 20 ++++++++++++++++++++
 fs/unionfs/dentry.c |  4 +++-
 fs/unionfs/inode.c  |  8 +++++++-
 fs/unionfs/union.h  |  4 ++++
 4 files changed, 34 insertions(+), 2 deletions(-)

diff --git a/fs/unionfs/debug.c b/fs/unionfs/debug.c
index 2d15fb0..9546a41 100644
--- a/fs/unionfs/debug.c
+++ b/fs/unionfs/debug.c
@@ -415,6 +415,26 @@ void __unionfs_check_file(const struct file *file,
 	__unionfs_check_dentry(dentry, fname, fxn, line);
 }
 
+void __unionfs_check_nd(const struct nameidata *nd,
+			const char *fname, const char *fxn, int line)
+{
+	struct file *file;
+	int printed_caller = 0;
+
+	if (!nd)
+		return;
+	if (nd->flags & LOOKUP_OPEN) {
+		file = nd->intent.open.file;
+		if (file->f_path.dentry &&
+		    strcmp(file->f_dentry->d_sb->s_type->name, "unionfs")) {
+			PRINT_CALLER(fname, fxn, line);
+			printk(" CND1: lower_file of type %s\n",
+			       file->f_path.dentry->d_sb->s_type->name);
+			BUG();
+		}
+	}
+}
+
 /* useful to track vfsmount leaks that could cause EBUSY on unmount */
 void __show_branch_counts(const struct super_block *sb,
 			  const char *file, const char *fxn, int line)
diff --git a/fs/unionfs/dentry.c b/fs/unionfs/dentry.c
index d9bb199..b21f1e3 100644
--- a/fs/unionfs/dentry.c
+++ b/fs/unionfs/dentry.c
@@ -418,8 +418,10 @@ static int unionfs_d_revalidate(struct dentry *dentry, struct nameidata *nd)
 	unionfs_lock_dentry(dentry);
 	err = __unionfs_d_revalidate_chain(dentry, nd, false);
 	unionfs_unlock_dentry(dentry);
-	if (err > 0) /* true==1: dentry is valid */
+	if (err > 0) { /* true==1: dentry is valid */
 		unionfs_check_dentry(dentry);
+		unionfs_check_nd(nd);
+	}
 
 	unionfs_read_unlock(dentry->d_sb);
diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c
index de78e26..f8b2c88 100644
--- a/fs/unionfs/inode.c
+++ b/fs/unionfs/inode.c
@@ -138,8 +138,10 @@ out:
 	unionfs_read_unlock(dentry->d_sb);
 	unionfs_check_inode(parent);
-	if (!err)
+	if (!err) {
 		unionfs_check_dentry(dentry->d_parent);
+		unionfs_check_nd(nd);
+	}
 	unionfs_check_dentry(dentry);
 	return err;
 }
@@ -186,6 +188,7 @@ static struct dentry *unionfs_lookup(struct inode *parent,
 	unionfs_check_inode(parent);
 	unionfs_check_dentry(dentry);
 	unionfs_check_dentry(dentry->d_parent);
+	unionfs_check_nd(nd);
 	unionfs_read_unlock(dentry->d_sb);
 
 	return ret;
@@ -856,6 +859,7 @@ static void *unionfs_follow_link(struct dentry *dentry, struct nameidata *nd)
 out:
 	unionfs_check_dentry(dentry);
+	unionfs_check_nd(nd);
 	unionfs_read_unlock(dentry->d_sb);
 	return ERR_PTR(err);
 }
@@ -872,6 +876,7 @@ static void unionfs_put_link(struct dentry *dentry, struct nameidata *nd,
 	unionfs_unlock_dentry(dentry);
 	unionfs_check_dentry(dentry);
+	unionfs_check_nd(nd);
 	kfree(nd_get_link(nd));
 	unionfs_read_unlock(dentry->d_sb);
 }
@@ -1002,6 +1007,7 @@ static int unionfs_permission(struct inode *inode, int mask,
 out:
 	unionfs_check_inode(inode);
+	unionfs_check_nd(nd);
 	return err;
 }
diff --git a/fs/unionfs/union.h b/fs/unionfs/union.h
index 755bc25..d27844d 100644
--- a/fs/unionfs/union.h
+++ b/fs/unionfs/union.h
@@ -518,6 +518,8 @@ static inline void unionfs_mntput(struct dentry *dentry, int bindex)
 					__FILE__,__FUNCTION__,__LINE__)
 #define unionfs_check_file(f)	__unionfs_check_file((f), \
 					__FILE__,__FUNCTION__,__LINE__)
+#define unionfs_check_nd(n)	__unionfs_check_nd((n), \
+					__FILE__,__FUNCTION__,__LINE__)
 #define show_branch_counts(sb)	__show_branch_counts((sb), \
 					__FILE__,__FUNCTION__,__LINE__)
 #define show_inode_times(i)	__show_inode_times((i), \
@@ -534,6 +536,8 @@ extern void __unionfs_check_dentry(const struct dentry *dentry,
 				   int line);
 extern void __unionfs_check_file(const struct file *file,
 				 const char *fname, const char *fxn, int line);
+extern void __unionfs_check_nd(const struct nameidata *nd,
+			       const char *fname, const char *fxn, int line);
 extern void __show_branch_counts(const struct super_block *sb,
 				 const char *file, const char *fxn, int line);
 extern void __show_inode_times(const struct inode *inode,
--
1.5.2.2
[PATCH 22/25] Unionfs: add un/likely conditionals on common subr
Signed-off-by: Erez Zadok [EMAIL PROTECTED]
---
 fs/unionfs/sioq.c |  4 ++--
 fs/unionfs/subr.c | 26 +++++++++++++-------------
 2 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/fs/unionfs/sioq.c b/fs/unionfs/sioq.c
index 2a8c88e..35d9fc3 100644
--- a/fs/unionfs/sioq.c
+++ b/fs/unionfs/sioq.c
@@ -28,7 +28,7 @@ int __init init_sioq(void)
 	int err;
 
 	superio_workqueue = create_workqueue("unionfs_siod");
-	if (!IS_ERR(superio_workqueue))
+	if (unlikely(!IS_ERR(superio_workqueue)))
 		return 0;
 
 	err = PTR_ERR(superio_workqueue);
@@ -39,7 +39,7 @@ int __init init_sioq(void)
 void stop_sioq(void)
 {
-	if (superio_workqueue)
+	if (likely(superio_workqueue))
 		destroy_workqueue(superio_workqueue);
 }
diff --git a/fs/unionfs/subr.c b/fs/unionfs/subr.c
index 6b93b64..6067d65 100644
--- a/fs/unionfs/subr.c
+++ b/fs/unionfs/subr.c
@@ -40,7 +40,7 @@ int create_whiteout(struct dentry *dentry, int start)
 	/* create dentry's whiteout equivalent */
 	name = alloc_whname(dentry->d_name.name, dentry->d_name.len);
-	if (IS_ERR(name)) {
+	if (unlikely(IS_ERR(name))) {
 		err = PTR_ERR(name);
 		goto out;
 	}
@@ -60,7 +60,7 @@ int create_whiteout(struct dentry *dentry, int start)
 						dentry,
 						dentry->d_name.name,
 						bindex);
-			if (!lower_dentry || IS_ERR(lower_dentry)) {
+			if (unlikely(!lower_dentry || IS_ERR(lower_dentry))) {
 				printk(KERN_DEBUG "unionfs: create_parents "
 				       "failed for bindex = %d\n", bindex);
 				continue;
@@ -70,7 +70,7 @@ int create_whiteout(struct dentry *dentry, int start)
 		lower_wh_dentry =
 			lookup_one_len(name, lower_dentry->d_parent,
 				       dentry->d_name.len + UNIONFS_WHLEN);
-		if (IS_ERR(lower_wh_dentry))
+		if (unlikely(IS_ERR(lower_wh_dentry)))
 			continue;
 
 		/*
@@ -84,7 +84,7 @@ int create_whiteout(struct dentry *dentry, int start)
 		}
 
 		err = init_lower_nd(&nd, LOOKUP_CREATE);
-		if (err < 0)
+		if (unlikely(err < 0))
 			goto out;
 		lower_dir_dentry = lock_parent(lower_wh_dentry);
 		if (!(err = is_robranch_super(dentry->d_sb, bindex)))
@@ -96,12 +96,12 @@ int create_whiteout(struct dentry *dentry, int start)
 		dput(lower_wh_dentry);
 		release_lower_nd(&nd, err);
 
-		if (!err || !IS_COPYUP_ERR(err))
+		if (unlikely(!err || !IS_COPYUP_ERR(err)))
 			break;
 	}
 
 	/* set dbopaque so that lookup will not proceed after this branch */
-	if (!err)
+	if (likely(!err))
 		set_dbopaque(dentry, bindex);
 
 out:
@@ -129,7 +129,7 @@ int unionfs_refresh_lower_dentry(struct dentry *dentry, int bindex)
 	lower_dentry = lookup_one_len(dentry->d_name.name, lower_parent,
 				      dentry->d_name.len);
-	if (IS_ERR(lower_dentry)) {
+	if (unlikely(IS_ERR(lower_dentry))) {
 		err = PTR_ERR(lower_dentry);
 		goto out;
 	}
@@ -138,7 +138,7 @@ int unionfs_refresh_lower_dentry(struct dentry *dentry, int bindex)
 	iput(unionfs_lower_inode_idx(dentry->d_inode, bindex));
 	unionfs_set_lower_inode_idx(dentry->d_inode, bindex, NULL);
 
-	if (!lower_dentry->d_inode) {
+	if (unlikely(!lower_dentry->d_inode)) {
 		dput(lower_dentry);
 		unionfs_set_lower_dentry_idx(dentry, bindex, NULL);
 	} else {
@@ -166,17 +166,17 @@ int make_dir_opaque(struct dentry *dentry, int bindex)
 	mutex_lock(&lower_dir->i_mutex);
 	diropq = lookup_one_len(UNIONFS_DIR_OPAQUE, lower_dentry,
 				sizeof(UNIONFS_DIR_OPAQUE) - 1);
-	if (IS_ERR(diropq)) {
+	if (unlikely(IS_ERR(diropq))) {
 		err = PTR_ERR(diropq);
 		goto out;
 	}
 
 	err = init_lower_nd(&nd, LOOKUP_CREATE);
-	if (err < 0)
+	if (unlikely(err < 0))
 		goto out;
 	if (!diropq->d_inode)
 		err = vfs_create(lower_dir, diropq, S_IRUGO, &nd);
-	if (!err)
+	if (likely(!err))
 		set_dbopaque(dentry, bindex);
 	release_lower_nd(&nd, err);
@@ -193,7 +193,7 @@ int unionfs_get_nlinks(const struct inode *inode)
 {
 	/* don't bother to do all the work since we're unlinked */
-	if (inode->i_nlink == 0)
+	if (unlikely(inode->i_nlink == 0))
 		return 0;
 
 	if (!S_ISDIR(inode->i_mode))
@@ -213,7 +213,7 @@ char
[PATCH 06/25] Unionfs: minor coding style updates
Signed-off-by: Erez Zadok [EMAIL PROTECTED]
---
 fs/unionfs/debug.c  |  6 ++++--
 fs/unionfs/dentry.c |  2 +-
 fs/unionfs/inode.c  | 14 ++++++++------
 fs/unionfs/main.c   |  4 ++--
 fs/unionfs/union.h  |  2 +-
 5 files changed, 16 insertions(+), 12 deletions(-)

diff --git a/fs/unionfs/debug.c b/fs/unionfs/debug.c
index f678534..2d15fb0 100644
--- a/fs/unionfs/debug.c
+++ b/fs/unionfs/debug.c
@@ -467,7 +467,8 @@ void __show_dinode_times(const struct dentry *dentry,
 		lower_inode = unionfs_lower_inode_idx(inode, bindex);
 		if (!lower_inode)
 			continue;
-		printk("DT(%s:%lu:%d): ", dentry->d_name.name, inode->i_ino, bindex);
+		printk("DT(%s:%lu:%d): ", dentry->d_name.name, inode->i_ino,
+		       bindex);
 		printk("%s:%s:%d ", file, fxn, line);
 		printk("um=%lu/%lu lm=%lu/%lu ",
 		       inode->i_mtime.tv_sec, inode->i_mtime.tv_nsec,
@@ -490,7 +491,8 @@ void __show_inode_counts(const struct inode *inode,
 		printk("SiC: Null inode\n");
 		return;
 	}
-	for (bindex=sbstart(inode->i_sb); bindex <= sbend(inode->i_sb); bindex++) {
+	for (bindex=sbstart(inode->i_sb); bindex <= sbend(inode->i_sb);
+	     bindex++) {
 		lower_inode = unionfs_lower_inode_idx(inode, bindex);
 		if (!lower_inode)
 			continue;
diff --git a/fs/unionfs/dentry.c b/fs/unionfs/dentry.c
index 08b5722..d9bb199 100644
--- a/fs/unionfs/dentry.c
+++ b/fs/unionfs/dentry.c
@@ -26,7 +26,7 @@
  * Returns true if valid, false otherwise.
  */
 static bool __unionfs_d_revalidate_one(struct dentry *dentry,
-			struct nameidata *nd)
+				       struct nameidata *nd)
 {
 	bool valid = true;	/* default is valid */
 	struct dentry *lower_dentry;
diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c
index 9638b64..de78e26 100644
--- a/fs/unionfs/inode.c
+++ b/fs/unionfs/inode.c
@@ -99,7 +99,8 @@ static int unionfs_create(struct inode *parent, struct dentry *dentry,
 	 * if lower_dentry is NULL, create the entire
 	 * dentry directory structure in branch 0.
 	 */
-	lower_dentry = create_parents(parent, dentry, dentry->d_name.name, 0);
+	lower_dentry = create_parents(parent, dentry,
+				      dentry->d_name.name, 0);
 	if (IS_ERR(lower_dentry)) {
 		err = PTR_ERR(lower_dentry);
 		goto out;
@@ -447,9 +448,8 @@ static int unionfs_symlink(struct inode *dir, struct dentry *dentry,
 		if (!(err = is_robranch_super(dentry->d_sb, bindex))) {
 			mode = S_IALLUGO;
-			err =
-			    vfs_symlink(lower_dir_dentry->d_inode,
-					lower_dentry, symname, mode);
+			err = vfs_symlink(lower_dir_dentry->d_inode,
+					  lower_dentry, symname, mode);
 		}
 		unlock_dir(lower_dir_dentry);
@@ -884,9 +884,11 @@ static void unionfs_put_link(struct dentry *dentry, struct nameidata *nd,
  * readonly, to allow copyup to work.
  * (3) we do call security_inode_permission, and therefore security inside
  * SELinux, etc. are performed.
+ *
+ * @inode: the lower inode we're checking permission on
  */
-static int inode_permission(struct super_block *sb, struct inode *inode, int mask,
-			    struct nameidata *nd, int bindex)
+static int inode_permission(struct super_block *sb, struct inode *inode,
+			    int mask, struct nameidata *nd, int bindex)
 {
 	int retval, submask;
diff --git a/fs/unionfs/main.c b/fs/unionfs/main.c
index 4faae44..8595750 100644
--- a/fs/unionfs/main.c
+++ b/fs/unionfs/main.c
@@ -275,14 +275,14 @@ int __parse_branch_mode(const char *name)
  */
 int parse_branch_mode(const char *name)
 {
-	int perms =  __parse_branch_mode(name);
+	int perms = __parse_branch_mode(name);
 
 	if (perms == 0)
 		perms = MAY_READ | MAY_WRITE;
 	return perms;
 }
 
-/* 
+/*
  * parse the "dirs=" mount argument
  *
  * We don't need to lock the superblock private data's rwsem, as we get
diff --git a/fs/unionfs/union.h b/fs/unionfs/union.h
index 5e9843b..755bc25 100644
--- a/fs/unionfs/union.h
+++ b/fs/unionfs/union.h
@@ -549,7 +549,7 @@ extern void __show_inode_counts(const struct inode *inode,
 
 /* we leave useful hooks for these check functions throughout the code */
 #define unionfs_check_inode(i)	do { } while(0)
-#define unionfs_check_dentry(d)do { } while(0)
+#define unionfs_check_dentry(d)	do { } while(0)
 #define unionfs_check_file(f)	do { } while(0)
 #define show_branch_counts(sb)	do { }
[PATCH 23/25] Unionfs: add un/likely conditionals on unlink ops
Signed-off-by: Erez Zadok [EMAIL PROTECTED]
---
 fs/unionfs/unlink.c | 32 ++++++++++++++++----------------
 1 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/fs/unionfs/unlink.c b/fs/unionfs/unlink.c
index 3924f7f..33d08d9 100644
--- a/fs/unionfs/unlink.c
+++ b/fs/unionfs/unlink.c
@@ -26,13 +26,13 @@ static int unionfs_unlink_whiteout(struct inode *dir, struct dentry *dentry)
 	int bindex;
 	int err = 0;
 
-	if ((err = unionfs_partial_lookup(dentry)))
+	if (unlikely((err = unionfs_partial_lookup(dentry))))
 		goto out;
 
 	bindex = dbstart(dentry);
 
 	lower_dentry = unionfs_lower_dentry_idx(dentry, bindex);
-	if (!lower_dentry)
+	if (unlikely(!lower_dentry))
 		goto out;
 
 	lower_dir_dentry = lock_parent(lower_dentry);
@@ -42,13 +42,13 @@ static int unionfs_unlink_whiteout(struct inode *dir, struct dentry *dentry)
 	if (!(err = is_robranch_super(dentry->d_sb, bindex)))
 		err = vfs_unlink(lower_dir_dentry->d_inode, lower_dentry);
 	/* if vfs_unlink succeeded, update our inode's times */
-	if (!err)
+	if (likely(!err))
 		unionfs_copy_attr_times(dentry->d_inode);
 	dput(lower_dentry);
 	fsstack_copy_attr_times(dir, lower_dir_dentry->d_inode);
 	unlock_dir(lower_dir_dentry);
 
-	if (err && !IS_COPYUP_ERR(err))
+	if (unlikely(err && !IS_COPYUP_ERR(err)))
 		goto out;
 
 	if (err) {
@@ -62,11 +62,11 @@ static int unionfs_unlink_whiteout(struct inode *dir, struct dentry *dentry)
 		err = create_whiteout(dentry, dbstart(dentry));
 
 out:
-	if (!err)
+	if (likely(!err))
 		dentry->d_inode->i_nlink--;
 
 	/* We don't want to leave negative leftover dentries for revalidate. */
-	if (!err && (dbopaque(dentry) != -1))
+	if (likely(!err && (dbopaque(dentry) != -1)))
 		update_bstart(dentry);
 
 	return err;
@@ -79,7 +79,7 @@ int unionfs_unlink(struct inode *dir, struct dentry *dentry)
 	unionfs_read_lock(dentry->d_sb);
 	unionfs_lock_dentry(dentry);
 
-	if (!__unionfs_d_revalidate_chain(dentry, NULL, false)) {
+	if (unlikely(!__unionfs_d_revalidate_chain(dentry, NULL, false))) {
 		err = -ESTALE;
 		goto out;
 	}
@@ -87,7 +87,7 @@ int unionfs_unlink(struct inode *dir, struct dentry *dentry)
 	err = unionfs_unlink_whiteout(dir, dentry);
 	/* call d_drop so the system "forgets" about us */
-	if (!err) {
+	if (likely(!err)) {
 		if (!S_ISDIR(dentry->d_inode->i_mode))
 			unionfs_postcopyup_release(dentry);
 		d_drop(dentry);
@@ -99,7 +99,7 @@ int unionfs_unlink(struct inode *dir, struct dentry *dentry)
 	}
 
 out:
-	if (!err) {
+	if (likely(!err)) {
 		unionfs_check_dentry(dentry);
 		unionfs_check_inode(dir);
 	}
@@ -117,7 +117,7 @@ static int unionfs_rmdir_first(struct inode *dir, struct dentry *dentry,
 	/* Here we need to remove whiteout entries. */
 	err = delete_whiteouts(dentry, dbstart(dentry), namelist);
-	if (err)
+	if (unlikely(err))
 		goto out;
 
 	lower_dentry = unionfs_lower_dentry(dentry);
@@ -135,7 +135,7 @@ static int unionfs_rmdir_first(struct inode *dir, struct dentry *dentry,
 	dentry->d_inode->i_nlink = unionfs_get_nlinks(dentry->d_inode);
 
 out:
-	if (lower_dir_dentry)
+	if (likely(lower_dir_dentry))
 		unlock_dir(lower_dir_dentry);
 	return err;
 }
@@ -148,7 +148,7 @@ int unionfs_rmdir(struct inode *dir, struct dentry *dentry)
 	unionfs_read_lock(dentry->d_sb);
 	unionfs_lock_dentry(dentry);
 
-	if (!__unionfs_d_revalidate_chain(dentry, NULL, false)) {
+	if (unlikely(!__unionfs_d_revalidate_chain(dentry, NULL, false))) {
 		err = -ESTALE;
 		goto out;
 	}
@@ -156,7 +156,7 @@ int unionfs_rmdir(struct inode *dir, struct dentry *dentry)
 	/* check if this unionfs directory is empty or not */
 	err = check_empty(dentry, &namelist);
-	if (err)
+	if (unlikely(err))
 		goto out;
 
 	err = unionfs_rmdir_first(dir, dentry, namelist);
@@ -170,7 +170,7 @@ int unionfs_rmdir(struct inode *dir, struct dentry *dentry)
 		goto out;
 
 	/* exit if the error returned was NOT -EROFS */
-	if (!IS_COPYUP_ERR(err))
+	if (unlikely(!IS_COPYUP_ERR(err)))
 		goto out;
 
 	new_err = create_whiteout(dentry, dbstart(dentry) - 1);
@@ -180,10 +180,10 @@ int unionfs_rmdir(struct inode *dir, struct dentry *dentry)
 out:
 	/* call d_drop so the system "forgets" about us */
-	if (!err)
+	if (likely(!err))
 		d_drop(dentry);
 
-	if (namelist)
+	if (likely(namelist))
[PATCH 15/25] Unionfs: add un/likely conditionals on fileops
Signed-off-by: Erez Zadok [EMAIL PROTECTED]
---
 fs/unionfs/file.c | 38 +++++++++++++++++++-------------------
 1 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/fs/unionfs/file.c b/fs/unionfs/file.c
index d8eaaa5..06ca1fa 100644
--- a/fs/unionfs/file.c
+++ b/fs/unionfs/file.c
@@ -24,13 +24,13 @@ static ssize_t unionfs_read(struct file *file, char __user *buf,
 	int err;
 
 	unionfs_read_lock(file->f_path.dentry->d_sb);
-	if ((err = unionfs_file_revalidate(file, false)))
+	if (unlikely((err = unionfs_file_revalidate(file, false))))
 		goto out;
 	unionfs_check_file(file);
 
 	err = do_sync_read(file, buf, count, ppos);
 
-	if (err >= 0)
+	if (likely(err >= 0))
 		touch_atime(unionfs_lower_mnt(file->f_path.dentry),
 			    unionfs_lower_dentry(file->f_path.dentry));
@@ -47,16 +47,16 @@ static ssize_t unionfs_aio_read(struct kiocb *iocb, const struct iovec *iov,
 	struct file *file = iocb->ki_filp;
 
 	unionfs_read_lock(file->f_path.dentry->d_sb);
-	if ((err = unionfs_file_revalidate(file, false)))
+	if (unlikely((err = unionfs_file_revalidate(file, false))))
 		goto out;
 	unionfs_check_file(file);
 
 	err = generic_file_aio_read(iocb, iov, nr_segs, pos);
 
-	if (err == -EIOCBQUEUED)
+	if (unlikely(err == -EIOCBQUEUED))
 		err = wait_on_sync_kiocb(iocb);
 
-	if (err >= 0)
+	if (likely(err >= 0))
 		touch_atime(unionfs_lower_mnt(file->f_path.dentry),
 			    unionfs_lower_dentry(file->f_path.dentry));
@@ -72,13 +72,13 @@ static ssize_t unionfs_write(struct file *file, const char __user *buf,
 	int err = 0;
 
 	unionfs_read_lock(file->f_path.dentry->d_sb);
-	if ((err = unionfs_file_revalidate(file, true)))
+	if (unlikely((err = unionfs_file_revalidate(file, true))))
 		goto out;
 	unionfs_check_file(file);
 
 	err = do_sync_write(file, buf, count, ppos);
 	/* update our inode times upon a successful lower write */
-	if (err >= 0) {
+	if (likely(err >= 0)) {
 		unionfs_copy_attr_times(file->f_path.dentry->d_inode);
 		unionfs_check_file(file);
 	}
@@ -104,7 +104,7 @@ static int unionfs_mmap(struct file *file, struct vm_area_struct *vma)
 	/* This might be deferred to mmap's writepage */
 	willwrite = ((vma->vm_flags | VM_SHARED | VM_WRITE) == vma->vm_flags);
-	if ((err = unionfs_file_revalidate(file, willwrite)))
+	if (unlikely((err = unionfs_file_revalidate(file, willwrite))))
 		goto out;
 	unionfs_check_file(file);
@@ -119,19 +119,19 @@ static int unionfs_mmap(struct file *file, struct vm_area_struct *vma)
 	 * generic_file_readonly_mmap returns in that case).
 	 */
 	lower_file = unionfs_lower_file(file);
-	if (willwrite && !lower_file->f_mapping->a_ops->writepage) {
+	if (unlikely(willwrite && !lower_file->f_mapping->a_ops->writepage)) {
 		err = -EINVAL;
 		printk("unionfs: branch %d file system does not support "
 		       "writeable mmap\n", fbstart(file));
 	} else {
 		err = generic_file_mmap(file, vma);
-		if (err)
+		if (unlikely(err))
 			printk("unionfs: generic_file_mmap failed %d\n", err);
 	}
 
 out:
 	unionfs_read_unlock(file->f_path.dentry->d_sb);
-	if (!err) {
+	if (likely(!err)) {
 		/* copyup could cause parent dir times to change */
 		unionfs_copy_attr_times(file->f_path.dentry->d_parent->d_inode);
 		unionfs_check_file(file);
 	}
@@ -149,7 +149,7 @@ int unionfs_fsync(struct file *file, struct dentry *dentry, int datasync)
 	int err = -EINVAL;
 
 	unionfs_read_lock(file->f_path.dentry->d_sb);
-	if ((err = unionfs_file_revalidate(file, true)))
+	if (unlikely((err = unionfs_file_revalidate(file, true))))
 		goto out;
 	unionfs_check_file(file);
@@ -159,14 +159,14 @@ int unionfs_fsync(struct file *file, struct dentry *dentry, int datasync)
 		goto out;
 
 	inode = dentry->d_inode;
-	if (!inode) {
+	if (unlikely(!inode)) {
 		printk(KERN_ERR
 		       "unionfs: null lower inode in unionfs_fsync\n");
 		goto out;
 	}
 	for (bindex = bstart; bindex <= bend; bindex++) {
 		lower_inode = unionfs_lower_inode_idx(inode, bindex);
-		if (!lower_inode || !lower_inode->i_fop->fsync)
+		if (unlikely(!lower_inode || !lower_inode->i_fop->fsync))
 			continue;
 		lower_file = unionfs_lower_file_idx(file, bindex);
 		lower_dentry = unionfs_lower_dentry_idx(dentry, bindex);
@@ -175,7 +175,7 @@ int unionfs_fsync(struct file *file, struct dentry *dentry, int datasync)
[PATCH 17/25] Unionfs: add un/likely conditionals on lookup ops
Signed-off-by: Erez Zadok [EMAIL PROTECTED] --- fs/unionfs/lookup.c | 44 ++-- 1 files changed, 22 insertions(+), 22 deletions(-) diff --git a/fs/unionfs/lookup.c b/fs/unionfs/lookup.c index 2109714..92b5e0a 100644 --- a/fs/unionfs/lookup.c +++ b/fs/unionfs/lookup.c @@ -59,7 +59,7 @@ static noinline int is_opaque_dir(struct dentry *dentry, int bindex) mutex_unlock(lower_inode-i_mutex); - if (IS_ERR(wh_lower_dentry)) { + if (unlikely(IS_ERR(wh_lower_dentry))) { err = PTR_ERR(wh_lower_dentry); goto out; } @@ -119,12 +119,12 @@ struct dentry *unionfs_lookup_backend(struct dentry *dentry, case INTERPOSE_PARTIAL: break; case INTERPOSE_LOOKUP: - if ((err = new_dentry_private_data(dentry))) + if (unlikely((err = new_dentry_private_data(dentry goto out; break; default: /* default: can only be INTERPOSE_REVAL/REVAL_NEG */ - if ((err = realloc_dentry_private_data(dentry))) + if (unlikely((err = realloc_dentry_private_data(dentry goto out; break; } @@ -147,7 +147,7 @@ struct dentry *unionfs_lookup_backend(struct dentry *dentry, namelen = dentry-d_name.len; /* No dentries should get created for possible whiteout names. */ - if (!is_validname(name)) { + if (unlikely(!is_validname(name))) { err = -EPERM; goto out_free; } @@ -179,7 +179,7 @@ struct dentry *unionfs_lookup_backend(struct dentry *dentry, unionfs_lower_dentry_idx(parent_dentry, bindex); /* if the parent lower dentry does not exist skip this */ - if (!(lower_dir_dentry lower_dir_dentry-d_inode)) + if (unlikely(!(lower_dir_dentry lower_dir_dentry-d_inode))) continue; /* also skip it if the parent isn't a directory. */ @@ -189,7 +189,7 @@ struct dentry *unionfs_lookup_backend(struct dentry *dentry, /* Reuse the whiteout name because its value doesn't change. 
*/ if (!whname) { whname = alloc_whname(name, namelen); - if (IS_ERR(whname)) { + if (unlikely(IS_ERR(whname))) { err = PTR_ERR(whname); goto out_free; } @@ -198,7 +198,7 @@ struct dentry *unionfs_lookup_backend(struct dentry *dentry, /* check if whiteout exists in this branch: lookup .wh.foo */ wh_lower_dentry = lookup_one_len(whname, lower_dir_dentry, namelen + UNIONFS_WHLEN); - if (IS_ERR(wh_lower_dentry)) { + if (unlikely(IS_ERR(wh_lower_dentry))) { dput(first_lower_dentry); unionfs_mntput(first_dentry, first_dentry_offset); err = PTR_ERR(wh_lower_dentry); @@ -207,7 +207,7 @@ struct dentry *unionfs_lookup_backend(struct dentry *dentry, if (wh_lower_dentry-d_inode) { /* We found a whiteout so lets give up. */ - if (S_ISREG(wh_lower_dentry-d_inode-i_mode)) { + if (likely(S_ISREG(wh_lower_dentry-d_inode-i_mode))) { set_dbend(dentry, bindex); set_dbopaque(dentry, bindex); dput(wh_lower_dentry); @@ -228,7 +228,7 @@ struct dentry *unionfs_lookup_backend(struct dentry *dentry, /* Now do regular lookup; lookup foo */ lower_dentry = lookup_one_len(name, lower_dir_dentry, namelen); - if (IS_ERR(lower_dentry)) { + if (unlikely(IS_ERR(lower_dentry))) { dput(first_lower_dentry); unionfs_mntput(first_dentry, first_dentry_offset); err = PTR_ERR(lower_dentry); @@ -321,7 +321,7 @@ out_negative: first_lower_dentry = lookup_one_len(name, lower_dir_dentry, namelen); first_dentry_offset = bindex; - if (IS_ERR(first_lower_dentry)) { + if (unlikely(IS_ERR(first_lower_dentry))) { err = PTR_ERR(first_lower_dentry); goto out; } @@ -381,12 +381,12 @@ out_positive: * dentry. */ d_interposed = unionfs_interpose(dentry, dentry-d_sb, lookupmode); - if (IS_ERR(d_interposed)) + if (unlikely(IS_ERR(d_interposed))) err = PTR_ERR(d_interposed); else if (d_interposed) dentry = d_interposed; - if (err) + if
[PATCH 13/25] Unionfs: add un/likely conditionals on dir ops
Signed-off-by: Erez Zadok [EMAIL PROTECTED] --- fs/unionfs/dirfops.c | 22 +++--- fs/unionfs/dirhelper.c | 30 +++--- 2 files changed, 26 insertions(+), 26 deletions(-) diff --git a/fs/unionfs/dirfops.c b/fs/unionfs/dirfops.c index c923e58..fa2df88 100644 --- a/fs/unionfs/dirfops.c +++ b/fs/unionfs/dirfops.c @@ -63,7 +63,7 @@ static int unionfs_filldir(void *dirent, const char *name, int namelen, off_t pos = rdstate2offset(buf-rdstate); u64 unionfs_ino = ino; - if (!err) { + if (likely(!err)) { err = buf-filldir(buf-dirent, name, namelen, pos, unionfs_ino, d_type); buf-rdstate-offset++; @@ -74,7 +74,7 @@ static int unionfs_filldir(void *dirent, const char *name, int namelen, * If we did fill it, stuff it in our hash, otherwise return an * error. */ - if (err) { + if (unlikely(err)) { buf-filldir_error = err; goto out; } @@ -99,7 +99,7 @@ static int unionfs_readdir(struct file *file, void *dirent, filldir_t filldir) unionfs_read_lock(file-f_path.dentry-d_sb); - if ((err = unionfs_file_revalidate(file, false))) + if (unlikely((err = unionfs_file_revalidate(file, false goto out; inode = file-f_path.dentry-d_inode; @@ -110,7 +110,7 @@ static int unionfs_readdir(struct file *file, void *dirent, filldir_t filldir) goto out; } else if (file-f_pos 0) { uds = find_rdstate(inode, file-f_pos); - if (!uds) { + if (unlikely(!uds)) { err = -ESTALE; goto out; } @@ -124,7 +124,7 @@ static int unionfs_readdir(struct file *file, void *dirent, filldir_t filldir) while (uds-bindex = bend) { lower_file = unionfs_lower_file_idx(file, uds-bindex); - if (!lower_file) { + if (unlikely(!lower_file)) { uds-bindex++; uds-dirpos = 0; continue; @@ -141,7 +141,7 @@ static int unionfs_readdir(struct file *file, void *dirent, filldir_t filldir) /* Read starting from where we last left off. 
*/ offset = vfs_llseek(lower_file, uds-dirpos, SEEK_SET); - if (offset 0) { + if (unlikely(offset 0)) { err = offset; goto out; } @@ -149,7 +149,7 @@ static int unionfs_readdir(struct file *file, void *dirent, filldir_t filldir) /* Save the position for when we continue. */ offset = vfs_llseek(lower_file, 0, SEEK_CUR); - if (offset 0) { + if (unlikely(offset 0)) { err = offset; goto out; } @@ -158,10 +158,10 @@ static int unionfs_readdir(struct file *file, void *dirent, filldir_t filldir) /* Copy the atime. */ fsstack_copy_attr_atime(inode, lower_file-f_path.dentry-d_inode); - if (err 0) + if (unlikely(err 0)) goto out; - if (buf.filldir_error) + if (unlikely(buf.filldir_error)) break; if (!buf.entries_written) { @@ -201,7 +201,7 @@ static loff_t unionfs_dir_llseek(struct file *file, loff_t offset, int origin) unionfs_read_lock(file-f_path.dentry-d_sb); - if ((err = unionfs_file_revalidate(file, false))) + if (unlikely((err = unionfs_file_revalidate(file, false goto out; rdstate = UNIONFS_F(file)-rdstate; @@ -241,7 +241,7 @@ static loff_t unionfs_dir_llseek(struct file *file, loff_t offset, int origin) } else { rdstate = find_rdstate(file-f_path.dentry-d_inode, offset); - if (rdstate) { + if (likely(rdstate)) { UNIONFS_F(file)-rdstate = rdstate; err = rdstate-offset; } else diff --git a/fs/unionfs/dirhelper.c b/fs/unionfs/dirhelper.c index a72f711..d481ba4 100644 --- a/fs/unionfs/dirhelper.c +++ b/fs/unionfs/dirhelper.c @@ -43,7 +43,7 @@ int do_delete_whiteouts(struct dentry *dentry, int bindex, err = -ENOMEM; name = __getname(); - if (!name) + if (unlikely(!name)) goto out; strcpy(name, UNIONFS_WHPFX); p = name + UNIONFS_WHLEN; @@ -65,14 +65,14 @@ int do_delete_whiteouts(struct dentry *dentry,
[PATCH 12/25] Unionfs: add un/likely conditionals on dentry ops
Signed-off-by: Erez Zadok [EMAIL PROTECTED] --- fs/unionfs/dentry.c | 68 ++ 1 files changed, 35 insertions(+), 33 deletions(-) diff --git a/fs/unionfs/dentry.c b/fs/unionfs/dentry.c index 52bcb18..3f3a18d 100644 --- a/fs/unionfs/dentry.c +++ b/fs/unionfs/dentry.c @@ -45,7 +45,7 @@ static bool __unionfs_d_revalidate_one(struct dentry *dentry, verify_locked(dentry); /* if the dentry is unhashed, do NOT revalidate */ - if (d_deleted(dentry)) { + if (unlikely(d_deleted(dentry))) { dprintk(KERN_DEBUG unionfs: unhashed dentry being revalidated: %*s\n, dentry-d_name.len, dentry-d_name.name); @@ -53,7 +53,7 @@ static bool __unionfs_d_revalidate_one(struct dentry *dentry, } BUG_ON(dbstart(dentry) == -1); - if (dentry-d_inode) + if (likely(dentry-d_inode)) positive = 1; dgen = atomic_read(UNIONFS_D(dentry)-generation); sbgen = atomic_read(UNIONFS_SB(dentry-d_sb)-generation); @@ -62,7 +62,7 @@ static bool __unionfs_d_revalidate_one(struct dentry *dentry, * revalidation to be done, because this file does not exist within * the namespace, and Unionfs operates on the namespace, not data. */ - if (sbgen != dgen) { + if (unlikely(sbgen != dgen)) { struct dentry *result; int pdgen; @@ -76,7 +76,7 @@ static bool __unionfs_d_revalidate_one(struct dentry *dentry, /* Free the pointers for our inodes and this dentry. */ bstart = dbstart(dentry); bend = dbend(dentry); - if (bstart = 0) { + if (likely(bstart = 0)) { struct dentry *lower_dentry; for (bindex = bstart; bindex = bend; bindex++) { lower_dentry = @@ -89,7 +89,7 @@ static bool __unionfs_d_revalidate_one(struct dentry *dentry, set_dbend(dentry, -1); interpose_flag = INTERPOSE_REVAL_NEG; - if (positive) { + if (likely(positive)) { interpose_flag = INTERPOSE_REVAL; /* * During BRM, the VFS could already hold a lock on @@ -97,14 +97,14 @@ static bool __unionfs_d_revalidate_one(struct dentry *dentry, * (deadlock), but if you lock it in this function, * then release it here too. 
*/ - if (!mutex_is_locked(dentry-d_inode-i_mutex)) { + if (unlikely(!mutex_is_locked(dentry-d_inode-i_mutex))) { mutex_lock(dentry-d_inode-i_mutex); locked = 1; } bstart = ibstart(dentry-d_inode); bend = ibend(dentry-d_inode); - if (bstart = 0) { + if (likely(bstart = 0)) { struct inode *lower_inode; for (bindex = bstart; bindex = bend; bindex++) { @@ -119,14 +119,14 @@ static bool __unionfs_d_revalidate_one(struct dentry *dentry, UNIONFS_I(dentry-d_inode)-lower_inodes = NULL; ibstart(dentry-d_inode) = -1; ibend(dentry-d_inode) = -1; - if (locked) + if (unlikely(locked)) mutex_unlock(dentry-d_inode-i_mutex); } result = unionfs_lookup_backend(dentry, lowernd, interpose_flag); - if (result) { - if (IS_ERR(result)) { + if (likely(result)) { + if (unlikely(IS_ERR(result))) { valid = false; goto out; } @@ -138,7 +138,7 @@ static bool __unionfs_d_revalidate_one(struct dentry *dentry, dentry = result; } - if (positive UNIONFS_I(dentry-d_inode)-stale) { + if (unlikely(positive UNIONFS_I(dentry-d_inode)-stale)) { make_bad_inode(dentry-d_inode); d_drop(dentry); valid = false; @@ -153,8 +153,8 @@ static bool __unionfs_d_revalidate_one(struct dentry *dentry, BUG_ON(bstart == -1); for (bindex = bstart; bindex = bend; bindex++) { lower_dentry = unionfs_lower_dentry_idx(dentry, bindex); - if (!lower_dentry || !lower_dentry-d_op - || !lower_dentry-d_op-d_revalidate) + if (unlikely(!lower_dentry || !lower_dentry-d_op +||
[PATCH 20/25] Unionfs: add un/likely conditionals on rename ops
Signed-off-by: Erez Zadok [EMAIL PROTECTED] --- fs/unionfs/rename.c | 96 +- 1 files changed, 48 insertions(+), 48 deletions(-) diff --git a/fs/unionfs/rename.c b/fs/unionfs/rename.c index 7b8fe39..92c4515 100644 --- a/fs/unionfs/rename.c +++ b/fs/unionfs/rename.c @@ -39,7 +39,7 @@ static int __unionfs_rename(struct inode *old_dir, struct dentry *old_dentry, create_parents(new_dentry-d_parent-d_inode, new_dentry, new_dentry-d_name.name, bindex); - if (IS_ERR(lower_new_dentry)) { + if (unlikely(IS_ERR(lower_new_dentry))) { printk(KERN_DEBUG unionfs: error creating directory tree for rename, bindex = %d, err = %ld\n, bindex, PTR_ERR(lower_new_dentry)); @@ -50,7 +50,7 @@ static int __unionfs_rename(struct inode *old_dir, struct dentry *old_dentry, wh_name = alloc_whname(new_dentry-d_name.name, new_dentry-d_name.len); - if (IS_ERR(wh_name)) { + if (unlikely(IS_ERR(wh_name))) { err = PTR_ERR(wh_name); goto out; } @@ -58,14 +58,14 @@ static int __unionfs_rename(struct inode *old_dir, struct dentry *old_dentry, lower_wh_dentry = lookup_one_len(wh_name, lower_new_dentry-d_parent, new_dentry-d_name.len + UNIONFS_WHLEN); - if (IS_ERR(lower_wh_dentry)) { + if (unlikely(IS_ERR(lower_wh_dentry))) { err = PTR_ERR(lower_wh_dentry); goto out; } if (lower_wh_dentry-d_inode) { /* get rid of the whiteout that is existing */ - if (lower_new_dentry-d_inode) { + if (unlikely(lower_new_dentry-d_inode)) { printk(KERN_WARNING unionfs: both a whiteout and a dentry exist when doing a rename!\n); err = -EIO; @@ -81,7 +81,7 @@ static int __unionfs_rename(struct inode *old_dir, struct dentry *old_dentry, dput(lower_wh_dentry); unlock_dir(lower_wh_dir_dentry); - if (err) + if (unlikely(err)) goto out; } else dput(lower_wh_dentry); @@ -93,7 +93,7 @@ static int __unionfs_rename(struct inode *old_dir, struct dentry *old_dentry, lock_rename(lower_old_dir_dentry, lower_new_dir_dentry); err = is_robranch_super(old_dentry-d_sb, bindex); - if (err) + if (unlikely(err)) goto out_unlock; /* @@ -105,14 
+105,14 @@ static int __unionfs_rename(struct inode *old_dir, struct dentry *old_dentry, whname = alloc_whname(old_dentry-d_name.name, old_dentry-d_name.len); err = PTR_ERR(whname); - if (IS_ERR(whname)) + if (unlikely(IS_ERR(whname))) goto out_unlock; *wh_old = lookup_one_len(whname, lower_old_dir_dentry, old_dentry-d_name.len + UNIONFS_WHLEN); kfree(whname); err = PTR_ERR(*wh_old); - if (IS_ERR(*wh_old)) { + if (unlikely(IS_ERR(*wh_old))) { *wh_old = NULL; goto out_unlock; } @@ -129,7 +129,7 @@ out_unlock: dput(lower_old_dentry); out: - if (!err) { + if (likely(!err)) { /* Fixup the new_dentry. */ if (bindex dbstart(new_dentry)) set_dbstart(new_dentry, bindex); @@ -174,8 +174,8 @@ static int do_unionfs_rename(struct inode *old_dir, /* Rename source to destination. */ err = __unionfs_rename(old_dir, old_dentry, new_dir, new_dentry, old_bstart, wh_old); - if (err) { - if (!IS_COPYUP_ERR(err)) + if (unlikely(err)) { + if (unlikely(!IS_COPYUP_ERR(err))) goto out; do_copyup = old_bstart - 1; } else @@ -190,7 +190,7 @@ static int do_unionfs_rename(struct inode *old_dir, struct dentry *unlink_dir_dentry; unlink_dentry = unionfs_lower_dentry_idx(new_dentry, bindex); - if (!unlink_dentry) + if (unlikely(!unlink_dentry)) continue; unlink_dir_dentry = lock_parent(unlink_dentry); @@ -205,15 +205,15 @@ static int do_unionfs_rename(struct inode *old_dir, unionfs_get_nlinks(new_dentry-d_parent-d_inode); unlock_dir(unlink_dir_dentry); - if (!err) { +
[PATCH 09/25] Unionfs: add un/likely conditionals on common fileops
Signed-off-by: Erez Zadok [EMAIL PROTECTED] --- fs/unionfs/commonfops.c | 94 +++--- 1 files changed, 47 insertions(+), 47 deletions(-) diff --git a/fs/unionfs/commonfops.c b/fs/unionfs/commonfops.c index e69ccf6..db8f064 100644 --- a/fs/unionfs/commonfops.c +++ b/fs/unionfs/commonfops.c @@ -64,7 +64,7 @@ retry: tmp_dentry = lookup_one_len(name, lower_dentry-d_parent, nlen); - if (IS_ERR(tmp_dentry)) { + if (unlikely(IS_ERR(tmp_dentry))) { err = PTR_ERR(tmp_dentry); goto out; } @@ -73,8 +73,8 @@ retry: err = copyup_named_file(dentry-d_parent-d_inode, file, name, bstart, bindex, file-f_path.dentry-d_inode-i_size); - if (err) { - if (err == -EEXIST) + if (unlikely(err)) { + if (unlikely(err == -EEXIST)) goto retry; goto out; } @@ -91,7 +91,7 @@ retry: unlock_dir(lower_dir_dentry); out: - if (!err) + if (likely(!err)) unionfs_check_dentry(dentry); return err; } @@ -126,7 +126,7 @@ static void cleanup_file(struct file *file) */ old_bid = UNIONFS_F(file)-saved_branch_ids[bindex]; i = branch_id_to_idx(sb, old_bid); - if (i 0) { + if (unlikely(i 0)) { printk(KERN_ERR unionfs: no superblock for file %p\n, file); continue; @@ -179,7 +179,7 @@ static int open_all_files(struct file *file) dentry_open(lower_dentry, unionfs_lower_mnt_idx(dentry, bindex), file-f_flags); - if (IS_ERR(lower_file)) { + if (unlikely(IS_ERR(lower_file))) { err = PTR_ERR(lower_file); goto out; } else @@ -208,7 +208,7 @@ static int open_highest_file(struct file *file, bool willwrite) for (bindex = bstart - 1; bindex = 0; bindex--) { err = copyup_file(parent_inode, file, bstart, bindex, inode_size); - if (!err) + if (likely(!err)) break; } atomic_set(UNIONFS_F(file)-generation, @@ -222,7 +222,7 @@ static int open_highest_file(struct file *file, bool willwrite) lower_file = dentry_open(lower_dentry, unionfs_lower_mnt_idx(dentry, bstart), file-f_flags); - if (IS_ERR(lower_file)) { + if (unlikely(IS_ERR(lower_file))) { err = PTR_ERR(lower_file); goto out; } @@ -252,17 +252,17 @@ static int 
do_delayed_copyup(struct file *file) unionfs_check_file(file); unionfs_check_dentry(dentry); for (bindex = bstart - 1; bindex = 0; bindex--) { - if (!d_deleted(dentry)) + if (likely(!d_deleted(dentry))) err = copyup_file(parent_inode, file, bstart, bindex, inode_size); else err = copyup_deleted_file(file, dentry, bstart, bindex); - if (!err) + if (likely(!err)) break; } - if (err || (bstart = fbstart(file))) + if (unlikely(err || (bstart = fbstart(file goto out; bend = fbend(file); for (bindex = bstart; bindex = bend; bindex++) { @@ -317,8 +317,8 @@ int unionfs_file_revalidate(struct file *file, bool willwrite) * First revalidate the dentry inside struct file, * but not unhashed dentries. */ - if (!d_deleted(dentry) - !__unionfs_d_revalidate_chain(dentry, NULL, willwrite)) { + if (unlikely(!d_deleted(dentry) +!__unionfs_d_revalidate_chain(dentry, NULL, willwrite))) { err = -ESTALE; goto out_nofree; } @@ -335,8 +335,8 @@ int unionfs_file_revalidate(struct file *file, bool willwrite) * someone has copied up this file from underneath us, we also need * to refresh things. */ - if (!d_deleted(dentry) - (sbgen fgen || dbstart(dentry) != fbstart(file))) { + if (unlikely(!d_deleted(dentry) +(sbgen fgen || dbstart(dentry) != fbstart(file { /* save orig branch ID */ int orig_brid = UNIONFS_F(file)-saved_branch_ids[fbstart(file)]; @@ -349,13 +349,13 @@ int unionfs_file_revalidate(struct file *file, bool willwrite)
[PATCH 10/25] Unionfs: add un/likely conditionals on copyup ops
Signed-off-by: Erez Zadok [EMAIL PROTECTED] --- fs/unionfs/copyup.c | 102 +- 1 files changed, 51 insertions(+), 51 deletions(-) diff --git a/fs/unionfs/copyup.c b/fs/unionfs/copyup.c index 23ac4c8..e3c5f15 100644 --- a/fs/unionfs/copyup.c +++ b/fs/unionfs/copyup.c @@ -36,14 +36,14 @@ static int copyup_xattrs(struct dentry *old_lower_dentry, /* query the actual size of the xattr list */ list_size = vfs_listxattr(old_lower_dentry, NULL, 0); - if (list_size = 0) { + if (unlikely(list_size = 0)) { err = list_size; goto out; } /* allocate space for the actual list */ name_list = unionfs_xattr_alloc(list_size + 1, XATTR_LIST_MAX); - if (!name_list || IS_ERR(name_list)) { + if (unlikely(!name_list || IS_ERR(name_list))) { err = PTR_ERR(name_list); goto out; } @@ -52,14 +52,14 @@ static int copyup_xattrs(struct dentry *old_lower_dentry, /* now get the actual xattr list of the source file */ list_size = vfs_listxattr(old_lower_dentry, name_list, list_size); - if (list_size = 0) { + if (unlikely(list_size = 0)) { err = list_size; goto out; } /* allocate space to hold each xattr's value */ attr_value = unionfs_xattr_alloc(XATTR_SIZE_MAX, XATTR_SIZE_MAX); - if (!attr_value || IS_ERR(attr_value)) { + if (unlikely(!attr_value || IS_ERR(attr_value))) { err = PTR_ERR(name_list); goto out; } @@ -73,11 +73,11 @@ static int copyup_xattrs(struct dentry *old_lower_dentry, size = vfs_getxattr(old_lower_dentry, name_list, attr_value, XATTR_SIZE_MAX); mutex_unlock(old_lower_dentry-d_inode-i_mutex); - if (size 0) { + if (unlikely(size 0)) { err = size; goto out; } - if (size XATTR_SIZE_MAX) { + if (unlikely(size XATTR_SIZE_MAX)) { err = -E2BIG; goto out; } @@ -91,13 +91,13 @@ static int copyup_xattrs(struct dentry *old_lower_dentry, * temporarily get FOWNER privileges. * XXX: move entire copyup code to SIOQ. 
*/ - if (err == -EPERM !capable(CAP_FOWNER)) { + if (unlikely(err == -EPERM !capable(CAP_FOWNER))) { cap_raise(current-cap_effective, CAP_FOWNER); err = vfs_setxattr(new_lower_dentry, name_list, attr_value, size, 0); cap_lower(current-cap_effective, CAP_FOWNER); } - if (err 0) + if (unlikely(err 0)) goto out; name_list += strlen(name_list) + 1; } @@ -105,7 +105,7 @@ out: unionfs_xattr_kfree(name_list_buf); unionfs_xattr_kfree(attr_value); /* Ignore if xattr isn't supported */ - if (err == -ENOTSUPP || err == -EOPNOTSUPP) + if (unlikely(err == -ENOTSUPP || err == -EOPNOTSUPP)) err = 0; return err; } @@ -136,15 +136,15 @@ static int copyup_permissions(struct super_block *sb, ATTR_ATIME_SET | ATTR_MTIME_SET | ATTR_FORCE | ATTR_GID | ATTR_UID; err = notify_change(new_lower_dentry, newattrs); - if (err) + if (unlikely(err)) goto out; /* now try to change the mode and ignore EOPNOTSUPP on symlinks */ newattrs.ia_mode = i-i_mode; newattrs.ia_valid = ATTR_MODE | ATTR_FORCE; err = notify_change(new_lower_dentry, newattrs); - if (err == -EOPNOTSUPP - S_ISLNK(new_lower_dentry-d_inode-i_mode)) { + if (unlikely(err == -EOPNOTSUPP +S_ISLNK(new_lower_dentry-d_inode-i_mode))) { printk(KERN_WARNING unionfs: changing \%s\ symlink mode unsupported\n, new_lower_dentry-d_name.name); @@ -178,7 +178,7 @@ static int __copyup_ndentry(struct dentry *old_lower_dentry, run_sioq(__unionfs_mkdir, args); err = args.err; - } else if (S_ISLNK(old_mode)) { + } else if (unlikely(S_ISLNK(old_mode))) { args.symlink.parent = new_lower_parent_dentry-d_inode; args.symlink.dentry = new_lower_dentry; args.symlink.symbuf = symbuf; @@ -186,8 +186,8 @@ static int __copyup_ndentry(struct dentry *old_lower_dentry, run_sioq(__unionfs_symlink, args); err = args.err; - } else if (S_ISBLK(old_mode) || S_ISCHR(old_mode) || - S_ISFIFO(old_mode) || S_ISSOCK(old_mode)) { + } else if
[PATCH 16/25] Unionfs: add un/likely conditionals on inode ops
Signed-off-by: Erez Zadok [EMAIL PROTECTED] --- fs/unionfs/inode.c | 160 ++-- 1 files changed, 80 insertions(+), 80 deletions(-) diff --git a/fs/unionfs/inode.c b/fs/unionfs/inode.c index 7ee4760..7ae4a25 100644 --- a/fs/unionfs/inode.c +++ b/fs/unionfs/inode.c @@ -35,7 +35,7 @@ static int unionfs_create(struct inode *parent, struct dentry *dentry, unionfs_lock_dentry(dentry-d_parent); valid = __unionfs_d_revalidate_chain(dentry-d_parent, nd, false); unionfs_unlock_dentry(dentry-d_parent); - if (!valid) { + if (unlikely(!valid)) { err = -ESTALE; /* same as what real_lookup does */ goto out; } @@ -60,26 +60,26 @@ static int unionfs_create(struct inode *parent, struct dentry *dentry, * We _always_ create on branch 0 */ lower_dentry = unionfs_lower_dentry_idx(dentry, 0); - if (lower_dentry) { + if (likely(lower_dentry)) { /* * check if whiteout exists in this branch, i.e. lookup .wh.foo * first. */ name = alloc_whname(dentry-d_name.name, dentry-d_name.len); - if (IS_ERR(name)) { + if (unlikely(IS_ERR(name))) { err = PTR_ERR(name); goto out; } wh_dentry = lookup_one_len(name, lower_dentry-d_parent, dentry-d_name.len + UNIONFS_WHLEN); - if (IS_ERR(wh_dentry)) { + if (unlikely(IS_ERR(wh_dentry))) { err = PTR_ERR(wh_dentry); wh_dentry = NULL; goto out; } - if (wh_dentry-d_inode) { + if (unlikely(wh_dentry-d_inode)) { /* * .wh.foo has been found, so let's unlink it */ @@ -89,7 +89,7 @@ static int unionfs_create(struct inode *parent, struct dentry *dentry, err = vfs_unlink(lower_dir_dentry-d_inode, wh_dentry); unlock_dir(lower_dir_dentry); - if (err) { + if (unlikely(err)) { printk(unionfs_create: could not unlink whiteout, err = %d\n, err); goto out; @@ -102,28 +102,28 @@ static int unionfs_create(struct inode *parent, struct dentry *dentry, */ lower_dentry = create_parents(parent, dentry, dentry-d_name.name, 0); - if (IS_ERR(lower_dentry)) { + if (unlikely(IS_ERR(lower_dentry))) { err = PTR_ERR(lower_dentry); goto out; } } lower_parent_dentry = lock_parent(lower_dentry); 
- if (IS_ERR(lower_parent_dentry)) { + if (unlikely(IS_ERR(lower_parent_dentry))) { err = PTR_ERR(lower_parent_dentry); goto out; } err = init_lower_nd(lower_nd, LOOKUP_CREATE); - if (err 0) + if (unlikely(err 0)) goto out; err = vfs_create(lower_parent_dentry-d_inode, lower_dentry, mode, lower_nd); release_lower_nd(lower_nd, err); - if (!err) { + if (likely(!err)) { err = PTR_ERR(unionfs_interpose(dentry, parent-i_sb, 0)); - if (!err) { + if (likely(!err)) { unionfs_copy_attr_times(parent); fsstack_copy_inode_size(parent, lower_parent_dentry-d_inode); @@ -138,13 +138,13 @@ out: dput(wh_dentry); kfree(name); - if (!err) + if (likely(!err)) unionfs_postcopyup_setmnt(dentry); unionfs_unlock_dentry(dentry); unionfs_read_unlock(dentry-d_sb); unionfs_check_inode(parent); - if (!err) { + if (likely(!err)) { unionfs_check_dentry(dentry-d_parent); unionfs_check_nd(nd); } @@ -183,7 +183,7 @@ static struct dentry *unionfs_lookup(struct inode *parent, nd-dentry = path_save.dentry; nd-mnt = path_save.mnt; } - if (!IS_ERR(ret)) { + if (likely(!IS_ERR(ret))) { if (ret) dentry = ret; /* parent times may have changed */ @@ -213,12 +213,12 @@ static int unionfs_link(struct dentry *old_dentry, struct inode *dir, unionfs_read_lock(old_dentry-d_sb); unionfs_double_lock_dentry(new_dentry, old_dentry); - if (!__unionfs_d_revalidate_chain(old_dentry, NULL, false)) { + if (unlikely(!__unionfs_d_revalidate_chain(old_dentry, NULL,
[PATCH 18/25] Unionfs: add un/likely conditionals on super ops
Signed-off-by: Erez Zadok [EMAIL PROTECTED] --- fs/unionfs/main.c | 98 ++- fs/unionfs/super.c | 90 2 files changed, 95 insertions(+), 93 deletions(-) diff --git a/fs/unionfs/main.c b/fs/unionfs/main.c index 8595750..82cb35a 100644 --- a/fs/unionfs/main.c +++ b/fs/unionfs/main.c @@ -32,13 +32,13 @@ static void unionfs_fill_inode(struct dentry *dentry, for (bindex = bstart; bindex = bend; bindex++) { lower_dentry = unionfs_lower_dentry_idx(dentry, bindex); - if (!lower_dentry) { + if (unlikely(!lower_dentry)) { unionfs_set_lower_inode_idx(inode, bindex, NULL); continue; } /* Initialize the lower inode to the new lower inode. */ - if (!lower_dentry-d_inode) + if (unlikely(!lower_dentry-d_inode)) continue; unionfs_set_lower_inode_idx(inode, bindex, @@ -52,7 +52,7 @@ static void unionfs_fill_inode(struct dentry *dentry, lower_inode = unionfs_lower_inode(inode); /* Use different set of inode ops for symlinks directories */ - if (S_ISLNK(lower_inode-i_mode)) + if (unlikely(S_ISLNK(lower_inode-i_mode))) inode-i_op = unionfs_symlink_iops; else if (S_ISDIR(lower_inode-i_mode)) inode-i_op = unionfs_dir_iops; @@ -62,8 +62,10 @@ static void unionfs_fill_inode(struct dentry *dentry, inode-i_fop = unionfs_dir_fops; /* properly initialize special inodes */ - if (S_ISBLK(lower_inode-i_mode) || S_ISCHR(lower_inode-i_mode) || - S_ISFIFO(lower_inode-i_mode) || S_ISSOCK(lower_inode-i_mode)) + if (unlikely(S_ISBLK(lower_inode-i_mode) || +S_ISCHR(lower_inode-i_mode) || +S_ISFIFO(lower_inode-i_mode) || +S_ISSOCK(lower_inode-i_mode))) init_special_inode(inode, lower_inode-i_mode, lower_inode-i_rdev); @@ -122,14 +124,14 @@ struct dentry *unionfs_interpose(struct dentry *dentry, struct super_block *sb, UNIONFS_I(inode)-lower_inodes = kcalloc(sbmax(sb), sizeof(struct inode *), GFP_KERNEL); - if (!UNIONFS_I(inode)-lower_inodes) { + if (unlikely(!UNIONFS_I(inode)-lower_inodes)) { err = -ENOMEM; goto out; } } else { /* get unique inode number for unionfs */ inode = iget(sb, iunique(sb, 
UNIONFS_ROOT_INO)); - if (!inode) { + if (unlikely(!inode)) { err = -EACCES; goto out; } @@ -149,7 +151,7 @@ skip: break; case INTERPOSE_LOOKUP: spliced = d_splice_alias(inode, dentry); - if (IS_ERR(spliced)) + if (unlikely(IS_ERR(spliced))) err = PTR_ERR(spliced); else if (spliced spliced != dentry) { /* @@ -181,7 +183,7 @@ skip: goto out; out_spliced: - if (!err) + if (likely(!err)) return spliced; out: return ERR_PTR(err); @@ -203,12 +205,12 @@ void unionfs_reinterpose(struct dentry *dentry) bend = dbend(dentry); for (bindex = bstart; bindex = bend; bindex++) { lower_dentry = unionfs_lower_dentry_idx(dentry, bindex); - if (!lower_dentry) + if (unlikely(!lower_dentry)) continue; - if (!lower_dentry-d_inode) + if (unlikely(!lower_dentry-d_inode)) continue; - if (unionfs_lower_inode_idx(inode, bindex)) + if (unlikely(unionfs_lower_inode_idx(inode, bindex))) continue; unionfs_set_lower_inode_idx(inode, bindex, igrab(lower_dentry-d_inode)); @@ -227,11 +229,11 @@ void unionfs_reinterpose(struct dentry *dentry) int check_branch(struct nameidata *nd) { /* XXX: remove in ODF code -- stacking unions allowed there */ - if (!strcmp(nd-dentry-d_sb-s_type-name, unionfs)) + if (unlikely(!strcmp(nd-dentry-d_sb-s_type-name, unionfs))) return -EINVAL; - if (!nd-dentry-d_inode) + if (unlikely(!nd-dentry-d_inode)) return -ENOENT; - if (!S_ISDIR(nd-dentry-d_inode-i_mode)) + if (unlikely(!S_ISDIR(nd-dentry-d_inode-i_mode))) return -ENOTDIR; return 0; } @@ -245,7 +247,7 @@ static int is_branch_overlap(struct dentry *dent1, struct dentry *dent2) while ((dent != dent2) (dent-d_parent != dent)) dent = dent-d_parent; - if
[PATCH 02/25] Unionfs: Remove unused #defines
From: Josef 'Jeff' Sipek [EMAIL PROTECTED] Signed-off-by: Josef 'Jeff' Sipek [EMAIL PROTECTED] Signed-off-by: Erez Zadok [EMAIL PROTECTED] --- fs/unionfs/union.h | 4 ---- 1 files changed, 0 insertions(+), 4 deletions(-) diff --git a/fs/unionfs/union.h b/fs/unionfs/union.h index 1cb2e1d..140b8ae 100644 --- a/fs/unionfs/union.h +++ b/fs/unionfs/union.h @@ -437,10 +437,6 @@ static inline int is_robranch(const struct dentry *dentry) #define UNIONFS_DIR_OPAQUE_NAME "__dir_opaque" #define UNIONFS_DIR_OPAQUE UNIONFS_WHPFX UNIONFS_DIR_OPAQUE_NAME -#ifndef DEFAULT_POLLMASK -#define DEFAULT_POLLMASK (POLLIN | POLLOUT | POLLRDNORM | POLLWRNORM) -#endif /* not DEFAULT_POLLMASK */ - /* * EXTERNALS: */ -- 1.5.2.2 - To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 04/25] Unionfs: cache-coherency fixes
From: Olivier Blin [EMAIL PROTECTED] Do not update mtime if there is no upper branch for the inode. This prevents calling unionfs_lower_inode_idx() with a negative index, which triggers a bug. Signed-off-by: Olivier Blin [EMAIL PROTECTED] Signed-off-by: Erez Zadok [EMAIL PROTECTED] --- fs/unionfs/fanout.h | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/fs/unionfs/fanout.h b/fs/unionfs/fanout.h index afeb9f6..51aa0de 100644 --- a/fs/unionfs/fanout.h +++ b/fs/unionfs/fanout.h @@ -308,7 +308,7 @@ static inline void unionfs_copy_attr_times(struct inode *upper) int bindex; struct inode *lower; - if (!upper) + if (!upper || ibstart(upper) < 0) return; for (bindex = ibstart(upper); bindex <= ibend(upper); bindex++) { lower = unionfs_lower_inode_idx(upper, bindex); -- 1.5.2.2
[PATCH 01/25] Unionfs: Simplify unionfs_get_nlinks
From: Josef 'Jeff' Sipek [EMAIL PROTECTED] Since we set the right value for d_type in readdir, there's really no point in having to calculate the number of directory links. Some on-disk filesystems don't even store the number of links for directories. Signed-off-by: Josef 'Jeff' Sipek [EMAIL PROTECTED] Signed-off-by: Erez Zadok [EMAIL PROTECTED] --- fs/unionfs/subr.c | 41 +++-- 1 files changed, 7 insertions(+), 34 deletions(-) diff --git a/fs/unionfs/subr.c b/fs/unionfs/subr.c index b7e7904..6b93b64 100644 --- a/fs/unionfs/subr.c +++ b/fs/unionfs/subr.c @@ -188,16 +188,10 @@ out: } /* - * returns the sum of the nlink values of all the underlying inodes of the - * passed inode + * returns the right nlink value based on the inode type */ int unionfs_get_nlinks(const struct inode *inode) { - int sum_nlinks = 0; - int dirs = 0; - int bindex; - struct inode *lower_inode; - /* don't bother to do all the work since we're unlinked */ if (inode->i_nlink == 0) return 0; @@ -205,33 +199,12 @@ int unionfs_get_nlinks(const struct inode *inode) if (!S_ISDIR(inode->i_mode)) return unionfs_lower_inode(inode)->i_nlink; - for (bindex = ibstart(inode); bindex <= ibend(inode); bindex++) { - lower_inode = unionfs_lower_inode_idx(inode, bindex); - - /* ignore files */ - if (!lower_inode || !S_ISDIR(lower_inode->i_mode)) - continue; - - BUG_ON(lower_inode->i_nlink < 0); - - /* A deleted directory. */ - if (lower_inode->i_nlink == 0) - continue; - dirs++; - - /* -* A broken directory... -* -* Some filesystems don't properly set the number of links -* on empty directories -*/ - if (lower_inode->i_nlink == 1) - sum_nlinks += 2; - else - sum_nlinks += (lower_inode->i_nlink - 2); - } - - return (!dirs ? 0 : sum_nlinks + 2); + /* +* For directories, we return 1. The only place that could care +* about links is readdir, and there's d_type there so even that
+* doesn't matter. +*/ + return 1; } /* construct whiteout filename */ -- 1.5.2.2
[PATCH 25/25] Unionfs: use poison.h for safe poison pointers
This also fixes a compile warning on 64-bit systems.

Signed-off-by: Josef 'Jeff' Sipek [EMAIL PROTECTED]
Signed-off-by: Erez Zadok [EMAIL PROTECTED]
---
 fs/unionfs/debug.c |   16 ++--
 fs/unionfs/union.h |    1 +
 2 files changed, 7 insertions(+), 10 deletions(-)

diff --git a/fs/unionfs/debug.c b/fs/unionfs/debug.c
index 09b52ce..b103eb9 100644
--- a/fs/unionfs/debug.c
+++ b/fs/unionfs/debug.c
@@ -25,14 +25,6 @@
 	}							\
 } while (0)
 
-#if BITS_PER_LONG == 32
-#define POISONED_PTR ((void *)0x5a5a5a5a)
-#elif BITS_PER_LONG == 64
-#define POISONED_PTR ((void *)0x5a5a5a5a5a5a5a5a)
-#else
-#error Unknown BITS_PER_LONG value
-#endif /* BITS_PER_LONG != known */
-
 /*
  * __unionfs_check_{inode,dentry,file} perform exhaustive sanity checking on
  * the fan-out of various Unionfs objects. We check that no lower objects
@@ -50,6 +42,7 @@ void __unionfs_check_inode(const struct inode *inode,
 	struct inode *lower_inode;
 	struct super_block *sb;
 	int printed_caller = 0;
+	void *poison_ptr;
 
 	/* for inodes now */
 	BUG_ON(!inode);
@@ -88,12 +81,13 @@ void __unionfs_check_inode(const struct inode *inode,
 		}
 		lower_inode = unionfs_lower_inode_idx(inode, bindex);
 		if (lower_inode) {
+			memset(&poison_ptr, POISON_INUSE, sizeof(void *));
 			if (unlikely(bindex < istart || bindex > iend)) {
 				PRINT_CALLER(fname, fxn, line);
 				printk(" Ci5: inode/linode=%p:%p bindex=%d istart/end=%d:%d\n",
 				       inode, lower_inode, bindex, istart, iend);
-			} else if (unlikely(lower_inode == POISONED_PTR)) {
+			} else if (unlikely(lower_inode == poison_ptr)) {
 				/* freed inode! */
 				PRINT_CALLER(fname, fxn, line);
 				printk(" Ci6: inode/linode=%p:%p bindex=%d
@@ -131,6 +125,7 @@ void __unionfs_check_dentry(const struct dentry *dentry,
 	struct super_block *sb;
 	struct vfsmount *lower_mnt;
 	int printed_caller = 0;
+	void *poison_ptr;
 
 	BUG_ON(!dentry);
 	sb = dentry->d_sb;
@@ -257,12 +252,13 @@ void __unionfs_check_dentry(const struct dentry *dentry,
 	for (bindex = sbstart(sb); bindex < sbmax(sb); bindex++) {
 		lower_inode = unionfs_lower_inode_idx(inode, bindex);
 		if (lower_inode) {
+			memset(&poison_ptr, POISON_INUSE, sizeof(void *));
 			if (unlikely(bindex < istart || bindex > iend)) {
 				PRINT_CALLER(fname, fxn, line);
 				printk(" CI5: dentry/linode=%p:%p bindex=%d istart/end=%d:%d\n",
 				       dentry, lower_inode, bindex, istart, iend);
-			} else if (unlikely(lower_inode == POISONED_PTR)) {
+			} else if (unlikely(lower_inode == poison_ptr)) {
 				/* freed inode! */
 				PRINT_CALLER(fname, fxn, line);
 				printk(" CI6: dentry/linode=%p:%p bindex=%d
diff --git a/fs/unionfs/union.h b/fs/unionfs/union.h
index 8df44a9..510267f 100644
--- a/fs/unionfs/union.h
+++ b/fs/unionfs/union.h
@@ -43,6 +43,7 @@
 #include <linux/fs_stack.h>
 #include <linux/magic.h>
 #include <linux/log2.h>
+#include <linux/poison.h>
 
 #include <asm/mman.h>
 #include <asm/system.h>
-- 
1.5.2.2
Re: [PATCH 10/25] Unionfs: add un/likely conditionals on copyup ops
Erez Zadok wrote:
> Signed-off-by: Erez Zadok [EMAIL PROTECTED]
> ---
>  fs/unionfs/copyup.c |  102 +-
>  1 files changed, 51 insertions(+), 51 deletions(-)
>
> diff --git a/fs/unionfs/copyup.c b/fs/unionfs/copyup.c
> index 23ac4c8..e3c5f15 100644
> --- a/fs/unionfs/copyup.c
> +++ b/fs/unionfs/copyup.c
> @@ -36,14 +36,14 @@ static int copyup_xattrs(struct dentry *old_lower_dentry,
>  	/* query the actual size of the xattr list */
>  	list_size = vfs_listxattr(old_lower_dentry, NULL, 0);
> -	if (list_size <= 0) {
> +	if (unlikely(list_size <= 0)) {

I've been told several times that adding these is almost always bogus: either it messes up the CPU branch prediction, or the compiler/CPU just does a lot better at finding the right way without these hints. Adding them across the board like this seems rather strange. Have you got any numbers showing that this really improves performance?

Auke
Re: Upgrading datastructures between different filesystem versions
kernel learner wrote:
> Hi,
> The ext3 filesystem has 32-bit block addresses and the ext4 filesystem has
> 48-bit block addresses. If a user installs ext4, how will the filesystem
> handle already existing blocks with 32-bit addresses?

Why should it? That's what ext3 is for. Your kernel can have both filesystems supported, and will use the ext3 driver for any ext3 filesystems it is asked to mount. I'd expect having the ext4 driver handle ext3 filesystems to be a distant, secondary goal next to getting a fast, reliable, clean 48-bit filesystem working.

> Can anyone point me to the right place for this backward-compatibility
> stuff? I searched for it but couldn't find much info. Is the work still
> pending on this front?
>
> Thanks,
> KL