Re: [PATCH, 3.7-rc7, RESEND] fs: revert commit bbdd6808 to fallocate UAPI

2012-11-26 Thread Dave Chinner
it within their filesystem via ioctls like everyone else does. Cheers, Dave. -- Dave Chinner da...@fromorbit.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vge

[PATCH 06/19] list: add a new LRU list type

2012-11-27 Thread Dave Chinner
From: Dave Chinner Several subsystems use the same construct for LRU lists - a list head, a spin lock and an item count. They also use exactly the same code for adding and removing items from the LRU. Create a generic type for these LRU lists. This is the beginning of generic, node aware LRUs
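The construct described in this snippet (a list head, a lock and an item count behind shared add/remove helpers) can be sketched in userspace C. This is only an illustration under stated assumptions: the names `lru_list`, `lru_add` and `lru_del` are hypothetical rather than taken from the patch, and a C11 `atomic_flag` stands in for the kernel spin lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list node, in the style of a list head. */
struct list_node {
	struct list_node *prev, *next;
};

/* Hypothetical generic LRU: list head + lock + item count. */
struct lru_list {
	atomic_flag lock;      /* stand-in for a spin lock (assumption) */
	struct list_node head; /* head.next is the least recently added item */
	long nr_items;
};

static void lru_lock(struct lru_list *lru)
{
	while (atomic_flag_test_and_set_explicit(&lru->lock, memory_order_acquire))
		; /* spin until the flag was previously clear */
}

static void lru_unlock(struct lru_list *lru)
{
	atomic_flag_clear_explicit(&lru->lock, memory_order_release);
}

static void node_init(struct list_node *n)
{
	n->prev = n->next = n; /* a node pointing at itself is "not on a list" */
}

static bool node_unlinked(const struct list_node *n)
{
	return n->next == n;
}

static void lru_init(struct lru_list *lru)
{
	atomic_flag_clear(&lru->lock);
	node_init(&lru->head);
	lru->nr_items = 0;
}

/* Add at the tail. Returns false if the item is already on a list. */
static bool lru_add(struct lru_list *lru, struct list_node *item)
{
	bool added = false;

	lru_lock(lru);
	if (node_unlinked(item)) {
		item->prev = lru->head.prev;
		item->next = &lru->head;
		lru->head.prev->next = item;
		lru->head.prev = item;
		lru->nr_items++;
		added = true;
	}
	lru_unlock(lru);
	return added;
}

/* Remove an item. Returns false if it was not on a list. */
static bool lru_del(struct lru_list *lru, struct list_node *item)
{
	bool removed = false;

	lru_lock(lru);
	if (!node_unlinked(item)) {
		item->prev->next = item->next;
		item->next->prev = item->prev;
		node_init(item);
		lru->nr_items--;
		assert(lru->nr_items >= 0);
		removed = true;
	}
	lru_unlock(lru);
	return removed;
}
```

Keeping the count update inside the same locked region as the list manipulation is what lets every caller share one add/remove implementation instead of open-coding the accounting.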

[PATCH 14/19] xfs: use generic AG walk for background inode reclaim

2012-11-27 Thread Dave Chinner
From: Dave Chinner The per-ag inode cache radix trees are not walked by the shrinkers any more, so there is no need for a special walker that contained heuristics to prevent multiple shrinker instances from colliding with each other. Hence we can just remove that and convert the code to use the

[PATCH 18/19] shrinker: convert remaining shrinkers to count/scan API

2012-11-27 Thread Dave Chinner
From: Dave Chinner Convert the remaining couple of random shrinkers in the tree to the new API. Signed-off-by: Dave Chinner --- arch/x86/kvm/mmu.c | 35 +-- net/sunrpc/auth.c | 45 +++-- 2 files changed, 56

[PATCH 19/19] shrinker: Kill old ->shrink API.

2012-11-27 Thread Dave Chinner
From: Dave Chinner There are no more users of this API, so kill it dead, dead, dead and quietly bury the corpse in a shallow, unmarked grave in a dark forest deep in the hills... Signed-off-by: Dave Chinner --- include/linux/shrinker.h | 15 +-- include/trace/events

[PATCH 10/19] shrinker: add node awareness

2012-11-27 Thread Dave Chinner
From: Dave Chinner Pass the node of the current zone being reclaimed to shrink_slab(), allowing the shrinker control nodemask to be set appropriately for node aware shrinkers. Signed-off-by: Dave Chinner --- fs/drop_caches.c |1 + include/linux/shrinker.h |3 +++ mm/memory

[PATCH 13/19] xfs: Node aware direct inode reclaim

2012-11-27 Thread Dave Chinner
From: Dave Chinner XFS currently only tracks inodes for reclaim via tag bits in the inode cache radix tree. While this is awesome for background reclaim because it allows inodes to be reclaimed in ascending disk offset order, it sucks for direct memory reclaim which really is trying to free the

[PATCH 17/19] drivers: convert shrinkers to new count/scan API

2012-11-27 Thread Dave Chinner
From: Dave Chinner Convert the driver shrinkers to the new API. Most changes are compile tested only because I either don't have the hardware or it's staging stuff. FWIW, the md and android code is pretty good, but the rest of it makes me want to claw my eyes out. The amount of bro

[PATCH 01/19] dcache: convert dentry_stat.nr_unused to per-cpu counters

2012-11-27 Thread Dave Chinner
From: Dave Chinner Before we split up the dcache_lru_lock, the unused dentry counter needs to be made independent of the global dcache_lru_lock. Convert it to per-cpu counters to do this. Signed-off-by: Dave Chinner Reviewed-by: Christoph Hellwig --- fs/dcache.c | 17 ++--- 1
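The idea of making a counter independent of a global lock by splitting it per CPU can be sketched as follows. This is a single-threaded userspace illustration only: the fixed `NR_CPUS` value, the explicit `cpu` argument and the function names are assumptions for the example, not the code in fs/dcache.c.

```c
#include <assert.h>

#define NR_CPUS 4 /* assumption: small fixed CPU count for illustration */

/* One slot per CPU; each CPU only ever updates its own slot. */
static long nr_unused[NR_CPUS];

static void unused_inc(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	nr_unused[cpu]++;
}

static void unused_dec(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	nr_unused[cpu]--;
}

/*
 * Readers sum every slot. A single slot can be negative when an object
 * is counted on one CPU and uncounted on another, so only the total is
 * meaningful; clamp it at zero in case a reader races with updates.
 */
static long unused_total(void)
{
	long sum = 0;

	for (int i = 0; i < NR_CPUS; i++)
		sum += nr_unused[i];
	return sum < 0 ? 0 : sum;
}
```

The trade-off is that updates become cheap and lock-free while reads cost a walk over all CPUs, which suits a statistic that is updated far more often than it is read.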

[PATCH 16/19] fs: convert fs shrinkers to new scan/count API

2012-11-27 Thread Dave Chinner
From: Dave Chinner Convert the filesystem shrinkers to use the new API, and standardise some of the behaviours of the shrinkers at the same time. For example, nr_to_scan means the number of objects to scan, not the number of objects to free. I refactored the CIFS idmap shrinker a little - it
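The distinction drawn here, that `nr_to_scan` is the number of objects to look at rather than the number to free, can be shown with a toy scan routine. The `cache_obj` type and `cache_scan()` are hypothetical names for this sketch and do not come from any of the converted shrinkers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct cache_obj {
	bool in_use; /* referenced objects are skipped, not freed */
	bool freed;
};

/*
 * Look at up to nr_to_scan objects starting at objs[0] and free the ones
 * that are not in use. Returns how many were actually freed, which can be
 * smaller than the number scanned.
 */
static long cache_scan(struct cache_obj *objs, size_t nr_objs, long nr_to_scan)
{
	long freed = 0;

	for (size_t i = 0; i < nr_objs && nr_to_scan > 0; i++, nr_to_scan--) {
		if (objs[i].in_use || objs[i].freed)
			continue; /* scanned, but nothing to free */
		objs[i].freed = true;
		freed++;
	}
	return freed;
}
```

Under this convention a caller asking to scan three objects may get back a freed count of one, and that is not an error: the scan budget was spent even though most objects were busy.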

[PATCH 02/19] dentry: move to per-sb LRU locks

2012-11-27 Thread Dave Chinner
From: Dave Chinner With the dentry LRUs being per-sb structures, there is no real need for a global dentry_lru_lock. The locking can be made more fine-grained by moving to a per-sb LRU lock, isolating the LRU operations of different filesystems completely from each other. Signed-off-by: Dave

[PATCH 12/19] xfs: convert buftarg LRU to generic code

2012-11-27 Thread Dave Chinner
From: Dave Chinner Convert the buftarg LRU to use the new generic LRU list and take advantage of the functionality it supplies to make the buffer cache shrinker node aware. Signed-off-by: Dave Chinner --- fs/xfs/xfs_buf.c | 162 +- fs/xfs

[PATCH 08/19] dcache: convert to use new lru list infrastructure

2012-11-27 Thread Dave Chinner
From: Dave Chinner Signed-off-by: Dave Chinner --- fs/dcache.c| 171 +--- fs/super.c | 10 +-- include/linux/fs.h | 15 +++-- 3 files changed, 82 insertions(+), 114 deletions(-) diff --git a/fs/dcache.c b/fs/dcache.c index

[PATCH 09/19] list_lru: per-node list infrastructure

2012-11-27 Thread Dave Chinner
From: Dave Chinner Now that we have an LRU list API, we can start to enhance the implementation. This splits the single LRU list into per-node lists and locks to enhance scalability. Items are placed on lists according to the node the memory belongs to. To make scanning the lists efficient
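Splitting one list into per-node lists, with items filed under the node their memory belongs to, might look roughly like the sketch below. Everything here is assumed for illustration: the item carries its node id explicitly, `MAX_NODES` is a small constant, the node mask is a plain bitmask, ordering within a list is ignored, and the per-node lock is only indicated by a comment.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES 4 /* assumption: small fixed node count */

struct item {
	struct item *next;
	int node; /* node the item's memory belongs to (supplied by caller) */
};

struct node_list {
	/* a per-node lock would live here, so nodes do not contend */
	struct item *head;
	long nr_items;
};

struct pernode_lru {
	struct node_list node[MAX_NODES];
};

/* File the item under the list for its own node. */
static void pernode_add(struct pernode_lru *lru, struct item *it)
{
	assert(it->node >= 0 && it->node < MAX_NODES);
	struct node_list *nl = &lru->node[it->node];

	it->next = nl->head;
	nl->head = it;
	nl->nr_items++;
}

/* Count items only on the nodes selected by the bitmask. */
static long pernode_count(const struct pernode_lru *lru, unsigned int nodemask)
{
	long total = 0;

	for (int nid = 0; nid < MAX_NODES; nid++)
		if (nodemask & (1u << nid))
			total += lru->node[nid].nr_items;
	return total;
}
```

Because each node has its own list and count, reclaim targeted at one node never has to walk or lock the lists of the others.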

[PATCH 07/19] inode: convert inode lru list to generic lru list code.

2012-11-27 Thread Dave Chinner
From: Dave Chinner Signed-off-by: Dave Chinner --- fs/inode.c | 173 +--- fs/super.c | 11 ++-- include/linux/fs.h |6 +- 3 files changed, 75 insertions(+), 115 deletions(-) diff --git a/fs/inode.c b/fs/inode.c index

[PATCH 03/19] dcache: remove dentries from LRU before putting on dispose list

2012-11-27 Thread Dave Chinner
From: Dave Chinner One of the big problems with modifying the way the dcache shrinker and LRU implementation works is that the LRU is abused in several ways. One of these is shrink_dentry_list(). Basically, we can move a dentry off the LRU onto a different list without doing any accounting

[RFC, PATCH 00/19] Numa aware LRU lists and shrinkers

2012-11-27 Thread Dave Chinner
Hi Glauber, Here's a working version of my patchset for generic LRU lists and NUMA-aware shrinkers. There are several parts to this patch set. The NUMA aware shrinkers are based on having a generic node-based LRU list implementation, and there are subsystems that need to be converted to use these

[PATCH 15/19] xfs: convert dquot cache lru to list_lru

2012-11-27 Thread Dave Chinner
From: Dave Chinner Convert the XFS dquot lru to use the list_lru construct and convert the shrinker to being node aware. Signed-off-by: Dave Chinner --- fs/xfs/xfs_dquot.c |7 +- fs/xfs/xfs_qm.c| 307 ++-- fs/xfs/xfs_qm.h|4

[PATCH 05/19] shrinker: convert superblock shrinkers to new API

2012-11-27 Thread Dave Chinner
From: Dave Chinner Convert superblock shrinker to use the new count/scan API, and propagate the API changes through to the filesystem callouts. The filesystem callouts already use a count/scan API, so it's just changing counters to longs to match the VM API. This requires the dentry and

[PATCH 11/19] fs: convert inode and dentry shrinking to be node aware

2012-11-27 Thread Dave Chinner
From: Dave Chinner Now that the shrinker is passing a nodemask in the scan control structure, we can pass this to the generic LRU list code to isolate reclaim to the lists on matching nodes. This requires a small amount of refactoring of the LRU list API, which might be best split out into

[PATCH 04/19] mm: new shrinker API

2012-11-27 Thread Dave Chinner
From: Dave Chinner The current shrinker callout API uses a single shrinker call for multiple functions. To determine the function, a special magical value is passed in a parameter to change the behaviour. This complicates the implementation and return value specification for the different
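The problem with one multiplexed callout, and the shape of a split count/scan interface, can be contrasted in a toy example. This is a sketch under assumptions: the magic value is taken to be `nr_to_scan == 0`, and `old_shrink`, `shrinker_ops`, `toy_count` and `toy_scan` are hypothetical names rather than the definitions in include/linux/shrinker.h.

```c
#include <assert.h>

static long cache_size = 10; /* toy cache: number of freeable objects */

/* Multiplexed style: one entry point, behaviour chosen by a magic value. */
static long old_shrink(long nr_to_scan)
{
	if (nr_to_scan == 0) /* magic value: "just report the count" */
		return cache_size;
	if (nr_to_scan > cache_size)
		nr_to_scan = cache_size;
	cache_size -= nr_to_scan;
	return cache_size; /* here the return value means "objects left" */
}

/* Split style: each callback has one job and one return-value meaning. */
struct shrinker_ops {
	long (*count)(void);           /* how many objects could be freed */
	long (*scan)(long nr_to_scan); /* how many objects were freed */
};

static long toy_count(void)
{
	return cache_size;
}

static long toy_scan(long nr_to_scan)
{
	long n = nr_to_scan < cache_size ? nr_to_scan : cache_size;

	cache_size -= n;
	return n;
}

static const struct shrinker_ops toy_ops = { toy_count, toy_scan };
```

With the split form a caller never has to know which meaning the return value carries for a given argument, which is the specification problem the snippet complains about.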

Re: [PATCH 17/19] drivers: convert shrinkers to new count/scan API

2012-11-27 Thread Dave Chinner
On Wed, Nov 28, 2012 at 01:13:11AM +, Chris Wilson wrote: > On Wed, 28 Nov 2012 10:14:44 +1100, Dave Chinner wrote: > > +/* > > + * XXX: (dchinner) This is one of the worst cases of shrinker abuse I've > > seen. > > + * > > + * i915_gem_purge() expe

Re: [PATCH 17/19] drivers: convert shrinkers to new count/scan API

2012-11-28 Thread Dave Chinner
On Wed, Nov 28, 2012 at 12:21:54PM +0400, Glauber Costa wrote: > On 11/28/2012 07:17 AM, Dave Chinner wrote: > > On Wed, Nov 28, 2012 at 01:13:11AM +, Chris Wilson wrote: > >> On Wed, 28 Nov 2012 10:14:44 +1100, Dave Chinner > >> wrote: > >>> +/* >

Re: [PATCH] tmpfs: support SEEK_DATA and SEEK_HOLE (reprise)

2012-11-28 Thread Dave Chinner
's new > SEEK_DATA and SEEK_HOLE options: so add them while the minutiae are still > on my mind (in particular, the !PageUptodate-ness of pages fallocated but > still unwritten). > > [a...@linux-foundation.org: fix warning with CONFIG_TMPFS=n] > Signed-off-by: Hugh Dickins > ---

Re: [PATCH] tmpfs: support SEEK_DATA and SEEK_HOLE (reprise)

2012-11-29 Thread Dave Chinner
824 mmap(NULL, 4294975488, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f60afcf3000 munmap(0x7f61afcf5000, 2147491840) = 0 read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 2147483648) = 2147479552 read(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0

Re: [PATCH 17/19] drivers: convert shrinkers to new count/scan API

2012-11-29 Thread Dave Chinner
On Thu, Nov 29, 2012 at 02:29:33PM +0400, Glauber Costa wrote: > On 11/29/2012 01:28 AM, Dave Chinner wrote: > > On Wed, Nov 28, 2012 at 12:21:54PM +0400, Glauber Costa wrote: > >> On 11/28/2012 07:17 AM, Dave Chinner wrote: > >>> On Wed, Nov 28, 2012 at 01:13:

Re: [RFC, PATCH 00/19] Numa aware LRU lists and shrinkers

2012-11-29 Thread Dave Chinner
On Thu, Nov 29, 2012 at 11:02:24AM -0800, Andi Kleen wrote: > Dave Chinner writes: > > > > Comments, thoughts and flames all welcome. > > Doing the reclaim per CPU sounds like a big change in the VM balance. It's per node, not per CPU. And AFAICT, it hasn't chang

Re: [PATCH v2] Do a proper locking for mmap and block size change

2012-11-29 Thread Dave Chinner
ith mpage_readpages(), so it's not just direct IO that has this problem Cheers, Dave. -- Dave Chinner da...@fromorbit.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/

Re: [PATCH 2/9] vfs: export do_splice_direct() to modules

2013-03-17 Thread Dave Chinner
ead to deadlocks. Here's another that's been known for ages: http://oss.sgi.com/archives/xfs/2011-08/msg00168.html Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 11/12] rwsem: wake all readers when first waiter is a reader

2013-03-18 Thread Dave Chinner
On Wed, Mar 13, 2013 at 10:00:51PM -0400, Peter Hurley wrote: > On Wed, 2013-03-13 at 14:23 +1100, Dave Chinner wrote: > > We don't care about the ordering between multiple concurrent > > metadata modifications - what matters is whether the ongoing data IO > > around

Re: [PATCH 3/4] writeback: replace custom worker pool implementation with unbound workqueue

2013-03-20 Thread Dave Chinner
very disk has its own filesystem), knowing which filesystem(s) are getting stuck in writeback from the sysrq-w or hangcheck output is pretty damn important Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 0/6 RFC] Mapping range lock

2013-02-05 Thread Dave Chinner
ema we need for filesystems to behave sanely. i.e. shouldn't we be aiming to simplify things as we rework locking rather than make them more complex? IOWs, I think the "it's a mapping range lock" approach is not the right level to be providing IO exclusion semantics. After all, it'

Re: [PATCH 0/6 RFC] Mapping range lock

2013-02-06 Thread Dave Chinner
On Wed, Feb 06, 2013 at 08:25:34PM +0100, Jan Kara wrote: > On Wed 06-02-13 10:25:12, Dave Chinner wrote: > > On Mon, Feb 04, 2013 at 01:38:31PM +0100, Jan Kara wrote: > > > On Thu 31-01-13 16:07:57, Andrew Morton wrote: > > > > > c) i_mutex doesn't allow any

Re: [PATCH] fs: encode_fh: return FILEID_INVALID if invalid fid_type

2013-02-11 Thread Dave Chinner
> } > - return 255; /* invalid */ > + return FILEID_INVALID; /* invalid */ > } I think you can drop the "/* invalid */" comment from there now as it is redundant with this change. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: 3.8-rc5 xfs corruption

2013-01-30 Thread Dave Chinner
ops cleared before being returned to the new user, and newly allocated xfs_bufs are zeroed before being initialised. I really need to know what you are doing to be able to get to the bottom of it Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH] xfs: Fix possible truncation of log data in xlog_bread_noalign()

2013-02-23 Thread Dave Chinner
On Sat, Feb 23, 2013 at 07:06:10AM +, Tony Lu wrote: > >From: Dave Chinner [mailto:da...@fromorbit.com] > >On Fri, Feb 22, 2013 at 08:12:52AM +, Tony Lu wrote: > >> I encountered the following panic when using xfs partitions as rootfs, > >> which > >>

Re: Debugging system freezes on filesystem writes

2013-02-23 Thread Dave Chinner
120k 0 0 > >none /run/shm tmpfs rw,nosuid,nodev,relatime 0 0 > >none /run/user tmpfs > >rw,nosuid,nodev,noexec,relatime,size=102400k,mode=755 0 0 > >/dev/sda6 /home ext4 rw,noatime,discard 0 0 ^^^ I'd say that's your problem

Re: [PATCH] xfs: Fix possible truncation of log data in xlog_bread_noalign()

2013-02-24 Thread Dave Chinner
the log is corrupt and that implies a deeper problem > And also there is code in xlog_write_log_records() which handles > non-sector-align reads and writes. Yes, it does handle it, but that doesn't mean that it is correct to pass unaligned block ranges to it. Cheers, Dave. -- Dave Chi

Re: [PATCH] xfs: Fix possible truncation of log data in xlog_bread_noalign()

2013-02-26 Thread Dave Chinner
kernel and determine if you can still reproduce the problem on your system - that way we'll know if the bug really has been fixed or not Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [GIT PULL] ext4 updates for 3.9

2013-02-28 Thread Dave Chinner
than 2-3TB. Perhaps it would make testing 1-2TB ext4 filesystems fast enough that you could do it regularly... Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH review 02/16] xfs: Store projectid as a single variable.

2013-02-18 Thread Dave Chinner
> new i_projid field. > > Cc: Ben Myers > Cc: Alex Elder > Cc: Dave Chinner > Signed-off-by: "Eric W. Biederman" > --- > fs/xfs/xfs_icache.c |2 +- > fs/xfs/xfs_inode.c|6 +- > fs/xfs/xfs_inode.h|7 ++- > fs/xfs/xfs_ioctl

Re: [PATCH review 03/16] xfs: Always read uids and gids from the vfs inode

2013-02-18 Thread Dave Chinner
inode->i_gid= ip->i_d.di_gid; Which further emphasises the layer violation... > switch (inode->i_mode & S_IFMT) { > case S_IFBLK: > diff --git a/fs/xfs/xfs_itable.c b/fs/xfs/xfs_itable.c > index cf5b1d0..a9e07dd 100644 > --- a/fs/xfs/xfs_itabl

Re: [PATCH review 08/16] xfs: Use kprojids when allocating inodes.

2013-02-18 Thread Dave Chinner
ng of project IDs like this this should not be >> converted at all as it has nothing at all to do with the >> namespaces. Please drop this patch or replace it with a simple patch that passes the project ID as an xfs_dqid_t (i.e. a flat, 32 bit quota identifier) instead so you can kill t

Re: [PATCH review 07/16] xfs: Update ioctl(XFS_IOC_FREE_EOFBLOCKS) to handle callers in any userspace

2013-02-18 Thread Dave Chinner
s_eof_blocks to define them all as xfs_dqid_t and convert them in place to the type that is compatible with the XFS core use of these fields (i.e. comparing them with the on-disk inode uid/gid/prid values). Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH review 05/16] xfs: Update xfs_ioctl_setattr to handle projids in any user namespace

2013-02-18 Thread Dave Chinner
nto > xfs's user namespace. > - Replace uses of fa->fsx_projid with projid throughout > xfs_ioctl_setattr. > > Cc: Ben Myers > Cc: Alex Elder > Cc: Dave Chinner > Signed-off-by: "Eric W. Biederman" > --- > fs/xfs/xfs_ioctl.c | 26

Re: [PATCH review 09/16] xfs: Modify xfs_qm_vop_dqalloc to take kuids, kgids, and kprojids.

2013-02-18 Thread Dave Chinner
On Sun, Feb 17, 2013 at 05:11:02PM -0800, Eric W. Biederman wrote: > From: "Eric W. Biederman" > > Cc: Ben Myers > Cc: Alex Elder > Cc: Dave Chinner > Signed-off-by: "Eric W. Biederman" > --- > fs/xfs/xfs_qm.c|6 +++--- > fs/xf

Re: [PATCH review 10/16] xfs: Push struct kqid into xfs_qm_scall_qmlim and xfs_qm_scall_getquota

2013-02-18 Thread Dave Chinner
stent and litters id/namespace conversions all over the place, so I don't think these changes are necessary. Hence I'd say just do the absolute minimum needed for the is_superquota() checks to work and leave all the kqid -> xfs_dqid_t+type conversion at the boundary of the quota subsystem wher

Re: [PATCH RFC 10/12] userns: Convert xfs to use kuid/kgid/kprojid where appropriate

2013-02-18 Thread Dave Chinner
On Sun, Feb 17, 2013 at 05:25:43PM -0800, Eric W. Biederman wrote: > Dave Chinner writes: > > > On Wed, Feb 13, 2013 at 10:13:16AM -0800, Eric W. Biederman wrote: > > > >> The crazy thing is that xfs appears to > >> directly write their incor

Re: [PATCH 0/4] dcache: make Oracle more scalable on large systems

2013-02-21 Thread Dave Chinner
ng two locks: Nobody should be doing reverse dentry-to-name lookups in a quantity sufficient for it to become a performance limiting factor. What is the Oracle DB actually using this path for? Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 0/4] dcache: make Oracle more scalable on large systems

2013-02-22 Thread Dave Chinner
On Thu, Feb 21, 2013 at 11:13:27PM -0500, Waiman Long wrote: > On 02/21/2013 07:13 PM, Andi Kleen wrote: > >Dave Chinner writes: > > > >>On Tue, Feb 19, 2013 at 01:50:55PM -0500, Waiman Long wrote: > >>>It was found that the Oracle database software issue

Re: [PATCH] xfs: Fix possible truncation of log data in xlog_bread_noalign()

2013-02-22 Thread Dave Chinner
oss.sgi.com/cgi-bin/gitweb.cgi?p=archive/xfs-import.git;a=commitdiff;h=f14e527f411712f89178c31370b5d733ea1d0280 FWIW, I think your change might need work - there's the possibility that it can round up the length beyond the end of the log if we ask to read up to the last sector of the log (i.e. blkno + blklen == end

Re: torrent hash failures since 3.9.0-rc1

2013-03-12 Thread Dave Chinner
hese operations can be found using fsx. The issue here, however, involves memory reclaim interactions and so is not something fsx can reproduce in isolation. :/ Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 11/12] rwsem: wake all readers when first waiter is a reader

2013-03-12 Thread Dave Chinner
On Mon, Mar 11, 2013 at 11:43:34PM -0700, Michel Lespinasse wrote: > Hi Dave, > > On Mon, Mar 11, 2013 at 7:36 PM, Dave Chinner wrote: > > On Sun, Mar 10, 2013 at 10:17:42PM -0700, Michel Lespinasse wrote: > >> - since all readers are woken at once, you might see

Re: [PATCH] userns: Add basic quota support v4

2012-08-28 Thread Dave Chinner
init_user_ns) ? qid : -1; > + default: > + BUG(); > + } > + return kqid; > +} > + > +static inline struct kqid make_kqid_invalid(int type) > +{ > + struct kqid kqid; > + > + kqid.type = type; > + switch (type) { > + case USRQUOTA: >

Re: [PATCH 3/3] HWPOISON: prevent inode cache removal to keep AS_HWPOISON sticky

2012-08-28 Thread Dave Chinner
On Mon, Aug 27, 2012 at 06:05:06PM -0400, Naoya Horiguchi wrote: > On Mon, Aug 27, 2012 at 08:26:07AM +1000, Dave Chinner wrote: > > On Fri, Aug 24, 2012 at 01:24:16PM -0400, Naoya Horiguchi wrote: > > > Let me explain more to clarify my whole scenario. If a memory error >

Re: [PATCH] userns: Add basic quota support v4

2012-08-30 Thread Dave Chinner
On Wed, Aug 29, 2012 at 02:31:26AM -0700, Eric W. Biederman wrote: > > Dave thanks for taking the time to take a detailed look at this code. > > Dave Chinner writes: > > > On Tue, Aug 28, 2012 at 12:09:56PM -0700, Eric W. Biederman wrote: > >> > >> Ad

Re: [PATCH 3/3] HWPOISON: prevent inode cache removal to keep AS_HWPOISON sticky

2012-09-02 Thread Dave Chinner
On Wed, Aug 29, 2012 at 02:32:04PM +0900, Jun'ichi Nomura wrote: > On 08/29/12 11:59, Dave Chinner wrote: > > On Mon, Aug 27, 2012 at 06:05:06PM -0400, Naoya Horiguchi wrote: > >> And yes, I understand it's ideal, but many applications choose not to > >> do t

Re: [PATCH v7 9/9] block: Avoid deadlocks with bio allocation by stacking drivers

2012-09-02 Thread Dave Chinner
gh stack? I mean, we can deal with it like the ia32 4k stack issue was dealt with (i.e. ignore those stupid XFS people, that's an XFS bug), or we can face the reality that storage stacks have become so complex that 8k is no longer a big enough stack for a modern system Cheers, Dave. --

Re: 3.5.2: moving files from xfs/disk -> nfs: radix_tree_lookup_slot+0xe/0x10

2012-09-02 Thread Dave Chinner
[] iterate_supers+0xe1/0xf0 > > [75716.705798] [] sys_sync+0x2b/0x60 > > [75716.705802] [] system_call_fastpath+0x1a/0x1f > > [75836.701197] INFO: task sync:8790 blocked for more than 120 seconds. Which simply says that writeback of the dirty data at the time of the sync call has

Re: [PATCH v7 9/9] block: Avoid deadlocks with bio allocation by stacking drivers

2012-09-04 Thread Dave Chinner
orkqueue to avoid stack overflows, then the context switches are going to cause significant performance regressions for high IOPS workloads. I don't really like either situation. So while you are discussing stack issues, think a little about the bigger picture outside of the immediate

Re: [PATCH 1/3] Add ratelimited printk for different alert levels

2012-09-11 Thread Dave Chinner
aren't used. xfs_dbg does not exist - the function is xfs_debug(). The compiler won't catch that until the macro is used, so only add the macros which are needed for this patch series. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 2/3] XFS: Print error when xfs_ialloc_ag_select fails to find continuous free space.

2012-09-11 Thread Dave Chinner
etter at the caller site, not in the function itself. i.e. if we get a NULLAGNUMBER returned, the caller decided whether to emit an error message or not. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 3/3] XFS: Print error when unable to allocate inodes or out of free inodes.

2012-09-11 Thread Dave Chinner
} > + return 0; > } > } > > +out_spc: > + *inop = NULLFSINO; > + return ENOSPC; > out_alloc: > *IO_agbp = NULL; > return xfs_dialloc_ag(tp, agbp, parent, inop); Default behaviour on a loo

Re: [PATCH 1/3] Add ratelimited printk for different alert levels

2012-09-12 Thread Dave Chinner
't a massive increase in size as a result of this. If we do start to use ratelimiting in lots of places in XFS, then we might have to revisit this, but it's OK for now. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH] loop: Limit the number of requests in the bio list

2012-10-02 Thread Dave Chinner
On Tue, Oct 02, 2012 at 10:52:05AM +0200, Lukáš Czerner wrote: > On Mon, 1 Oct 2012, Jeff Moyer wrote: > > Date: Mon, 01 Oct 2012 12:52:19 -0400 > > From: Jeff Moyer > > To: Lukas Czerner > > Cc: Jens Axboe , linux-kernel@vger.kernel.org, > > Dave Chinn

Re: [RFC, PATCH] Extensible AIO interface

2012-10-02 Thread Dave Chinner
ix_fadvise() can't really specify everything we'd > want to for an SSD cache). Similar discussions about posix_fadvise() are being had for marking ranges of files as volatile (i.e. useful for determining what can be evicted from a cache when space reclaim is required). http

Re: [RFC, PATCH] Extensible AIO interface

2012-10-03 Thread Dave Chinner
On Tue, Oct 02, 2012 at 07:41:10PM -0700, Kent Overstreet wrote: > On Wed, Oct 03, 2012 at 11:28:25AM +1000, Dave Chinner wrote: > > On Tue, Oct 02, 2012 at 05:20:29PM -0700, Kent Overstreet wrote: > > > On Tue, Oct 02, 2012 at 01:41:17PM -0400, Jeff Moyer wrote: > > >

Re: [PATCH] mm: readahead: remove redundant ra_pages in file_ra_state

2012-10-23 Thread Dave Chinner
files are opened (e.g. via udev rules). Hence you need to explain why you need to change the default block device readahead on the fly, and why fadvise(POSIX_FADV_NORMAL) is "inappropriate" to set readahead windows to the new defaults. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 11/16] f2fs: add inode operations for special inodes

2012-10-23 Thread Dave Chinner
single indirect node (covering block pointers for 4 MB), plus 256 > separate block pointers (covering the last megabyte), and a 5 GB file > can be represented using 1 double-indirect node and 256 indirect nodes, > and each of them can still be followed by direct "tail" data and &

Re: [PATCH 1/2] brw_mutex: big read-write mutex

2012-10-23 Thread Dave Chinner
ix this race by using a read lock around I/O paths and write lock > around block size changing, but normal rw semaphore cause cache line > bouncing when taken for read by multiple processors and I/O performance > degradation because of it is measurable. This doesn't sound like a new

Re: [PATCH] mm: readahead: remove redundant ra_pages in file_ra_state

2012-10-24 Thread Dave Chinner
On Wed, Oct 24, 2012 at 07:53:59AM +0800, YingHang Zhu wrote: > Hi Dave, > On Wed, Oct 24, 2012 at 6:47 AM, Dave Chinner wrote: > > On Tue, Oct 23, 2012 at 08:46:51PM +0800, Ying Zhu wrote: > >> Hi, > >> Recently we ran into the bug that an opened file's ra_p

Re: [PATCH] mm: readahead: remove redundant ra_pages in file_ra_state

2012-10-24 Thread Dave Chinner
On Thu, Oct 25, 2012 at 08:17:05AM +0800, YingHang Zhu wrote: > On Thu, Oct 25, 2012 at 4:19 AM, Dave Chinner wrote: > > On Wed, Oct 24, 2012 at 07:53:59AM +0800, YingHang Zhu wrote: > >> Hi Dave, > >> On Wed, Oct 24, 2012 at 6:47 AM, Dave Chinner wrote: > >>

Re: [PATCH 1/2] brw_mutex: big read-write mutex

2012-10-25 Thread Dave Chinner
On Thu, Oct 25, 2012 at 10:09:31AM -0400, Mikulas Patocka wrote: > > > On Wed, 24 Oct 2012, Dave Chinner wrote: > > > On Fri, Oct 19, 2012 at 06:54:41PM -0400, Mikulas Patocka wrote: > > > > > > > > > On Fri, 19 Oct 2012, Peter Zijlstra wrote: >

Re: [PATCH] mm: readahead: remove redundant ra_pages in file_ra_state

2012-10-25 Thread Dave Chinner
ages here as it is not a set-and-forget value. e.g. shrink_readahead_size_eio() can reduce ra_pages as a result of IO errors. Hence if you have had io errors, telling the kernel that you are now going to do sequential IO should reset the readahead to the maximum ra_pages value supported Cheers,

Re: Hang in XFS reclaim on 3.7.0-rc3

2012-11-01 Thread Dave Chinner
On Thu, Nov 01, 2012 at 04:30:10PM -0500, Ben Myers wrote: > Hi Dave, > > On Tue, Oct 30, 2012 at 09:26:13AM +1100, Dave Chinner wrote: > > On Mon, Oct 29, 2012 at 09:03:15PM +0100, Torsten Kaiser wrote: > > > After experiencing a hang of all IO yesterday ( > > >

Re: VFS hot tracking: How to calculate data temperature?

2012-11-05 Thread Dave Chinner
equirements before discussing how or what to implement. Indeed, discussion should really focus on getting the core, in-memory infrastructure sorted out first before trying to expand the functionality further... Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: Problem with DISCARD and RAID5

2012-11-05 Thread Dave Chinner
of megabytes > If bio isn't discard > aligned, what device will do? Up to the device. > Further, why driver handles alignment/granularity > if device will ignore misaligned request. When you send a series of sequential unaligned requests, the device may ignore them all. Hence you en

Re: [PATCH 0/8] Set bi_rw when alloc bio before call bio_add_page.

2012-07-30 Thread Dave Chinner
he optimisation. I can't evaluate the merit of this change without data telling me it is worthwhile, and it's a lot of code to churn for no benefit Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH 1/1] xfs: check for possible overflow in xfs_ioc_trim

2012-07-30 Thread Dave Chinner
XFS_FSB_TO_B(mp, mp->m_sb.sb_dblocks)) return -XFS_ERROR(EINVAL); > start = BTOBB(range.start); > end = start + BTOBBT(range.len) - 1; > minlen = BTOBB(max_t(u64, granularity, range.minlen)); And that will prevent the overflow in BTOBB() just as effectively... Che
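Why a range check ahead of the conversion prevents the overflow can be seen in a small userspace model. The sketch assumes 512-byte basic blocks and a round-up conversion of the form `(bytes + BBSIZE - 1) >> BBSHIFT`; `btobb()` and `trim_start_to_bb()` are hypothetical helpers for this example, not the XFS macros or the patched function.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define BBSHIFT 9              /* assumption: 512-byte basic blocks */
#define BBSIZE  (1u << BBSHIFT)

/* Round a byte count up to basic blocks. Wraps for bytes near UINT64_MAX. */
static uint64_t btobb(uint64_t bytes)
{
	return (bytes + BBSIZE - 1) >> BBSHIFT;
}

/*
 * Reject a start offset at or beyond the end of the filesystem before
 * converting it, so the rounding addition is only ever applied to values
 * bounded by the filesystem size (assumed well below 2^64 - 512 bytes).
 */
static int trim_start_to_bb(uint64_t start_bytes, uint64_t fs_bytes,
			    uint64_t *start_bb)
{
	if (start_bytes >= fs_bytes)
		return -EINVAL;
	*start_bb = btobb(start_bytes);
	return 0;
}
```

Without the check, a start of `UINT64_MAX` wraps inside the rounding addition and converts to block 0, silently turning an out-of-range request into a valid-looking one.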

Re: Re: [PATCH 0/8] Set bi_rw when alloc bio before call bio_add_page.

2012-07-30 Thread Dave Chinner
On Tue, Jul 31, 2012 at 08:55:59AM +0800, majianpeng wrote: > On 2012-07-31 05:42 Dave Chinner Wrote: > >On Mon, Jul 30, 2012 at 03:14:28PM +0800, majianpeng wrote: > >> When exec bio_alloc, the bi_rw is zero.But after calling bio_add_page, > >> it will use bi_rw. >

Re: [RFC v4 Patch 0/4] fs/inode.c: optimization for inode lock usage

2012-09-21 Thread Dave Chinner
eating a new one. So I don't think this is a good idea at all... Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [RFC v4 Patch 0/4] fs/inode.c: optimization for inode lock usage

2012-09-23 Thread Dave Chinner
On Mon, Sep 24, 2012 at 10:42:21AM +0800, Guo Chao wrote: > On Sat, Sep 22, 2012 at 08:49:12AM +1000, Dave Chinner wrote: > > > On Fri, Sep 21, 2012 at 05:31:02PM +0800, Guo Chao wrote: > > > This patchset optimizes several places which take the per inode spin lock. > >

Re: [RFC v4 Patch 0/4] fs/inode.c: optimization for inode lock usage

2012-09-23 Thread Dave Chinner
On Mon, Sep 24, 2012 at 02:12:05PM +0800, Guo Chao wrote: > On Mon, Sep 24, 2012 at 02:23:43PM +1000, Dave Chinner wrote: > > On Mon, Sep 24, 2012 at 10:42:21AM +0800, Guo Chao wrote: > > > On Sat, Sep 22, 2012 at 08:49:12AM +1000, Dave Chinner wrote: > > > > >

Re: [RFC v4 Patch 0/4] fs/inode.c: optimization for inode lock usage

2012-09-24 Thread Dave Chinner
On Mon, Sep 24, 2012 at 03:08:52PM +0800, Guo Chao wrote: > On Mon, Sep 24, 2012 at 04:28:12PM +1000, Dave Chinner wrote: > > > Ah, this is intended to be a code clean patchset actually. I thought these > > > locks are redundant in an obvious and trivial manner. If, o

Re: [PATCH v3 1/2] writeback: add dirty_background_centisecs per bdi variable

2012-09-24 Thread Dave Chinner
asure enterprise level NFS servers. 5. Are the improvements consistent across different filesystem types? We've had writeback changes in the past cause improvements on one filesystem but significant regressions on others. I'd suggest that you need to present results for ext4, XFS and btrfs so that we have a decent idea of what we can expect from the change to the generic code. Yeah, I'm asking a lot of questions. That's because the generic writeback code is extremely important to performance and the impact of a change cannot be evaluated from a single test. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [RFC v2 01/10] vfs: introduce private rb structures

2012-09-25 Thread Dave Chinner
em associated with this hot_range_item */ > + struct hot_inode_item *hot_inode; > + /* starting offset of this range */ > + u64 start; > + /* length of this range */ > + u64 len; What units? u64 start; /* start offset in bytes */ u64 len /* l

Re: [RFC v2 02/10] vfs: add support for updating access frequency

2012-09-25 Thread Dave Chinner
all the abstraction down to something much simpler, say: int hot_inode_update_freqs() { he = hot_inode_item_find(tree, inum, null) if (!he) { new_he = allocate() if (!new_he) return -ENOMEM;

Re: [RFC v2 03/10] vfs: add one new mount option '-o hottrack'

2012-09-25 Thread Dave Chinner
tracking hot inodes or not. This then means the hot inode tracking for the superblock can be initialised by the filesystem as part of its fill_super method, along with the filesystem specific code that will use the hot tracking information the VFS gathers. Cheers, Dave. -- Dave Chinner

Re: [RFC v4 Patch 0/4] fs/inode.c: optimization for inode lock usage

2012-09-25 Thread Dave Chinner
On Tue, Sep 25, 2012 at 04:59:55PM +0800, Guo Chao wrote: > On Mon, Sep 24, 2012 at 06:26:54PM +1000, Dave Chinner wrote: > > @@ -783,14 +783,19 @@ static void __wait_on_freeing_inode(struct inode > > *inode); > > static struct inode *find_inode(s

Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-28 Thread Dave Chinner
esting hasn't been taught > what it is, so strerror(EHWPOISON) returns "Unknown error 133". We > could simply allow open(2) and stat(2) return this error, although I > wonder if we're just better off defining a new error code. If we are going to add special new "file co

Re: [RFC v4 03/15] vfs,hot_track: add the function for collecting I/O frequency

2012-10-28 Thread Dave Chinner
at is a win. But . > >> +EXPORT_SYMBOL_GPL(hot_update_freqs); ... it's an exported function, so it can't be inline or static, so using "inline" is wrong whatever way you look at it. ;) Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: Hang in XFS reclaim on 3.7.0-rc3

2012-10-29 Thread Dave Chinner
lockdep output that implicate GFP_KERNEL allocations from vm_map_ram in GFP_NOFS conditions as the potential cause.... Cheers, Dave. -- Dave Chinner da...@fromorbit.com xfs: don't vmap inode cluster buffers during free From: Dave Chinner Signed-off-by: Dave Chinner --- fs/xfs/xfs_inode.c

Re: Hang in XFS reclaim on 3.7.0-rc3

2012-10-29 Thread Dave Chinner
[add the linux-mm cc I forgot to add before sending] On Tue, Oct 30, 2012 at 09:26:13AM +1100, Dave Chinner wrote: > On Mon, Oct 29, 2012 at 09:03:15PM +0100, Torsten Kaiser wrote: > > After experiencing a hang of all IO yesterday ( > > http://marc.info/?l=linux-kernel&m=13514

Re: [PATCH 2/3] ext4: introduce ext4_error_remove_page

2012-10-30 Thread Dave Chinner
data. Hence *anything* that the kernel wants to store on permanent storage should be using xattrs because then the application has complete control of what is stored without caring about what filesystem it is storing it on. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [PATCH] Avoid useless inodes and dentries reclamation

2013-08-29 Thread Dave Chinner
s) then such a threshold might only be appropriate for caches that are not memcg controlled. In that case, handling it in the shrinker infrastructure itself is a much better idea than hacking thresholds into individual shrinker callouts. Cheers, Dave. -- Dave Chinner da...@fromorbit.com

Re: [dm-devel] Reworking dm-writeboost [was: Re: staging: Add dm-writeboost]

2013-10-03 Thread Dave Chinner
optimisations should be done at the point where they are issued - any attempt to further optimise them by adding delays down in the stack to aggregate FUA operations will only increase latency of the operations that the issuer wants to have complete as fast as possible. Cheers, Dave. -- Dave Ch

Re: fs/attr.c:notify_change locking warning.

2013-10-04 Thread Dave Chinner
>setattr(). If it really matters, I'll just open code file_remove_suid() into XFS like ocfs2 does just so we don't get that warning being emitted by trinity. FWIW, buffered IO on XFS - the normal case for most operations - holds the i_mutex over the call to file_remove_

Re: linux-next: slab shrinkers: BUG at mm/list_lru.c:92

2013-07-09 Thread Dave Chinner
On Mon, Jul 08, 2013 at 02:53:52PM +0200, Michal Hocko wrote: > On Thu 04-07-13 18:36:43, Michal Hocko wrote: > > On Wed 03-07-13 21:24:03, Dave Chinner wrote: > > > On Tue, Jul 02, 2013 at 02:44:27PM +0200, Michal Hocko wrote: > > > > On Tue 02-07-1

Re: [PATCH 10/13] xfs: use get_unused_fd_flags(0) instead of get_unused_fd()

2013-07-10 Thread Dave Chinner
4 > > So there's no many *known* users of this features ... but it's more > important > not to break *unknown* users of it. There are commercial products (i.e. proprietary, closed source) that use it. SGI has one (DMF) and there are a couple of other backup program

Re: [PATCH] fs: sync: fixed performance regression

2013-07-10 Thread Dave Chinner
files and then the next power of > two up to 262144 files. > > Note, when running the test, the slow down doesn't always happen > but most of the tests will show a slow down. Can you please check if the patch attached to this mail: http://marc.info/?l=linux-kernel&m=137276874
