Re: [PATCH 09/19] list_lru: per-node list infrastructure

2013-01-17 Thread Dave Chinner
them appropriately according to some handwave criteria. Then all the generic LRU code cares about is that the memcg lookup returns the correct struct lru_list for it to operate on... Cheers, Dave. -- Dave Chinner da...@fromorbit.com -- To unsubscribe from this list: send the line unsubscribe

Re: [PATCH 09/19] list_lru: per-node list infrastructure

2013-01-18 Thread Dave Chinner
On Thu, Jan 17, 2013 at 04:51:03PM -0800, Glauber Costa wrote: On 01/17/2013 04:10 PM, Dave Chinner wrote: and we end up with: lru_add(struct lru_list *lru, struct lru_item *item) { node_id = min(object_to_nid(item), lru->numnodes); __lru_add(lru, node_id, item

Re: [PATCH 09/19] list_lru: per-node list infrastructure

2013-01-18 Thread Dave Chinner
On Thu, Jan 17, 2013 at 04:14:10PM -0800, Glauber Costa wrote: On 01/17/2013 04:10 PM, Dave Chinner wrote: And then each object uses: struct lru_item { struct list_head global_list; struct list_head memcg_list; } by objects you mean dentries, inodes, and the such, right

Re: [PATCH 09/19] list_lru: per-node list infrastructure

2013-01-18 Thread Dave Chinner
On Fri, Jan 18, 2013 at 11:10:00AM -0800, Glauber Costa wrote: On 01/18/2013 12:11 AM, Dave Chinner wrote: On Thu, Jan 17, 2013 at 04:14:10PM -0800, Glauber Costa wrote: On 01/17/2013 04:10 PM, Dave Chinner wrote: And then each object uses: struct lru_item { struct list_head

Re: [ 68/89] xfs: fix _xfs_buf_find oops on blocks beyond the filesystem end

2013-02-13 Thread Dave Chinner
. -- From: Dave Chinner dchin...@redhat.com commit eb178619f930fa2ba2348de332a1ff1c66a31424 upstream. When _xfs_buf_find is passed an out of range address, it will fail to find a relevant struct xfs_perag and oops with a null dereference. This can happen

Re: [PATCH RFC 10/12] userns: Convert xfs to use kuid/kgid/kprojid where appropriate

2013-02-13 Thread Dave Chinner
On Wed, Feb 13, 2013 at 10:13:16AM -0800, Eric W. Biederman wrote: Joel Becker jl...@evilplan.org writes: On Wed, Nov 21, 2012 at 10:55:24AM +1100, Dave Chinner wrote: diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c index 2778258..3656b88 100644 --- a/fs/xfs/xfs_inode.c +++ b

Re: [ 68/89] xfs: fix _xfs_buf_find oops on blocks beyond the filesystem end

2013-02-14 Thread Dave Chinner
. Sounds like a fine idea, Greg. Here are the usual suspects: Ben Myers b...@sgi.com Mark Tinguely tingu...@sgi.com Dave Chinner dchin...@redhat.com Eric Sandeen sand...@redhat.com I don't think it should be restricted to individuals. The private thread used to request this backport

Re: [ 01/10] Revert: xfs: fix _xfs_buf_find oops on blocks beyond the filesystem end

2013-02-16 Thread Dave Chinner
. It has been reported to cause problems: http://bugzilla.redhat.com/show_bug.cgi?id=909602 Acked-by: Ben Myers b...@sgi.com Cc: Dave Chinner dchin...@redhat.com Cc: Brian Foster bfos...@redhat.com Cc: CAI Qian caiq...@redhat.com Cc: Paolo Bonzini pbonz...@redhat.com Signed-off

Re: [PATCH review 02/16] xfs: Store projectid as a single variable.

2013-02-18 Thread Dave Chinner
...@sgi.com Cc: Alex Elder el...@kernel.org Cc: Dave Chinner da...@fromorbit.com Signed-off-by: Eric W. Biederman ebied...@xmission.com --- fs/xfs/xfs_icache.c |2 +- fs/xfs/xfs_inode.c|6 +- fs/xfs/xfs_inode.h|7 ++- fs/xfs/xfs_ioctl.c|6 +++--- fs

Re: [PATCH review 03/16] xfs: Always read uids and gids from the vfs inode

2013-02-18 Thread Dave Chinner
->di_uid; - buf->bs_gid = dic->di_gid; + buf->bs_uid = VFS_I(ip)->i_uid; + buf->bs_gid = VFS_I(ip)->i_gid; Same as the project ID changes - bulkstat is supposed to return the raw on disk values, not namespace munged values. Cheers, Dave.

Re: [PATCH review 08/16] xfs: Use kprojids when allocating inodes.

2013-02-18 Thread Dave Chinner
at all as it has nothing at all to do with the namespaces. Please drop this patch or replace it with a simple patch that passes the project ID as an xfs_dqid_t (i.e. a flat, 32 bit quota identifier) instead so you can kill the prid_t type. Cheers, Dave.

Re: [PATCH review 07/16] xfs: Update ioctl(XFS_IOC_FREE_EOFBLOCKS) to handle callers in any userspace

2013-02-18 Thread Dave Chinner
all as xfs_dqid_t and convert them in place to the type that is compatible with the XFS core use of these fields (i.e. comparing them with the on-disk inode uid/gid/prid values). Cheers, Dave.

Re: [PATCH review 05/16] xfs: Update xfs_ioctl_setattr to handle projids in any user namespace

2013-02-18 Thread Dave Chinner
user namespace. - Replace uses of fa-fsx_projid with projid throughout xfs_ioctl_setattr. Cc: Ben Myers b...@sgi.com Cc: Alex Elder el...@kernel.org Cc: Dave Chinner da...@fromorbit.com Signed-off-by: Eric W. Biederman ebied...@xmission.com --- fs/xfs/xfs_ioctl.c | 26

Re: [PATCH review 09/16] xfs: Modify xfs_qm_vop_dqalloc to take kuids, kgids, and kprojids.

2013-02-18 Thread Dave Chinner
On Sun, Feb 17, 2013 at 05:11:02PM -0800, Eric W. Biederman wrote: From: Eric W. Biederman ebied...@xmission.com Cc: Ben Myers b...@sgi.com Cc: Alex Elder el...@kernel.org Cc: Dave Chinner da...@fromorbit.com Signed-off-by: Eric W. Biederman ebied...@xmission.com --- fs/xfs/xfs_qm.c

Re: [PATCH review 10/16] xfs: Push struct kqid into xfs_qm_scall_qmlim and xfs_qm_scall_getquota

2013-02-18 Thread Dave Chinner
I'd say just do the absolute minimum needed for the is_superquota() checks to work and leave all the kqid -> xfs_dqid_t+type conversion at the boundary of the quota subsystem where it already is Cheers, Dave.

Re: [PATCH RFC 10/12] userns: Convert xfs to use kuid/kgid/kprojid where appropriate

2013-02-18 Thread Dave Chinner
On Sun, Feb 17, 2013 at 05:25:43PM -0800, Eric W. Biederman wrote: Dave Chinner da...@fromorbit.com writes: On Wed, Feb 13, 2013 at 10:13:16AM -0800, Eric W. Biederman wrote: The crazy thing is that xfs appears to directly write their incore inode structure into their journal

Re: [PATCH 0/4] dcache: make Oracle more scalable on large systems

2013-02-21 Thread Dave Chinner
: Nobody should be doing reverse dentry-to-name lookups in a quantity sufficient for it to become a performance limiting factor. What is the Oracle DB actually using this path for? Cheers, Dave.

Re: [PATCH 0/4] dcache: make Oracle more scalable on large systems

2013-02-22 Thread Dave Chinner
at what the kernel is doing via diagnostic interfaces so often that it gets in the way of the kernel actually doing stuff is not a problem the kernel can solve. Cheers, Dave.

Re: [PATCH] xfs: Fix possible truncation of log data in xlog_bread_noalign()

2013-02-22 Thread Dave Chinner
=f14e527f411712f89178c31370b5d733ea1d0280 FWIW, I think your change might need work - there's the possibility that it can round up the length beyond the end of the log if we ask to read up to the last sector of the log (i.e. blkno + blklen == end of log) and then round up blklen by one sector Cheers, Dave.

Re: [PATCH] xfs: Fix possible truncation of log data in xlog_bread_noalign()

2013-02-23 Thread Dave Chinner
On Sat, Feb 23, 2013 at 07:06:10AM +, Tony Lu wrote: From: Dave Chinner [mailto:da...@fromorbit.com] On Fri, Feb 22, 2013 at 08:12:52AM +, Tony Lu wrote: I encountered the following panic when using xfs partitions as rootfs, which is due to the truncated log data read

Re: Debugging system freezes on filesystem writes

2013-02-23 Thread Dave Chinner
,size=102400k,mode=755 0 0 /dev/sda6 /home ext4 rw,noatime,discard 0 0 ^^^ I'd say that's your problem Cheers, Dave.

Re: [PATCH] xfs: Fix possible truncation of log data in xlog_bread_noalign()

2013-02-24 Thread Dave Chinner
and writes. Yes, it does handle it, but that doesn't mean that it is correct to pass unaligned block ranges to it. Cheers, Dave.

Re: [RFC v1 00/11] vfs: hot data tracking

2012-09-17 Thread Dave Chinner
comment on the code when I get a bit of time to look at it. Cheers, Dave.

Re: [RFC v1 00/11] vfs: hot data tracking

2012-09-18 Thread Dave Chinner
On Tue, Sep 18, 2012 at 10:24:55AM +0800, Zhi Yong Wu wrote: On Tue, Sep 18, 2012 at 5:30 AM, Dave Chinner da...@fromorbit.com wrote: On Mon, Sep 17, 2012 at 03:18:34PM +0800, zwu.ker...@gmail.com wrote: 20 files changed, 2275 insertions(+), 1 deletions(-) create mode 100644 fs

Re: [RFC v4 Patch 0/4] fs/inode.c: optimization for inode lock usage

2012-09-21 Thread Dave Chinner
a new one. So I don't think this is a good idea at all... Cheers, Dave.

Re: [PATCH 1/3] Add ratelimited printk for different alert levels

2012-09-11 Thread Dave Chinner
. Cheers, Dave.

Re: [PATCH 2/3] XFS: Print error when xfs_ialloc_ag_select fails to find continuous free space.

2012-09-11 Thread Dave Chinner
a NULLAGNUMBER returned, the caller decided whether to emit an error message or not. Cheers, Dave.

Re: [PATCH 3/3] XFS: Print error when unable to allocate inodes or out of free inodes.

2012-09-11 Thread Dave Chinner
on a loop break is to allocate inodes, not return ENOSPC. BTW, there's no need to cc LKML for XFS specific patches. LKML is noisy enough as it is without unnecessary cross-posts Cheers, Dave.

Re: [PATCH 1/3] Add ratelimited printk for different alert levels

2012-09-12 Thread Dave Chinner
for the current usage - ratelimiting is not widespread so there isn't a massive increase in size as a result of this. If we do start to use ratelimiting in lots of places in XFS, then we might have to revisit this, but it's OK for now. Cheers, Dave.

Re: [PATCH 0/8] Set bi_rw when alloc bio before call bio_add_page.

2012-07-30 Thread Dave Chinner
data telling me it is worthwhile, and it's a lot of code to churn for no benefit Cheers, Dave.

Re: [PATCH 1/1] xfs: check for possible overflow in xfs_ioc_trim

2012-07-30 Thread Dave Chinner
= BTOBB(range.start); end = start + BTOBBT(range.len) - 1; minlen = BTOBB(max_t(u64, granularity, range.minlen)); And that will prevent the overflow in BTOBB() just as effectively... Cheers, Dave.
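A minimal sketch of why the truncating conversion avoids the overflow, assuming the usual XFS basic-block geometry (512-byte basic blocks, BBSHIFT of 9). The lowercase helpers btobb() and btobbt() are illustrative stand-ins written for this note, not the kernel macros themselves, and the surrounding xfs_ioc_trim() code is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed basic-block geometry: 512-byte basic blocks. */
#define BBSHIFT 9
#define BBSIZE  (1ULL << BBSHIFT)

/* Round a byte count UP to basic blocks. The addition wraps around
 * when bytes is within BBSIZE - 1 of UINT64_MAX. */
static uint64_t btobb(uint64_t bytes)
{
    return (bytes + BBSIZE - 1) >> BBSHIFT;
}

/* Truncate a byte count DOWN to basic blocks. A plain shift cannot wrap. */
static uint64_t btobbt(uint64_t bytes)
{
    return bytes >> BBSHIFT;
}

/* Illustrative check of the two behaviours at the top of the range. */
static void self_check(void)
{
    /* Rounding up a huge length wraps and yields a tiny block count. */
    assert(btobb(UINT64_MAX) == 0);
    /* Truncating the same length gives the expected large block count. */
    assert(btobbt(UINT64_MAX) == (UINT64_MAX >> BBSHIFT));
}
```

With a length near the top of the 64-bit range, the rounding form silently collapses to a very small block count, while the truncating form used for range.len in the suggestion above stays well defined.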

Re: Re: [PATCH 0/8] Set bi_rw when alloc bio before call bio_add_page.

2012-07-30 Thread Dave Chinner
On Tue, Jul 31, 2012 at 08:55:59AM +0800, majianpeng wrote: On 2012-07-31 05:42 Dave Chinner da...@fromorbit.com Wrote: On Mon, Jul 30, 2012 at 03:14:28PM +0800, majianpeng wrote: When exec bio_alloc, the bi_rw is zero. But after calling bio_add_page, it will use bi_rw. For example

Re: [PATCH] userns: Add basic quota support v4

2012-08-28 Thread Dave Chinner
for someone to tread on Cheers, Dave.

Re: [PATCH 3/3] HWPOISON: prevent inode cache removal to keep AS_HWPOISON sticky

2012-08-28 Thread Dave Chinner
On Mon, Aug 27, 2012 at 06:05:06PM -0400, Naoya Horiguchi wrote: On Mon, Aug 27, 2012 at 08:26:07AM +1000, Dave Chinner wrote: On Fri, Aug 24, 2012 at 01:24:16PM -0400, Naoya Horiguchi wrote: Let me explain more to clarify my whole scenario. If a memory error hits on a dirty pagecache

Re: [PATCH] userns: Add basic quota support v4

2012-08-30 Thread Dave Chinner
On Wed, Aug 29, 2012 at 02:31:26AM -0700, Eric W. Biederman wrote: Dave thanks for taking the time to take a detailed look at this code. Dave Chinner da...@fromorbit.com writes: On Tue, Aug 28, 2012 at 12:09:56PM -0700, Eric W. Biederman wrote: Add the data type struct kqid which

Re: [PATCH 3/3] HWPOISON: prevent inode cache removal to keep AS_HWPOISON sticky

2012-09-02 Thread Dave Chinner
On Wed, Aug 29, 2012 at 02:32:04PM +0900, Jun'ichi Nomura wrote: On 08/29/12 11:59, Dave Chinner wrote: On Mon, Aug 27, 2012 at 06:05:06PM -0400, Naoya Horiguchi wrote: And yes, I understand it's ideal, but many applications choose not to do that for performance reason. So I think it's

Re: [PATCH v7 9/9] block: Avoid deadlocks with bio allocation by stacking drivers

2012-09-02 Thread Dave Chinner
an XFS bug), or we can face the reality that storage stacks have become so complex that 8k is no longer a big enough stack for a modern system Cheers, Dave.

Re: 3.5.2: moving files from xfs/disk -> nfs: radix_tree_lookup_slot+0xe/0x10

2012-09-02 Thread Dave Chinner
call has taken longer than 120s. Cheers, Dave.

Re: [PATCH v7 9/9] block: Avoid deadlocks with bio allocation by stacking drivers

2012-09-04 Thread Dave Chinner
issue at hand - a better solution for everyone might pop up Cheers, Dave.

Re: [PATCH 3/3] HWPOISON: prevent inode cache removal to keep AS_HWPOISON sticky

2012-08-23 Thread Dave Chinner
that is poisoned, and we truncate that away? Shouldn't that clear the poisoned bit? What about a hole punch over the poisoned range? Cheers, Dave.

Re: [PATCH 3/3] HWPOISON: prevent inode cache removal to keep AS_HWPOISON sticky

2012-08-23 Thread Dave Chinner
On Thu, Aug 23, 2012 at 10:39:32PM -0400, Naoya Horiguchi wrote: On Fri, Aug 24, 2012 at 11:31:18AM +1000, Dave Chinner wrote: On Wed, Aug 22, 2012 at 11:17:35AM -0400, Naoya Horiguchi wrote: HWPOISON: report sticky EIO for poisoned file still has a corner case where we have possibilities

Re: [PATCH 3/3] HWPOISON: prevent inode cache removal to keep AS_HWPOISON sticky

2012-08-26 Thread Dave Chinner
On Fri, Aug 24, 2012 at 01:24:16PM -0400, Naoya Horiguchi wrote: On Fri, Aug 24, 2012 at 02:39:17PM +1000, Dave Chinner wrote: On Thu, Aug 23, 2012 at 10:39:32PM -0400, Naoya Horiguchi wrote: On Fri, Aug 24, 2012 at 11:31:18AM +1000, Dave Chinner wrote: On Wed, Aug 22, 2012 at 11:17:35AM

Re: [PATCH 3/3] writeback: add dirty_ratio_time per bdi variable (NFS write performance)

2012-08-19 Thread Dave Chinner
. So the dirty_background_time implementation based on it will not always work to the user expectations. One important case is, some users (eg. Dave Chinner) explicitly take advantage of the existing behavior to quickly create and delete a big 1GB temp file without worrying about triggering

Re: [PATCH 1/3] tmpfs: revert SEEK_DATA and SEEK_HOLE

2012-07-11 Thread Dave Chinner
/data is still shiny new and lots of developers aren't even aware of its presence in recent kernels. Removing new functionality saying no-one is using it is like smashing the egg before the chicken hatches (or is it cutting off the chicken's head before it lays the egg?). Cheers, Dave.

Re: [RFC][PATCH] Make io_submit non-blocking

2012-07-24 Thread Dave Chinner
results for other filesystems as well (xfs, btrfs are typical), as they may not have the same problems as ext4 or react the same way to your change. The result might simply be it is 20% slower Cheers, Dave.

Re: [PATCH, 3.7-rc7, RESEND] fs: revert commit bbdd6808 to fallocate UAPI

2012-12-05 Thread Dave Chinner
the go-ahead to push random unreviewed syscall changes through subsystem trees after holding private discussions between a handful of developers like has happened here. Cheers. Dave.

Re: [PATCH, 3.7-rc7, RESEND] fs: revert commit bbdd6808 to fallocate UAPI

2012-12-05 Thread Dave Chinner
exactly the same thing Cheers, Dave.

Re: PATCH reduce impact of FIFREEZE on userland processes

2012-12-06 Thread Dave Chinner
. If you are really concerned by minimising the amount of time it takes to freeze, then syncfs; fsfreeze -f; fsfreeze -u will get you exactly the same result as your patch, without having any bad side effects for other users Cheers, Dave.

Re: [PATCH] Update atime from future.

2012-12-06 Thread Dave Chinner

Re: [PATCH, 3.7-rc7, RESEND] fs: revert commit bbdd6808 to fallocate UAPI

2012-12-06 Thread Dave Chinner
the chance of collisions even if it isn't formally reserved.) struct ext4_ioc_falloc { ... }; /* security hole reserved for out-of-tree patches. */ #define EXT4_IOC_FALLOC_NOHIDE _IOW('f', 1, struct ext4_ioc_falloc) Done. Not so hard, is it? Cheers, Dave.

Re: [PATCH, 3.7-rc7, RESEND] fs: revert commit bbdd6808 to fallocate UAPI

2012-12-06 Thread Dave Chinner
merits, then it shouldn't be made. Sending your patch through a back door because you don't think you can defend it as you pass through the front door is simply *not acceptable*. Cheers, Dave.

Re: [PATCH, 3.7-rc7, RESEND] fs: revert commit bbdd6808 to fallocate UAPI

2012-12-06 Thread Dave Chinner
it hasn't even been asked Cheers, Dave.

Re: [PATCH, 3.7-rc7, RESEND] fs: revert commit bbdd6808 to fallocate UAPI

2012-12-07 Thread Dave Chinner
, and we can't remove. There's many, many good reasons why a revert is the only sane thing to do at this point Cheers, Dave.

Re: [PATCH, 3.7-rc7, RESEND] fs: revert commit bbdd6808 to fallocate UAPI

2012-12-07 Thread Dave Chinner
writing zeros (i.e. FALLOC_FL_WRITE_ZEROS) to allocating unwritten extents as there are workloads where one will always be clearly better than the other... Cheers, Dave.

Re: [PATCH, 3.7-rc7, RESEND] fs: revert commit bbdd6808 to fallocate UAPI

2012-12-07 Thread Dave Chinner
, but each metadata block might have gone through a million changes in memory since the last time it was written. Indeed, in that 30s, there would have been a few million random data writes so the metadata writes are well and truly lost in the noise... Cheers, Dave.

Re: PATCH reduce impact of FIFREEZE on userland processes

2012-12-07 Thread Dave Chinner
On Fri, Dec 07, 2012 at 08:59:52AM +, Alun wrote: Dave Chinner da...@fromorbit.com said, in message 20121207004255.GC27172@dastard: The problem with doing this is that the sync can delay the freeze process by quite some time under the exact conditions you describe. If you want freeze

Re: [PATCH, 3.7-rc7, RESEND] fs: revert commit bbdd6808 to fallocate UAPI

2012-12-07 Thread Dave Chinner
On Fri, Dec 07, 2012 at 06:52:51PM -0800, Joel Becker wrote: On Sat, Dec 08, 2012 at 11:39:36AM +1100, Dave Chinner wrote: On Fri, Dec 07, 2012 at 05:02:32PM -0500, Ric Wheeler wrote: On 12/07/2012 04:57 PM, Theodore Ts'o wrote: On Fri, Dec 07, 2012 at 04:42:06PM -0500, Ric Wheeler wrote

Re: PATCH reduce impact of FIFREEZE on userland processes

2012-12-09 Thread Dave Chinner
On Sat, Dec 08, 2012 at 07:12:04AM -0500, Christoph Hellwig wrote: On Fri, Dec 07, 2012 at 11:42:55AM +1100, Dave Chinner wrote: The problem with doing this is that the sync can delay the freeze process by quite some time under the exact conditions you describe. If you want freeze to take

Re: PATCH reduce impact of FIFREEZE on userland processes

2012-12-09 Thread Dave Chinner
On Sat, Dec 08, 2012 at 08:47:34AM +, Alun wrote: On Sat, 8 Dec 2012 12:20:29 +1100 Dave Chinner da...@fromorbit.com wrote: First off, thanks for the examples. I'll answer your one question and then I'll shut up! I'll try and chase this up by submitting patches to lvcreate

Re: [PATCH, 3.7-rc7, RESEND] fs: revert commit bbdd6808 to fallocate UAPI

2012-12-10 Thread Dave Chinner
On Mon, Dec 10, 2012 at 12:37:39PM -0500, Theodore Ts'o wrote: On Sat, Dec 08, 2012 at 11:17:05AM +1100, Dave Chinner wrote: I wouldn't recommend XFS_IOC_ALLOCSP as a user-friendly interface. The concept, however, implemented by a new fallocate() flag (say FALLOC_FL_WRITE_ZEROS) so

Re: Hang in XFS reclaim on 3.7.0-rc3

2012-11-19 Thread Dave Chinner
On Mon, Nov 19, 2012 at 07:50:06AM +0100, Torsten Kaiser wrote: On Mon, Nov 19, 2012 at 12:51 AM, Dave Chinner da...@fromorbit.com wrote: On Sun, Nov 18, 2012 at 04:29:22PM +0100, Torsten Kaiser wrote: On Sun, Nov 18, 2012 at 11:24 AM, Torsten Kaiser just.for.l...@googlemail.com wrote

Re: [PATCH 3/9] xfs: factor out everything but the filemap_write_and_wait from xfs_file_fsync

2012-11-20 Thread Dave Chinner
/xfs_file.c is. Cheers, Dave.

Re: [PATCH 4/9] xfs: honor the O_SYNC flag for aysnchronous direct I/O requests

2012-11-20 Thread Dave Chinner
workqueue_struct *m_cil_workqueue; + struct workqueue_struct *m_aio_blkdev_flush_wq; struct workqueue_struct *m_aio_fsync_wq; Cheers, Dave.

Re: [PATCH 4/9] xfs: honor the O_SYNC flag for aysnchronous direct I/O requests

2012-11-20 Thread Dave Chinner
On Tue, Nov 20, 2012 at 02:42:48PM -0500, Jeff Moyer wrote: Dave Chinner da...@fromorbit.com writes: And requeuing work from one workqueue to the next is something that we can avoid. We know at IO submission time (i.e. xfs_vm_direct_io) whether an fsync completion is going to be needed

Re: Hang in XFS reclaim on 3.7.0-rc3

2012-11-20 Thread Dave Chinner
On Tue, Nov 20, 2012 at 08:45:03PM +0100, Torsten Kaiser wrote: On Tue, Nov 20, 2012 at 12:53 AM, Dave Chinner da...@fromorbit.com wrote: [8108137e] mark_held_locks+0x7e/0x130 [81081a63] lockdep_trace_alloc+0x63/0xc0 [810e9dd5] kmem_cache_alloc

Re: The bug of iput() removal from flusher thread?

2012-11-20 Thread Dave Chinner
) } EXPORT_SYMBOL(ihold); -static void inode_lru_list_add(struct inode *inode) +void inode_lru_list_add(struct inode *inode) the inode lru list function can stay static. Cheers, Dave.

Re: [PATCH] sendfile: allows bypassing of notifier events

2012-11-20 Thread Dave Chinner
reports are good references, but they don't replace a properly written commit message. Cheers, Dave.

Re: [PATCH RFC 10/12] userns: Convert xfs to use kuid/kgid/kprojid where appropriate

2012-11-20 Thread Dave Chinner
: Alex Elder el...@kernel.org Cc: Dave Chinner da...@fromorbit.com Signed-off-by: Eric W. Biederman ebied...@xmission.com --- . @@ -614,12 +627,12 @@ int xfs_qm_dqget( xfs_mount_t *mp, xfs_inode_t *ip, /* locked inode (optional) */ - xfs_dqid_t id

Re: [PATCH RFC 0/12] Final userns conversions

2012-11-20 Thread Dave Chinner
of validation that is required. Also, it is likely that there are significant conflicts against changes already staged for 3.8, so getting it upstream through the XFS tree is the only option that I'd consider acceptable... Cheers, Dave.

Re: The bug of iput() removal from flusher thread?

2012-11-20 Thread Dave Chinner
. There's no point putting it on the LRU if we are writing from iput_final() Otherwise looks OK. Cheers, Dave.

Re: [RFC PATCH] mm: trace filemap add and del

2012-11-20 Thread Dave Chinner
this convention, so we should keep propagating that pattern in the name of consistency, rather than having different trace formats for different parts of the VFS/FS layers... Cheers, Dave.

Re: The bug of iput() removal from flusher thread?

2012-11-21 Thread Dave Chinner
be no new references to the inodes occurring, and hence we don't need to hold the lock to serialise against new references being taken Cheers, Dave.

Re: [PATCH v4 27/31] gfs2: Convert aio_read/write ops to read/write_iter

2012-11-22 Thread Dave Chinner
looking at one of the XFS patches and was holding off commenting until I'd read the entire patch set. But, you've pointed out the same thing, so I figured I'd add my 2c here as well. Cheers, Dave.

Re: [PATCH v4 16/31] loop: use aio to perform io on the underlying file

2012-11-22 Thread Dave Chinner
); +out: And this extra fsync is now not done in the aio path. I.e. the AIO completion path needs to issue the fsync to maintain correct REQ_FUA semantics... Cheers, Dave.

Re: [PATCH 4/4] ext3: Warn if mounting rw on a disk requiring stable page writes

2012-11-22 Thread Dave Chinner
be used. i.e. PG_owner_priv_1. I don't think using a global page flag for just this purpose will fly, though... Cheers, Dave.

Re: [RFC PATCH] mm: trace filemap add and del

2012-11-22 Thread Dave Chinner
On Thu, Nov 22, 2012 at 12:51:00PM +0100, Robert Jarzmik wrote: Dave Chinner da...@fromorbit.com writes: We actually have an informal convention for formatting filesystem trace events, and that is to use the device number +), + +TP_printk(page=%p pfn=%lu blk

[PATCH, 3.7-rc7, RESEND] fs: revert commit bbdd6808 to fallocate UAPI

2012-11-25 Thread Dave Chinner
fs: revert commit bbdd6808 to fallocate UAPI From: Dave Chinner dchin...@redhat.com Commit bbdd6808 (fs: reserve fallocate flag codepoint) changes the fallocate(2) syscall interface. The flag that is reserved by this commit is for functionality that has previously been NAKed on the -fsdevel

Re: [PATCH, 3.7-rc7, RESEND] fs: revert commit bbdd6808 to fallocate UAPI

2012-11-26 Thread Dave Chinner
On Sun, Nov 25, 2012 at 09:55:20PM -0500, Theodore Ts'o wrote: On Mon, Nov 26, 2012 at 11:28:14AM +1100, Dave Chinner wrote: fs: revert commit bbdd6808 to fallocate UAPI From: Dave Chinner dchin...@redhat.com Commit bbdd6808 (fs: reserve fallocate flag codepoint) changes

Re: [PATCH, 3.7-rc7, RESEND] fs: revert commit bbdd6808 to fallocate UAPI

2012-11-26 Thread Dave Chinner
does. Cheers, Dave.

[PATCH 06/19] list: add a new LRU list type

2012-11-27 Thread Dave Chinner
From: Dave Chinner dchin...@redhat.com Several subsystems use the same construct for LRU lists - a list head, a spin lock and an item count. They also use exactly the same code for adding and removing items from the LRU. Create a generic type for these LRU lists. This is the beginning

[PATCH 14/19] xfs: use generic AG walk for background inode reclaim

2012-11-27 Thread Dave Chinner
From: Dave Chinner dchin...@redhat.com The per-ag inode cache radix trees are not walked by the shrinkers any more, so there is no need for a special walker that contained heuristics to prevent multiple shrinker instances from colliding with each other. Hence we can just remove that and convert

[PATCH 18/19] shrinker: convert remaining shrinkers to count/scan API

2012-11-27 Thread Dave Chinner
From: Dave Chinner dchin...@redhat.com Convert the remaining couple of random shrinkers in the tree to the new API. Signed-off-by: Dave Chinner dchin...@redhat.com --- arch/x86/kvm/mmu.c | 35 +-- net/sunrpc/auth.c | 45

[PATCH 19/19] shrinker: Kill old ->shrink API.

2012-11-27 Thread Dave Chinner
From: Dave Chinner dchin...@redhat.com There are no more users of this API, so kill it dead, dead, dead and quietly bury the corpse in a shallow, unmarked grave in a dark forest deep in the hills... Signed-off-by: Dave Chinner dchin...@redhat.com --- include/linux/shrinker.h | 15

[PATCH 10/19] shrinker: add node awareness

2012-11-27 Thread Dave Chinner
From: Dave Chinner dchin...@redhat.com Pass the node of the current zone being reclaimed to shrink_slab(), allowing the shrinker control nodemask to be set appropriately for node aware shrinkers. Signed-off-by: Dave Chinner dchin...@redhat.com --- fs/drop_caches.c |1 + include

[PATCH 13/19] xfs: Node aware direct inode reclaim

2012-11-27 Thread Dave Chinner
From: Dave Chinner dchin...@redhat.com XFS currently only tracks inodes for reclaim via tag bits in the inode cache radix tree. While this is awesome for background reclaim because it allows inodes to be reclaimed in ascending disk offset order, it sucks for direct memory reclaim which really

[PATCH 17/19] drivers: convert shrinkers to new count/scan API

2012-11-27 Thread Dave Chinner
From: Dave Chinner dchin...@redhat.com Convert the driver shrinkers to the new API. Most changes are compile tested only because I either don't have the hardware or it's staging stuff. FWIW, the md and android code is pretty good, but the rest of it makes me want to claw my eyes out. The amount

[PATCH 01/19] dcache: convert dentry_stat.nr_unused to per-cpu counters

2012-11-27 Thread Dave Chinner
From: Dave Chinner dchin...@redhat.com Before we split up the dcache_lru_lock, the unused dentry counter needs to be made independent of the global dcache_lru_lock. Convert it to per-cpu counters to do this. Signed-off-by: Dave Chinner dchin...@redhat.com Reviewed-by: Christoph Hellwig h

[PATCH 16/19] fs: convert fs shrinkers to new scan/count API

2012-11-27 Thread Dave Chinner
From: Dave Chinner dchin...@redhat.com Convert the filesystem shrinkers to use the new API, and standardise some of the behaviours of the shrinkers at the same time. For example, nr_to_scan means the number of objects to scan, not the number of objects to free. I refactored the CIFS idmap

[PATCH 02/19] dentry: move to per-sb LRU locks

2012-11-27 Thread Dave Chinner
From: Dave Chinner dchin...@redhat.com With the dentry LRUs being per-sb structures, there is no real need for a global dentry_lru_lock. The locking can be made more fine-grained by moving to a per-sb LRU lock, isolating the LRU operations of different filesystems completely from each other

[PATCH 12/19] xfs: convert buftarg LRU to generic code

2012-11-27 Thread Dave Chinner
From: Dave Chinner dchin...@redhat.com Convert the buftarg LRU to use the new generic LRU list and take advantage of the functionality it supplies to make the buffer cache shrinker node aware. Signed-off-by: Dave Chinner dchin...@redhat.com --- fs/xfs/xfs_buf.c | 162

[PATCH 08/19] dcache: convert to use new lru list infrastructure

2012-11-27 Thread Dave Chinner
From: Dave Chinner dchin...@redhat.com Signed-off-by: Dave Chinner dchin...@redhat.com --- fs/dcache.c| 171 +--- fs/super.c | 10 +-- include/linux/fs.h | 15 +++-- 3 files changed, 82 insertions(+), 114 deletions(-) diff

[PATCH 09/19] list_lru: per-node list infrastructure

2012-11-27 Thread Dave Chinner
From: Dave Chinner dchin...@redhat.com Now that we have an LRU list API, we can start to enhance the implementation. This splits the single LRU list into per-node lists and locks to enhance scalability. Items are placed on lists according to the node the memory belongs to. To make scanning
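
The split described here, one list per node with items filed under the node their memory belongs to, might look roughly like the user-space sketch below. It is an assumption-laden illustration rather than the patch itself: the toy_ names and MAX_NODES are hypothetical, the caller passes the node id explicitly instead of deriving it from the object's memory, and the per-node lock is only indicated by a comment.

```c
#include <assert.h>

#define MAX_NODES 4        /* hypothetical fixed node count for the sketch */

struct toy_item {
    struct toy_item *prev, *next;
    int nid;               /* node this item was filed under */
};

/* One list and one count per node; a per-node lock would live here too. */
struct toy_lru_node {
    struct toy_item head;
    long nr_items;
};

struct toy_lru {
    struct toy_lru_node node[MAX_NODES];
};

static void toy_lru_init(struct toy_lru *lru)
{
    for (int i = 0; i < MAX_NODES; i++) {
        struct toy_lru_node *n = &lru->node[i];

        n->head.prev = n->head.next = &n->head;
        n->head.nid = i;
        n->nr_items = 0;
    }
}

/* File the item at the tail of the list for node 'nid'. */
static void toy_lru_add(struct toy_lru *lru, struct toy_item *item, int nid)
{
    struct toy_lru_node *n;

    assert(nid >= 0 && nid < MAX_NODES);
    n = &lru->node[nid];

    item->nid = nid;
    item->prev = n->head.prev;
    item->next = &n->head;
    n->head.prev->next = item;
    n->head.prev = item;
    n->nr_items++;
}

/* Items on a single node's list. */
static long toy_lru_count_node(const struct toy_lru *lru, int nid)
{
    assert(nid >= 0 && nid < MAX_NODES);
    return lru->node[nid].nr_items;
}

/* Items across all nodes. */
static long toy_lru_count(const struct toy_lru *lru)
{
    long total = 0;

    for (int i = 0; i < MAX_NODES; i++)
        total += lru->node[i].nr_items;
    return total;
}
```

Because each node has its own list (and, in a real implementation, its own lock), adds on different nodes never touch the same list head, which is where the scalability gain the snippet mentions would come from.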

[PATCH 07/19] inode: convert inode lru list to generic lru list code.

2012-11-27 Thread Dave Chinner
From: Dave Chinner dchin...@redhat.com Signed-off-by: Dave Chinner dchin...@redhat.com --- fs/inode.c | 173 +--- fs/super.c | 11 ++-- include/linux/fs.h |6 +- 3 files changed, 75 insertions(+), 115 deletions(-) diff --git

[PATCH 03/19] dcache: remove dentries from LRU before putting on dispose list

2012-11-27 Thread Dave Chinner
From: Dave Chinner dchin...@redhat.com One of the big problems with modifying the way the dcache shrinker and LRU implementation works is that the LRU is abused in several ways. One of these is shrink_dentry_list(). Basically, we can move a dentry off the LRU onto a different list without doing

[RFC, PATCH 00/19] Numa aware LRU lists and shrinkers

2012-11-27 Thread Dave Chinner
Hi Glauber, Here's a working version of my patchset for generic LRU lists and NUMA-aware shrinkers. There are several parts to this patch set. The NUMA aware shrinkers are based on having a generic node-based LRU list implementation, and there are subsystems that need to be converted to use

[PATCH 15/19] xfs: convert dquot cache lru to list_lru

2012-11-27 Thread Dave Chinner
From: Dave Chinner dchin...@redhat.com Convert the XFS dquot lru to use the list_lru construct and convert the shrinker to being node aware. Signed-off-by: Dave Chinner dchin...@redhat.com --- fs/xfs/xfs_dquot.c |7 +- fs/xfs/xfs_qm.c| 307

[PATCH 05/19] shrinker: convert superblock shrinkers to new API

2012-11-27 Thread Dave Chinner
From: Dave Chinner dchin...@redhat.com Convert superblock shrinker to use the new count/scan API, and propagate the API changes through to the filesystem callouts. The filesystem callouts already use a count/scan API, so it's just changing counters to longs to match the VM API. This requires

[PATCH 11/19] fs: convert inode and dentry shrinking to be node aware

2012-11-27 Thread Dave Chinner
From: Dave Chinner dchin...@redhat.com Now that the shrinker is passing a nodemask in the scan control structure, we can pass this to the generic LRU list code to isolate reclaim to the lists on matching nodes. This requires a small amount of refactoring of the LRU list API, which might

[PATCH 04/19] mm: new shrinker API

2012-11-27 Thread Dave Chinner
From: Dave Chinner dchin...@redhat.com The current shrinker callout API uses a single shrinker call for multiple functions. To determine the function, a special magical value is passed in a parameter to change the behaviour. This complicates the implementation and return value specification
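
As a rough illustration of the count/scan split that the later patch titles in this series refer to, the user-space sketch below replaces a single overloaded callout with two callbacks: one that only counts and one that only scans. All names here (toy_cache, toy_shrink_control, toy_shrinker) are hypothetical, and the sketch assumes, as the 16/19 snippet states, that nr_to_scan is the number of objects to scan, not the number to free.

```c
#include <assert.h>

#define CACHE_SIZE 8

/* Toy cache: a slot holds an object if present[i] is set. */
struct toy_cache {
    int present[CACHE_SIZE];
    int busy[CACHE_SIZE];      /* busy objects are scanned but not freed */
};

struct toy_shrink_control {
    long nr_to_scan;           /* objects to look at, not objects to free */
};

/* Two single-purpose callbacks instead of one call switched by a magic value. */
struct toy_shrinker {
    long (*count_objects)(struct toy_cache *, struct toy_shrink_control *);
    long (*scan_objects)(struct toy_cache *, struct toy_shrink_control *);
};

/* Report how many objects the cache currently holds; frees nothing. */
static long toy_count(struct toy_cache *c, struct toy_shrink_control *sc)
{
    long n = 0;

    (void)sc;
    for (int i = 0; i < CACHE_SIZE; i++)
        if (c->present[i])
            n++;
    return n;
}

/* Scan up to nr_to_scan objects and return how many were actually freed. */
static long toy_scan(struct toy_cache *c, struct toy_shrink_control *sc)
{
    long scanned = 0, freed = 0;

    for (int i = 0; i < CACHE_SIZE && scanned < sc->nr_to_scan; i++) {
        if (!c->present[i])
            continue;
        scanned++;
        if (c->busy[i])
            continue;          /* counted as scanned, but not reclaimable */
        c->present[i] = 0;
        freed++;
    }
    return freed;
}
```

With four objects present, one of them busy, and nr_to_scan set to 3, toy_scan looks at three objects and frees two, so the return value of each callback has exactly one meaning.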

Re: [PATCH 17/19] drivers: convert shrinkers to new count/scan API

2012-11-27 Thread Dave Chinner
On Wed, Nov 28, 2012 at 01:13:11AM +, Chris Wilson wrote: On Wed, 28 Nov 2012 10:14:44 +1100, Dave Chinner da...@fromorbit.com wrote: +/* + * XXX: (dchinner) This is one of the worst cases of shrinker abuse I've seen. + * + * i915_gem_purge() expects a byte count to be passed
