Re: [PATCH v3 19/19] fs: handle inode->i_version more efficiently

2017-12-20 Thread Jan Kara
version would be:

Update:
  modify inode
  inode_maybe_inc_iversion(inode)

Read:
  my_version = inode_query_iversion(inode)
  get inode data

And you need to make sure 'get inode data' does not get speculatively evaluated before you actually sample i_version, so that you are guaranteed that if data changes, you will observe a larger i_version in the future.

Also please add a comment to smp_mb() in inode_maybe_inc_iversion() like:

  /* This barrier pairs with the barrier in inode_query_iversion() */

and a similar comment to inode_query_iversion(), because memory barriers make sense only in pairs (see SMP BARRIER PAIRING in Documentation/memory-barriers.txt).

Honza -- Jan Kara SUSE Labs, CR
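The update/read ordering described in this message can be sketched in userspace with C11 atomics. This is only an illustration under stated assumptions: struct fake_inode, fake_update() and fake_read() are hypothetical names, and the sequentially consistent fences merely stand in for the kernel's smp_mb(); this is not the code from the patch.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for an inode; not the kernel structure. */
struct fake_inode {
	_Atomic uint64_t i_version;
	_Atomic uint64_t data;	/* stands in for "inode data" */
};

/* Update side: modify the inode first, then bump the version. */
void fake_update(struct fake_inode *inode, uint64_t new_data)
{
	atomic_store_explicit(&inode->data, new_data, memory_order_relaxed);
	/* This fence pairs with the fence in fake_read(). */
	atomic_thread_fence(memory_order_seq_cst);
	atomic_fetch_add_explicit(&inode->i_version, 1, memory_order_relaxed);
}

/* Read side: sample the version first, then read the data. */
uint64_t fake_read(struct fake_inode *inode, uint64_t *version)
{
	*version = atomic_load_explicit(&inode->i_version,
					memory_order_relaxed);
	/* This fence pairs with the fence in fake_update(). */
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&inode->data, memory_order_relaxed);
}
```

With both fences in place, a reader that observes the bumped version also observes the modified data, so a reader holding stale data is guaranteed to see a larger version on a later query.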

Re: [PATCH 14/15] dax: associate mappings with inodes, and warn if dma collides with truncate

2017-12-20 Thread Jan Kara
+0x2b7/0x3b0 > ? iomap_dio_zero+0x110/0x110 > iomap_apply+0xa4/0x110 > iomap_dio_rw+0x29e/0x3b0 > ? iomap_dio_zero+0x110/0x110 > ? xfs_file_dio_aio_read+0x7c/0x1a0 [xfs] > xfs_file_dio_aio_read+0x7c/0x1a0 [xfs] > xfs_file_read_iter+0xa0/0xc0 [xfs] > __vfs_read+0xf9/0x170 > vfs_read+0xa6/0x150 > SyS_pread64+0x93/0xb0 > entry_SYSCALL_64_fastpath+0x1f/0x96 Honza -- Jan Kara SUSE Labs, CR

Re: [PATCH v2] writeback: synchronize sync(2) against cgroup writeback membership switches

2017-12-19 Thread Jan Kara
On Wed 13-12-17 07:39:30, Tejun Heo wrote: > Hello, > > On Wed, Dec 13, 2017 at 12:00:04PM +0100, Jan Kara wrote: > > OK, but this effectively prevents writeback from sync_inodes_sb() to ever > > make inode switch wbs. Cannot that be abused in some way like making sure >

Re: [PATCH] NFS: allow name_to_handle_at() to work for Amazon EFS.

2017-12-19 Thread Jan Kara
; + size = MAX_HANDLE_SZ >> 2; > >> > >> - ret = exportfs_encode_inode_fh(inode, (struct fid > >> *)f.handle.f_handle, &size, 0); > >> + ret = exportfs_encode_inode_fh(inode, fhbuf, &size, 0); > >> if ((ret == FILEID_I

Re: [PATCH v3 19/19] fs: handle inode->i_version more efficiently

2017-12-19 Thread Jan Kara
On Mon 18-12-17 12:22:20, Jeff Layton wrote: > On Mon, 2017-12-18 at 17:34 +0100, Jan Kara wrote: > > On Mon 18-12-17 10:11:56, Jeff Layton wrote: > > > static inline bool > > > inode_maybe_inc_iversion(struct inode *inode, bool force) > > > { > >

Re: [PATCH v3 19/19] fs: handle inode->i_version more efficiently

2017-12-18 Thread Jan Kara
_VERSION_QUERIED; > + old = atomic64_cmpxchg(&inode->i_version, cur, new); > + if (old == cur) > + break; > + cur = old; > + } Why not just use atomic64_or() here? Honza -- Jan Kara SUSE Labs, CR
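To illustrate the question above, here is a userspace sketch using C11 atomics rather than the kernel's atomic64_* API. It compares a compare-and-exchange loop with a single atomic OR for setting a flag bit. QUERIED_FLAG and both function names are hypothetical, and the flag value is an assumption made for the example.

```c
#include <stdatomic.h>
#include <stdint.h>

#define QUERIED_FLAG UINT64_C(1)	/* assumed flag bit, for illustration */

/* Set the flag with a compare-and-exchange loop; returns the old value. */
uint64_t set_flag_cmpxchg(_Atomic uint64_t *v)
{
	uint64_t cur = atomic_load(v);

	/* On failure, cur is refreshed with the value currently stored. */
	while (!atomic_compare_exchange_weak(v, &cur, cur | QUERIED_FLAG))
		;
	return cur;
}

/* Set the flag with one atomic OR; also returns the old value. */
uint64_t set_flag_or(_Atomic uint64_t *v)
{
	return atomic_fetch_or(v, QUERIED_FLAG);
}
```

Both variants leave the same value behind. The loop form only pays off when the new value depends on the old one in a way a plain OR cannot express.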

Re: [PATCH v3 16/19] fs: only set S_VERSION when updating times if necessary

2017-12-18 Thread Jan Kara
> + if (dirty) > iflags |= I_DIRTY_SYNC; > __mark_inode_dirty(inode, iflags); > return 0; > @@ -1863,7 +1871,7 @@ int file_update_time(struct file *file) > if (!timespec_equal(&inode->i_ctime, &now)) > sync_it |= S_CTIME; > > - if (IS_I_VERSION(inode)) > + if (IS_I_VERSION(inode) && inode_iversion_need_inc(inode)) > sync_it |= S_VERSION; > > if (!sync_it) > -- > 2.14.3 > -- Jan Kara SUSE Labs, CR

Re: [PATCH 12/19] ocfs2: convert to new i_version API

2017-12-18 Thread Jan Kara
On Wed 13-12-17 09:20:10, Jeff Layton wrote: > From: Jeff Layton > > Signed-off-by: Jeff Layton Looks good to me. You can add: Reviewed-by: Jan Kara Honza > --- > fs/ocfs2/dir.c | 14 +++

Re: [PATCH 08/19] ext2: convert to new i_version API

2017-12-18 Thread Jan Kara
On Wed 13-12-17 09:20:06, Jeff Layton wrote: > From: Jeff Layton > > Signed-off-by: Jeff Layton Looks good. You can add: Reviewed-by: Jan Kara Honza > --- > fs/ext2/dir.c | 8 > fs/ext2/super.c |

Re: [PATCH] mm: save/restore current->journal_info in handle_mm_fault

2017-12-15 Thread Jan Kara
On Fri 15-12-17 09:17:42, Yan, Zheng wrote: > On Fri, Dec 15, 2017 at 12:53 AM, Jan Kara wrote: > >> > > >> > In this particular case I'm not sure why does ceph pass 'filp' into > >> > readpage() / readpages() handler when it already

Re: [PATCH] mm: save/restore current->journal_info in handle_mm_fault

2017-12-14 Thread Jan Kara
On Thu 14-12-17 22:30:26, Yan, Zheng wrote: > On Thu, Dec 14, 2017 at 9:43 PM, Jan Kara wrote: > > On Thu 14-12-17 18:55:27, Yan, Zheng wrote: > >> We recently got an Oops report: > >> > >> BUG: unable to handle kernel NULL pointer dereference at (null) >

Re: [PATCH] mm: save/restore current->journal_info in handle_mm_fault

2017-12-14 Thread Jan Kara
to read/set > + * current->journal_info. > + */ > + old_journal_info = current->journal_info; > + current->journal_info = NULL; > + > if (unlikely(is_vm_hugetlb_page(vma))) > ret = hugetlb_fault(vma->vm_mm, vma, address, flags); > else > ret = __handle_mm_fault(vma, address, flags); > > + current->journal_info = old_journal_info; > + > if (flags & FAULT_FLAG_USER) { > mem_cgroup_oom_disable(); > /* > -- > 2.13.6 > -- Jan Kara SUSE Labs, CR
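The save/restore pattern in the quoted hunk can be sketched in userspace as follows. The thread-local journal_info variable is a hypothetical stand-in for current->journal_info, and handle_fault() and sample_handler() are made-up names; this is not the kernel's handle_mm_fault().

```c
#include <stddef.h>

/* Hypothetical stand-in for current->journal_info. */
static _Thread_local void *journal_info;

/* Records what the nested handler observed, for inspection afterwards. */
static void *seen_in_handler;

/* A nested handler that both reads and overwrites journal_info. */
int sample_handler(void)
{
	static int nested_handle;

	seen_in_handler = journal_info;
	journal_info = &nested_handle;
	return 0;
}

/* Run a handler with journal_info cleared, then restore the old value. */
int handle_fault(int (*handler)(void))
{
	void *old_journal_info = journal_info;
	int ret;

	journal_info = NULL;
	ret = handler();
	journal_info = old_journal_info;
	return ret;
}
```

The nested handler starts from a clean NULL value, and whatever it stores is discarded once the outer context resumes.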

Re: [PATCH v2] writeback: synchronize sync(2) against cgroup writeback membership switches

2017-12-13 Thread Jan Kara
CONFIG_CGROUP_WRITEBACK > struct radix_tree_root cgwb_tree; /* radix tree of active cgroup wbs */ > struct rb_root cgwb_congested_tree; /* their congested states */ > + struct rw_semaphore wb_switch_rwsem; /* no cgwb switch while syncing */ > #else > struct bdi_writeback_congested *wb_congested; > #endif > --- a/mm/backing-dev.c > +++ b/mm/backing-dev.c > @@ -706,6 +706,7 @@ static int cgwb_bdi_init(struct backing_ > > INIT_RADIX_TREE(&bdi->cgwb_tree, GFP_ATOMIC); > bdi->cgwb_congested_tree = RB_ROOT; > + init_rwsem(&bdi->wb_switch_rwsem); > > ret = wb_init(&bdi->wb, bdi, 1, GFP_KERNEL); > if (!ret) { -- Jan Kara SUSE Labs, CR

Re: possible deadlock in generic_file_write_iter (2)

2017-12-05 Thread Jan Kara
Hello Byungchul, On Tue 05-12-17 13:58:09, Byungchul Park wrote: > On 12/4/2017 5:33 PM, Jan Kara wrote: > >adding Peter and Byungchul to CC since the lockdep report just looks > >strange and cross-release seems to be involved. Guys, how did #5 get into > >the loc

Re: regression: 4.13 cannot follow symlinks on some ext3 fs

2017-12-04 Thread Jan Kara
m-r5
(none):~# stat /usr/share/terminfo/x/xterm-r5
File: `/usr/share/terminfo/x/xterm-r5' -> `/lib/terminfo/x/xterm-r5'
Size: 24 Blocks: 8 IO Block: 4096 symbolic link
Device: 6200h/25088d Inode: 98027 Links: 1
Access: (0777/lrwxrwxrwx) Uid: (0/root) Gid: (0/root)
Access: 2017-12-04 16:27:29.0 +
Modify: 2006-05-19 21:12:53.0 +
Change: 2006-05-19 21:12:53.0 +

Honza -- Jan Kara SUSE Labs, CR

Re: [PATCH 03/11] fs: add frozen sb state helpers

2017-12-01 Thread Jan Kara
On Thu 30-11-17 20:05:48, Luis R. Rodriguez wrote: > On Thu, Nov 30, 2017 at 06:13:10PM +0100, Jan Kara wrote: > > ... I dislike the _by_user() suffix as there may be different places that > > call freeze_super() (e.g. device mapper does this during some operations). > >

Re: [PATCHSET v2] cgroup, writeback, btrfs: make sure btrfs issues metadata IOs from the root cgroup

2017-12-01 Thread Jan Kara
On Wed 29-11-17 13:38:26, Chris Mason wrote: > On 11/29/2017 12:05 PM, Tejun Heo wrote: > >On Wed, Nov 29, 2017 at 09:03:30AM -0800, Tejun Heo wrote: > >>Hello, > >> > >>On Wed, Nov 29, 2017 at 05:56:08PM +0100, Jan Kara wrote: > >>>What has happene

Re: [PATCH 03/11] fs: add frozen sb state helpers

2017-11-30 Thread Jan Kara
le cleanup if freezing of all superblocks fails in the middle. So I'm not 100% sure this works out nicely in the end. But it's certainly worth consideration. Honza -- Jan Kara SUSE Labs, CR

Re: [PATCH 02/11] fs: provide unlocked helper thaw_super()

2017-11-30 Thread Jan Kara
active count management. > > This change has no functional changes. > > Suggested-by: Dave Chinner > Signed-off-by: Luis R. Rodriguez Looks good to me. You can add: Reviewed-by: Jan Kara Honz

Re: [PATCH 01/11] fs: provide unlocked helper for freeze_super()

2017-11-30 Thread Jan Kara
g and active count management. > > This change has no functional changes. > > Suggested-by: Dave Chinner > Signed-off-by: Luis R. Rodriguez Looks good to me. You can add: Reviewed-by: Jan Kara Honza

Re: [PATCH 05/11] fs: add iterate_supers_excl() and iterate_supers_reverse_excl()

2017-11-30 Thread Jan Kara
his but also captures any errors encountered. > > Signed-off-by: Luis R. Rodriguez The patch looks good to me. You can add: Reviewed-by: Jan Kara Honza > --- > fs/super.c | 91 >

Re: [PATCH 07/11] xfs: remove not needed freezing calls

2017-11-30 Thread Jan Kara
k); > > if (tout) > - freezable_schedule_timeout(msecs_to_jiffies(tout)); > + schedule_timeout(msecs_to_jiffies(tout)); > > __set_current_state(TASK_RUNNING); > > - try_to_freeze(); > - > tout = xfsaild_push(ailp); > } > > -- > 2.15.0 > -- Jan Kara SUSE Labs, CR

Re: [PATCHSET v2] cgroup, writeback, btrfs: make sure btrfs issues metadata IOs from the root cgroup

2017-11-29 Thread Jan Kara
> fs/ext2/inode.c |3 ++- > fs/ext2/super.c |1 - > fs/ext4/inode.c |5 - > fs/ext4/super.c |2 -- > include/linux/backing-dev.h |2 +- > include/linux/buffer_head.h |3 ++

Re: [PATCH] quota: Check for register_shrinker() failure.

2017-11-29 Thread Jan Kara
> Signed-off-by: Tetsuo Handa > > Fixes: 1d3d4437eae1 ("vmscan: per-node deferred work") > > > Cc: Jan Kara > > Cc: Michal Hocko > > From my very limited understanding of the code this looks

Re: [PATCH v3] quota: propagate error from __dquot_initialize

2017-11-28 Thread Jan Kara
successfully, we can make sure all inodes disk usage > can be accounted, which will be more reasonable. > > Suggested-by: Jan Kara > Signed-off-by: Chao Yu Thanks. Added to my tree. Honza > --- >

Re: [PATCH v2 3/4] [media] v4l2: disable filesystem-dax mapping support

2017-11-27 Thread Jan Kara
l can > coordinate revoking DMA access when the filesystem needs to truncate > mappings. > > Reported-by: Jan Kara > Cc: Mauro Carvalho Chehab > Cc: linux-me...@vger.kernel.org > Cc: > Fixes: 3565fce3a659 ("mm, x86: get_user_pages() for dax mappings")

Re: [PATCH v2 2/4] mm: fail get_vaddr_frames() for filesystem-dax mappings

2017-11-27 Thread Jan Kara
Cc: Inki Dae > Cc: Seung-Woo Kim > Cc: Joonyoung Shim > Cc: Kyungmin Park > Cc: Mauro Carvalho Chehab > Cc: linux-me...@vger.kernel.org > Cc: Jan Kara > Cc: Mel Gorman > Cc: Vlastimil Babka > Cc: Andrew Morton > Cc: > Fixes: 3565fce3a659 ("mm, x86:

Re: [PATCH v2] quota: propagate error from __dquot_initialize

2017-11-27 Thread Jan Kara
to just pass 'flags' here. Other than that the patch looks good. Honza -- Jan Kara SUSE Labs, CR

Re: [PATCH 1/3] lockdep: Apply crossrelease to PG_locked locks

2017-11-24 Thread Jan Kara
memory 4 cpus": > > > >make clean > >echo 3 > drop_caches > >time make -j4 > > Maybe FS people will help you find a more representative workload. E.g. > linear cache cold file read should be good as well. Maybe there are some > tests in fstests

Re: [PATCH] fs: handle shrinker registration failure in sget_userns

2017-11-23 Thread Jan Kara
y to trigger in the production because small > allocations do not fail usually. > > Debugged-by: Tetsuo Handa > Signed-off-by: Michal Hocko Looks good to me now. You can add: Reviewed-by: Jan Kara Honza > --- > fs/

Re: [PATCH] fs: handle shrinker registration failure in sget_userns

2017-11-23 Thread Jan Kara
if (err) { > spin_unlock(&sb_lock); > + unregister_shrinker(&s->s_shrink); > destroy_unused_super(s); > return ERR_PTR(err); > } > @@ -518,7 +525,6 @@ struct super_block *sget_userns(struct file_system_type > *type, > hlist_add_head(&s->s_instances, &type->fs_supers); > spin_unlock(&sb_lock); > get_filesystem(type); > - register_shrinker(&s->s_shrink); > return s; > } > > -- > 2.15.0 > -- Jan Kara SUSE Labs, CR

Re: [PATCH] quota: propagate error from __dquot_initialize

2017-11-21 Thread Jan Kara
>files[type] = NULL; > iput(inode); This bail out path is not correct. You have to go through full quota off at this point (dquot_disable() function) as some inodes already had quotas initialized and can be using them... Honza -- Jan Kara SUSE Labs, CR

Re: KASAN: use-after-free in move_expired_inodes

2017-11-15 Thread Jan Kara
> the crash. > > > > Programs can be found here: https://pastebin.com/RYGtNn3z > > > > Stack trace here: https://pastebin.com/SaJXWMg3 > > > > We don't have a C reproducer but we will send one if we have it. > > > > Regards, > > Shankara > -- Jan Kara SUSE Labs, CR

Re: [PATCH] reiserfs: remove unneeded i_version bump

2017-11-15 Thread Jan Kara
;i_version++; > inode->i_mtime = inode->i_ctime = current_time(inode); > mark_inode_dirty(inode); > return len - towrite; > -- > 2.13.6 > -- Jan Kara SUSE Labs, CR

Re: [PATCH v2] fs: fsnotify: account fsnotify metadata to kmemcg

2017-11-15 Thread Jan Kara
On Wed 15-11-17 01:32:16, Yang Shi wrote: > > > On 11/14/17 1:39 AM, Michal Hocko wrote: > >On Tue 14-11-17 03:10:22, Yang Shi wrote: > >> > >> > >>On 11/9/17 5:54 AM, Michal Hocko wrote: > >>>[Sorry for the late reply] > >>> >

Re: [PATCH] quota: be aware of error from dquot_initialize

2017-11-14 Thread Jan Kara
On Tue 14-11-17 11:43:49, Chao Yu wrote: > On 2017/11/13 17:18, Jan Kara wrote: > > On Mon 13-11-17 11:31:48, Chao Yu wrote: > >> Commit 6184fc0b8dd7 ("quota: Propagate error from ->acquire_dquot()") > >> missed to handle error from dquot_initialize in dquot

Re: linux-next: error fetching the vfs-jk tree

2017-11-14 Thread Jan Kara
hat tree? Sorry, I forgot you were still fetching it. No, there's no need to fetch that branch anymore (it was a branch for one-time work and I've deleted it now since it was untouched for a year or so). Thanks! Honza -- Jan Kara SUSE Labs, CR

Re: [PATCH] quota: be aware of error from dquot_initialize

2017-11-13 Thread Jan Kara
+ error = dquot_initialize(inode); > return error; > } > EXPORT_SYMBOL(dquot_file_open); > -- > 2.15.0.55.gc2ece9dc4de6 > > -- Jan Kara SUSE Labs, CR

Re: [inotify_read] BUG: unable to handle kernel paging request at ffff8800172f8000

2017-11-07 Thread Jan Kara
ception > > Attached the full dmesg and kconfig. Ok, I assume this is still valid even though previous KASAN report need not be? I'm not sure if this could be inotify related though... Possibly if double-free could trigger this in SLOB but then we should see issues also with SLAB or SLUB. Honza -- Jan Kara SUSE Labs, CR

Re: [PATCH v4] lib/dlock-list: Scale dlock_lists_empty()

2017-11-07 Thread Jan Kara
s good to me. You can add: Reviewed-by: Jan Kara Honza Kara > --- > Changes from v3: > - s/waiters/used_lists, more doc around the counter. > - fixed racy scenario when the list empty/non-empty > condition ch

Re: possible deadlock in generic_file_write_iter

2017-11-06 Thread Jan Kara
ou'd need to have a completely separate set of locking classes for each filesystem to avoid false positives like these. And that would increase number of classes lockdep has to handle significantly. So I'm not sure it's really worth it... Honza -- Jan Kara SUSE Labs, CR

Re: [PATCH v2] writeback: remove the unused function parameter

2017-11-06 Thread Jan Kara
On Fri 03-11-17 01:04:45, Wang Long wrote: > The parameter `struct bdi_writeback *wb` is not been used in the function > body. so we just remove it. > > Signed-off-by: Wang Long Looks good. You can add: Reviewed-

Re: [PATCH v2] printk: Add console owner and waiter logic to load balance console writes

2017-11-03 Thread Jan Kara
sole_lock owner. > + */ > + mutex_release(&console_lock_dep_map, 1, _THIS_IP_); > + printk_safe_exit_irqrestore(flags); > + /* Note, if waiter is set, logbuf_lock is not held */ > + return; > + } > + > console_locked = 0; > > /* Release the exclusive_console once it is used */ -- Jan Kara SUSE Labs, CR

Re: [PATCH v2] fs: fsnotify: account fsnotify metadata to kmemcg

2017-11-01 Thread Jan Kara
On Wed 01-11-17 00:44:18, Yang Shi wrote: > On 10/31/17 3:12 AM, Jan Kara wrote: > >On Tue 31-10-17 00:39:58, Yang Shi wrote: > >>On 10/30/17 5:43 AM, Jan Kara wrote: > >>>On Sat 28-10-17 02:22:18, Yang Shi wrote: > >>>>If some process generates e

Re: [PATCH v8 5/6] lib/dlock-list: Enable faster lookup with hashing

2017-11-01 Thread Jan Kara
struct dlock_list_node *node, > { > struct dlock_list_head *head = &dlist->heads[this_cpu_read(cpu2idx)]; > > - /* > - * There is no need to disable preemption > - */ > - spin_lock(&head->lock); > - node->head = head; > - list_add(&node->list, &head->list); > - spin_unlock(&head->lock); > + dlock_list_add(node, head); > } > EXPORT_SYMBOL(dlock_lists_add); > > -- > 1.8.3.1 > > -- Jan Kara SUSE Labs, CR

Re: [PATCH v8 4/6] lib/dlock-list: Make sibling CPUs share the same linked list

2017-11-01 Thread Jan Kara
t on performance. It > also improves dlock list iteration performance as fewer lists need > to be iterated. > > Signed-off-by: Waiman Long The patch looks good to me. You can add: Reviewed-by: Jan Kara

Re: [PATCH v2 2/2] isofs: use unsigned char types consistently

2017-10-31 Thread Jan Kara
char length [ISODCL (1, 1)]; /* 711 */
> - char ext_attr_length[ISODCL (2, 2)]; /* 711 */
> - char extent [ISODCL (3, 10)]; /* 733 */
> - char size [ISODCL (11, 18)]; /* 733 */
> - char date [ISODCL (19, 25)]; /* 7 by 711 */
> - char flags [ISODCL (26, 26)];
> - char file_unit_size [ISODCL (27, 27)]; /* 711 */
> - char interleave [ISODCL (28, 28)]; /* 711 */
> - char volume_sequence_number [ISODCL (29, 32)]; /* 723 */
> - unsigned char name_len [ISODCL (33, 33)]; /* 711 */
> + __u8 length [ISODCL (1, 1)]; /* 711 */
> + __u8 ext_attr_length[ISODCL (2, 2)]; /* 711 */
> + __u8 extent [ISODCL (3, 10)]; /* 733 */
> + __u8 size [ISODCL (11, 18)]; /* 733 */
> + __u8 date [ISODCL (19, 25)]; /* 7 by 711 */
> + __u8 flags [ISODCL (26, 26)];
> + __u8 file_unit_size [ISODCL (27, 27)]; /* 711 */
> + __u8 interleave [ISODCL (28, 28)]; /* 711 */
> + __u8 volume_sequence_number [ISODCL (29, 32)]; /* 723 */
> + __u8 name_len [ISODCL (33, 33)]; /* 711 */
> char name [0];
> } __attribute__((packed));
>
> --
> 2.9.0
>
-- Jan Kara SUSE Labs, CR

Re: [PATCH v2 1/2] isofs: fix timestamps beyond 2027

2017-10-31 Thread Jan Kara
On Thu 19-10-17 17:29:12, Arnd Bergmann wrote: > On Thu, Oct 19, 2017 at 5:17 PM, Jan Kara wrote: > > On Thu 19-10-17 16:47:48, Arnd Bergmann wrote: > >> isofs uses a 'char' variable to load the number of years since > >> 1900 for an inode timestamp. On arch

Re: [RFC PATCH] fs: fsnotify: account fsnotify metadata to kmemcg

2017-10-31 Thread Jan Kara
On Tue 31-10-17 13:51:40, Amir Goldstein wrote: > On Tue, Oct 31, 2017 at 12:50 PM, Jan Kara wrote: > > On Sun 22-10-17 11:24:17, Amir Goldstein wrote: > >> But I think there is another problem, not introduced by your change, but > >> could > >> be amplified b

Re: [PATCH v2 0/7] fix fanotify issues with the series in v4.12

2017-10-31 Thread Jan Kara
On Tue 31-10-17 13:02:21, Amir Goldstein wrote: > On Tue, Oct 31, 2017 at 11:54 AM, Jan Kara wrote: > > On Mon 30-10-17 21:18:09, Miklos Szeredi wrote: > >> On Mon, Oct 30, 2017 at 6:27 PM, Jan Kara wrote: > >> > On Fri 27-10-17 13:53:20, Jan Kara wrote: > >&

Re: [PATCH 2/2] fsnotify: convert fsnotify_mark.refcnt from atomic_t to refcount_t

2017-10-31 Thread Jan Kara
that found and may be using this mark. */ > - atomic_t refcnt; > + refcount_t refcnt; > /* Group this mark is for. Set on mark creation, stable until last ref >* is dropped */ > struct fsnotify_group *group; > diff --git a/kernel/audit_tree.c b/kernel/audit_tree.c > index 011d46e..45ec960 100644 > --- a/kernel/audit_tree.c > +++ b/kernel/audit_tree.c > @@ -1007,7 +1007,7 @@ static void audit_tree_freeing_mark(struct > fsnotify_mark *entry, struct fsnotify >* We are guaranteed to have at least one reference to the mark from >* either the inode or the caller of fsnotify_destroy_mark(). >*/ > - BUG_ON(atomic_read(&entry->refcnt) < 1); > + BUG_ON(refcount_read(&entry->refcnt) < 1); > } > > static const struct fsnotify_ops audit_tree_ops = { > -- > 2.7.4 > -- Jan Kara SUSE Labs, CR

Re: [PATCH 1/2] fsnotify: convert fsnotify_group.refcnt from atomic_t to refcount_t

2017-10-31 Thread Jan Kara
inotify_init() and the refcnt will hit 0 only when that fd has been >* closed. >*/ > - atomic_t refcnt;/* things with interest in this group */ > + refcount_t refcnt; /* things with interest in this group */ > > const struct fsnotify_ops *ops; /* how this group handles things */ > > -- > 2.7.4 > -- Jan Kara SUSE Labs, CR

Re: [RFC PATCH] fs: fsnotify: account fsnotify metadata to kmemcg

2017-10-31 Thread Jan Kara
n.c > +++ b/fs/notify/notification.c > @@ -111,7 +111,8 @@ int fsnotify_add_event(struct fsnotify_group *group, > return 2; > } > > - if (group->q_len >= group->max_events) { > + if (group->q_len >= group->max_events || > + event == group->overflow_event) { > ret = 2; > /* Queue overflow event only if it isn't already queued */ > if (!list_empty(&group->overflow_event->list)) { > -- > 2.7.4 > -- Jan Kara SUSE Labs, CR

Re: [PATCH v2] fs: fsnotify: account fsnotify metadata to kmemcg

2017-10-31 Thread Jan Kara
On Tue 31-10-17 00:39:58, Yang Shi wrote: > On 10/30/17 5:43 AM, Jan Kara wrote: > >On Sat 28-10-17 02:22:18, Yang Shi wrote: > >>If some process generates events into a huge or unlimit event queue, but no > >>listener read them, they may consume significant amount of me

Re: [PATCH v2 0/7] fix fanotify issues with the series in v4.12

2017-10-31 Thread Jan Kara
On Mon 30-10-17 21:18:09, Miklos Szeredi wrote: > On Mon, Oct 30, 2017 at 6:27 PM, Jan Kara wrote: > > On Fri 27-10-17 13:53:20, Jan Kara wrote: > >> On Wed 25-10-17 16:31:39, Miklos Szeredi wrote: > >> > On Wed, Oct 25, 2017 at 10:41 AM, Miklos Szeredi > >>

Re: [PATCH v2 0/7] fix fanotify issues with the series in v4.12

2017-10-30 Thread Jan Kara
On Fri 27-10-17 13:53:20, Jan Kara wrote: > On Wed 25-10-17 16:31:39, Miklos Szeredi wrote: > > On Wed, Oct 25, 2017 at 10:41 AM, Miklos Szeredi > > wrote: > > > We discovered some problems in the latest fsnotify/fanotify codebase with > > > the help of a stre

Re: [PATCH v2 7/7] fanotify: clean up CONFIG_FANOTIFY_ACCESS_PERMISSIONS ifdefs

2017-10-30 Thread Jan Kara
ess_list); > -#endif > + if (IS_ENABLED(CONFIG_FANOTIFY_ACCESS_PERMISSIONS)) { > + init_waitqueue_head(&group->fanotify_data.access_waitq); > + INIT_LIST_HEAD(&group->fanotify_data.access_list); > + } When having space for these allocated, just initialize them properly. Otherwise it's asking for trouble. Honza -- Jan Kara SUSE Labs, CR

Re: [PATCH v2 2/7] fsnotify: pin both inode and vfsmount mark

2017-10-30 Thread Jan Kara
On Mon 30-10-17 14:42:11, Miklos Szeredi wrote: > On Mon, Oct 30, 2017 at 2:34 PM, Jan Kara wrote: > > On Wed 25-10-17 10:41:34, Miklos Szeredi wrote: > >> We may fail to pin one of the marks in fsnotify_prepare_user_wait() when > >> dropping the srcu read lock, resulti

Re: [PATCH v2 4/7] fsnotify: skip unattached marks

2017-10-30 Thread Jan Kara
= srcu_dereference(inode_node->next, > &fsnotify_mark_srcu); > +skip_vfsmount: > if (vfsmount_group) > vfsmount_node = srcu_dereference(vfsmount_node->next, >&fsnotify_mark_srcu); > -- > 2.5.5 > -- Jan Kara SUSE Labs, CR

Re: [PATCH v2 2/7] fsnotify: pin both inode and vfsmount mark

2017-10-30 Thread Jan Kara
} > > - iter_info.inode_mark = inode_mark; > - iter_info.vfsmount_mark = vfsmount_mark; > - > ret = send_to_group(to_tell, inode_mark, vfsmount_mark, mask, > data, data_is, cookie, file_name, > &iter_info); > -- > 2.5.5 > -- Jan Kara SUSE Labs, CR

Re: [PATCH v2] fs: fsnotify: account fsnotify metadata to kmemcg

2017-10-30 Thread Jan Kara
know what it is doing. So maybe we could come up with some better way to control amount of resources consumed by notification events but for that we lack more information about your use case. And I maintain that the solution should account events to the consumer, not the producer...

Re: [PATCH v7 10/10] lib/dlock-list: Fix use-after-unlock problem in dlist_for_each_entry_safe()

2017-10-30 Thread Jan Kara
g Looks good to me. You can add: Reviewed-by: Jan Kara Honza > --- > include/linux/dlock-list.h | 28 +--- > 1 file changed, 17 insertions(+), 11 deletions(-) > > diff --git a/include/linux/d

Re: [PATCH v3 00/13] dax: fix dma vs truncate and remove 'page-less' support

2017-10-30 Thread Jan Kara
implement mechanism to block truncate while there are short term references pending (and for that retry loops would be IMHO acceptable). And then we can work on a mechanism to notify userspace that it needs to drop references to blocks that are going to be truncated so that we can re-enable taking of long term references. Honza [1] https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1522887.html -- Jan Kara SUSE Labs, CR

Re: [PATCH v2 0/7] fix fanotify issues with the series in v4.12

2017-10-27 Thread Jan Kara
ve a close look. I'll try to check it early next week and pick it up to my tree. Also thanks Amir for reviewing Miklos' patches! Honza -- Jan Kara SUSE Labs, CR

Re: [PATCH v3 00/13] dax: fix dma vs truncate and remove 'page-less' support

2017-10-26 Thread Jan Kara
ks either. So we are back at a situation where we need to detach blocks from the inode and then wait for page refs to be dropped - so some form of busy extents. Am I missing something? Honza -- Jan Kara SUSE Labs, CR

Re: [PATCH v2 1/2] isofs: fix timestamps beyond 2027

2017-10-19 Thread Jan Kara
nel.org > Signed-off-by: Arnd Bergmann ... > -int iso_date(char * p, int flag) > +int iso_date(u8 *p, int flag) > { > int year, month, day, hour, minute, second, tz; > int crtime; > > - year = p[0]; > + year = (int)(u8)p[0]; The cast seems unnecessa
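The effect of the pointer type change can be sketched with two small helpers. The names year_from_signed() and year_from_u8() are hypothetical; the first models a platform where plain char is signed, the second the unsigned byte read from the quoted diff.

```c
#include <stdint.h>

/* Year computed through a signed byte: values above 127 go negative. */
int year_from_signed(const signed char *p)
{
	return 1900 + p[0];
}

/*
 * Year computed through an unsigned byte. p[0] is promoted to int
 * without sign extension, so no explicit cast is needed here.
 */
int year_from_u8(const uint8_t *p)
{
	return 1900 + p[0];
}
```

A stored byte of 127 gives 2027 either way. A stored byte of 128 gives 2028 through the unsigned pointer but lands before 1900 through the signed one, which is the limit named in the subject line.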

Re: [PATCH 2/8] mm, truncate: Do not check mapping for every page being truncated

2017-10-19 Thread Jan Kara
e_node and as it no longer requires a mapping, the private > field is removed. > > Signed-off-by: Mel Gorman > Acked-by: Johannes Weiner The patch looks good to me. You can add: Reviewed-by: Jan Kara Honza

Re: [PATCH] f2fs: use extra parenthesis around assignment/condition

2017-10-17 Thread Jan Kara
@@ -1650,8 +1650,8 @@ int wait_on_node_pages_writeback(struct f2fs_sb_info > *sbi, nid_t ino) > > pagevec_init(&pvec, 0); > > - while (nr_pages = pagevec_lookup_tag(&pvec, NODE_MAPPING(sbi), &index, > - PAGECACHE_TAG_WRITEBACK)) { > + while ((nr_pages = pagevec_lookup_tag(&pvec, NODE_MAPPING(sbi), &index, > + PAGECACHE_TAG_WRITEBACK))) { > int i; > > for (i = 0; i < nr_pages; i++) { > -- > 2.9.0 > -- Jan Kara SUSE Labs, CR
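The idiom in the quoted hunk is an assignment used as a loop condition, wrapped in a second pair of parentheses so that compilers treat it as intentional. A self-contained sketch follows. next_batch() and sum_all() are hypothetical helpers that only mimic the shape of a batched lookup; they are not f2fs or pagevec code.

```c
#include <stddef.h>

/*
 * Hypothetical batched lookup: copies up to max items starting at *index
 * into out, advances *index, and returns the count (0 when exhausted).
 */
size_t next_batch(const int *src, size_t total, size_t *index,
		  int *out, size_t max)
{
	size_t n = 0;

	while (n < max && *index < total)
		out[n++] = src[(*index)++];
	return n;
}

/* Sum all items, consuming them one batch at a time. */
int sum_all(const int *src, size_t total)
{
	int batch[4];
	size_t index = 0, nr, i;
	int sum = 0;

	/* The extra parentheses mark the assignment as deliberate. */
	while ((nr = next_batch(src, total, &index, batch, 4))) {
		for (i = 0; i < nr; i++)
			sum += batch[i];
	}
	return sum;
}
```

Without the inner parentheses the loop behaves the same, but GCC and Clang warn because the assignment could be a mistyped comparison.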

Re: [PATCH] writeback: Convert timers to use timer_setup()

2017-10-17 Thread Jan Kara
On Mon 16-10-17 15:59:13, Kees Cook wrote: > In preparation for unconditionally passing the struct timer_list pointer to > all timer callbacks, switch to using the new timer_setup() and from_timer() > to pass the timer pointer explicitly. > > Cc: Andrew Morton > Cc: Jan Ka

Re: [PATCH v2 1/3] udf: Fix 64-bit sign extension issues affecting blocks > 0x7FFFFFFF

2017-10-16 Thread Jan Kara
s' command failing with EIO. > > * FIBMAP on a file block located above 0x7FFF can return a negative > value. The low 32 bits are correct, but applications that don't mask the > high 32 bits of the result can perform incorrectly. > > Per suggestion by Jan Kara,
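The sign extension problem described in the quoted changelog can be reproduced in a few lines. widen_signed() and widen_unsigned() are hypothetical helpers, not UDF code. They show what happens when a 32-bit block number above 0x7FFFFFFF is widened to 64 bits through a signed versus an unsigned type.

```c
#include <stdint.h>

/* Widening through a signed 32-bit type sign-extends large block numbers. */
int64_t widen_signed(uint32_t block)
{
	return (int32_t)block;
}

/* Widening directly from the unsigned type keeps the value intact. */
int64_t widen_unsigned(uint32_t block)
{
	return block;
}
```

For block 0x80000000 the signed path yields a negative 64-bit value whose low 32 bits still match the original, which is consistent with the FIBMAP symptom in the quoted text.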

Re: [PATCH] ext4: Convert timers to use timer_setup()

2017-10-12 Thread Jan Kara
; Cc: Andreas Dilger > Cc: linux-e...@vger.kernel.org > Cc: Thomas Gleixner > Signed-off-by: Kees Cook The patch looks good. You can add: Reviewed-by: Jan Kara Honza > --- > This requires commit 686fef928bba (&qu

Re: [PATCH 3/8] mm, truncate: Remove all exceptional entries from pagevec under one lock

2017-10-12 Thread Jan Kara
se cases. But it would be rather large overhaul of the code so it may be a bit out of scope for these improvements... > @@ -409,8 +445,8 @@ void truncate_inode_pages_range(struct address_space > *mapping, > } > > if (radix_tree_exceptional_entry(page)) { > - truncate_exceptional_entry(mapping, index, > -page); > + if (ei != PAGEVEC_SIZE) > + ei = i; This should be ei == PAGEVEC_SIZE I think. Otherwise the patch looks good to me so feel free to add: Reviewed-by: Jan Kara Honza -- Jan Kara SUSE Labs, CR
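The review comment about the sentinel check can be illustrated with a small sketch. It assumes the intent is to remember the index of the first flagged entry in a batch. BATCH_SIZE is a hypothetical stand-in for PAGEVEC_SIZE and first_flagged() is a made-up name; this is not the truncate code itself.

```c
#define BATCH_SIZE 15	/* hypothetical stand-in for PAGEVEC_SIZE */

/*
 * Return the index of the first non-zero entry in flags, or BATCH_SIZE
 * if there is none. ei starts at the sentinel and is set only once.
 */
int first_flagged(const int *flags, int n)
{
	int ei = BATCH_SIZE;

	for (int i = 0; i < n && i < BATCH_SIZE; i++) {
		if (flags[i]) {
			/* Still at the sentinel: this is the first hit. */
			if (ei == BATCH_SIZE)
				ei = i;
		}
	}
	return ei;
}
```

With `!=` in place of `==` the condition is false while ei still holds the sentinel, so ei would never be updated at all.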

Re: [PATCH 2/8] mm, truncate: Do not check mapping for every page being truncated

2017-10-12 Thread Jan Kara
ax_mapping(mapping) || shmem_mapping(mapping)) > - return; > - Hum, we don't need to pass 'mapping' from call sites then? Either pass NULL or just remove the argument completely since nobody needs it anymore... Otherwise the patch looks good. Honza -- Jan Kara SUSE Labs, CR

Re: [PATCH] udf: Fix 64-bit sign extension issues affecting blocks > 0x7FFFFFFF

2017-10-11 Thread Jan Kara
On Tue 10-10-17 22:30:30, Steve Magnani wrote: > Jan - > > On 10/10/2017 02:33 AM, Jan Kara wrote: > >On Mon 09-10-17 10:04:52, Steve Magnani wrote: > > > >...the patch seems to be mixing two changes into one which I'd prefer to be > > separate patches: >

Re: [PATCH 4/5] cgroup, buffer_head: implement submit_bh_blkcg_css()

2017-10-11 Thread Jan Kara
On Tue 10-10-17 08:54:40, Tejun Heo wrote: > Implement submit_bh_blkcg_css() which will be used to override cgroup > membership on specific buffer_heads. > > v2: Reimplemented using create_bh_bio() as suggested by Jan. > > Signed-off-by: Tejun Heo > Cc: Jan Kara > Cc:

Re: [PATCH 3/5] buffer_head: separate out create_bh_bio() from submit_bh_wbc()

2017-10-11 Thread Jan Kara
As bio can now be manipulated before submitted, we can move out @wbc > handling into submit_bh_wbc() and similarly this will make adding more > submit_bh variants straight-forward. > > This patch is pure refactoring and doesn't cause any functional > changes. > > Signed

Re: [PATCH 1/5] blkcg: export blkcg_root_css

2017-10-11 Thread Jan Kara
On Tue 10-10-17 08:54:37, Tejun Heo wrote: > Export blkcg_root_css so that filesystem modules can use it. > > Signed-off-by: Tejun Heo Looks good. You can add: Reviewed-by: Jan Kara Honza > --- > block/blk-cgro

Re: [PATCH] fs/ncpfs: Convert timers to use timer_setup()

2017-10-10 Thread Jan Kara
On Wed 04-10-17 17:52:50, Kees Cook wrote: > In preparation for unconditionally passing the struct timer_list pointer to > all timer callbacks, switch to using the new timer_setup() and from_timer() > to pass the timer pointer explicitly. > > Cc: Petr Vandrovec > Cc: Jan Kara

Re: [PATCH] jbd2: Convert timers to use timer_setup()

2017-10-10 Thread Jan Kara
On Wed 04-10-17 17:48:46, Kees Cook wrote: > In preparation for unconditionally passing the struct timer_list pointer to > all timer callbacks, switch to using the new timer_setup() and from_timer() > to pass the timer pointer explicitly. > > Cc: "Theodore Ts'o"

Re: [PATCH] mm/page-writeback.c: fix bug caused by disable periodic writeback

2017-10-10 Thread Jan Kara
On Tue 10-10-17 17:14:48, Yafang Shao wrote: > 2017-10-10 16:48 GMT+08:00 Jan Kara : > > On Tue 10-10-17 16:00:29, Yafang Shao wrote: > >> 2017-10-10 6:42 GMT+08:00 Andrew Morton : > >> > On Sat, 7 Oct 2017 06:58:04 +0800 Yafang Shao > >> > wrote: >

Re: [PATCH 2/3] cgroup, writeback: implement submit_bh_blkcg_css()

2017-10-10 Thread Jan Kara
ut() to kick off >* regular writeback instead of writing things out itself. >*/ > - if (wbc->wb) > - bio_associate_blkcg(bio, wbc->wb->blkcg_css); > + if (wbc->blkcg_css) > + bio_associate_blkcg(bio, wbc->blkcg_css); > } > > #else/* CONFIG_CGROUP_WRITEBACK */ > -- > 2.9.5 > -- Jan Kara SUSE Labs, CR

Re: [PATCH 1/3] cgroup, writeback: replace SB_I_CGROUPWB with per-inode S_CGROUPWB

2017-10-10 Thread Jan Kara
> > * btrfs sets the new flag in btrfs_update_iflags() function. Note > that this automatically excludes btree_inode which doesn't use > btrfs_update_iflags() during initialization. This is an intended > behavior change. > > Signed-off-by: Tejun Heo > Cc: Jan Kara

Re: [PATCH] mm/page-writeback.c: fix bug caused by disable periodic writeback

2017-10-10 Thread Jan Kara
Maybe we'd better call wb_wakeup_delayed(wb) here to bypass the > bdi_has_dirty_io() check ? Well, wb_wakeup_delayed() would be more appropriate but you'd then have to iterate over all bdis and wbs to be able to call it which IMO isn't worth the pain for a special case like this. But the decision is worth mentioning in the comment. Also wakeup_flusher_threads() does in principle what you need - see my reply to Andrew for details. Honza -- Jan Kara SUSE Labs, CR

Re: [PATCH] mm/page-writeback.c: fix bug caused by disable periodic writeback

2017-10-10 Thread Jan Kara
a strange thing to do). I guess to prevent busylooping? But I'm not sure... > (and what happens if the interval was set to 1 hour and the user > rewrites that to 1 second? Does that change take 1 hour to take > effect?) That's a good point I didn't think about. So probably we should do the wakeup whenever dirty_writeback_interval changes. Honza -- Jan Kara SUSE Labs, CR

Re: [PATCH] udf: Fix 64-bit sign extension issues affecting blocks > 0x7FFFFFFF

2017-10-10 Thread Jan Kara
> - udf_debug("bit %ld already set\n", bit + i); > + udf_debug("bit %lu already set\n", bit + i); This change looks wrong - bit and i are signed. However they are ints, not longs, so that should indeed be fixed. Honza -- Jan Kara SUSE Labs, CR
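The point about the format specifier can be illustrated outside the kernel. This is a minimal sketch, assuming that bit and i are plain ints as the reply says; snprintf() stands in for udf_debug(), and `format_bit_msg` is a hypothetical helper name.

```c
#include <stdio.h>

/*
 * Hypothetical stand-in for the udf_debug() call under review.
 * Both operands are int, so their sum is int and %d is the matching
 * conversion; %ld or %lu would expect a long-sized argument.
 */
int format_bit_msg(char *buf, size_t len, int bit, int i)
{
	return snprintf(buf, len, "bit %d already set\n", bit + i);
}
```

Building with -Wformat (part of -Wall in GCC and Clang) normally flags the mismatch if %ld or %lu is used with an int argument.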

Re: [PATCH v7 4/6] lib/dlock-list: Make sibling CPUs share the same linked list

2017-10-09 Thread Jan Kara
if (list_empty(&iter->head[iter->index].list)) Why these two do not need a similar treatment as alloc_dlist_heads()? Honza -- Jan Kara SUSE Labs, CR

Re: [PATCH] ext2/super: Fix a possible sleep-in-atomic bug in parse_options

2017-10-09 Thread Jan Kara
nto some object and applied it only after the last > possible failure exit. The entire "restore the original state" logics > would go away... Well, it's not like the restore logic would be that difficult for ext2. But I agree that running the whole parsing logic under a spinlock is unnecessary and accumulating all the changes in one structure and then applying them looks like a cleaner way to go. I'll look into that. Honza -- Jan Kara SUSE Labs, CR
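The "accumulate, then apply" approach discussed above could look roughly like the following userspace sketch. It is not the ext2 code: `mount_opts`, `OPT_DEBUG`, `parse_options` and `remount` are hypothetical, only two options are recognised, and a C11 atomic_flag stands in for the kernel spinlock. It also assumes that callers of remount() are serialised by some outer lock.

```c
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

#define OPT_DEBUG 0x1u

/* Hypothetical option set; the real superblock info has many more fields. */
struct mount_opts {
	unsigned int flags;
	long resuid;
};

struct mount_opts live_opts;			/* state protected by opts_lock */
static atomic_flag opts_lock = ATOMIC_FLAG_INIT;	/* spinlock stand-in */

static void opts_lock_acquire(void)
{
	while (atomic_flag_test_and_set_explicit(&opts_lock, memory_order_acquire))
		;	/* spin */
}

static void opts_lock_release(void)
{
	atomic_flag_clear_explicit(&opts_lock, memory_order_release);
}

/* Parse "debug,resuid=N" style strings into *out. Runs without the lock. */
static int parse_options(const char *s, struct mount_opts *out)
{
	while (*s) {
		size_t n = strcspn(s, ",");

		if (n == 5 && strncmp(s, "debug", 5) == 0) {
			out->flags |= OPT_DEBUG;
		} else if (n > 7 && strncmp(s, "resuid=", 7) == 0) {
			char *end;
			long v = strtol(s + 7, &end, 10);

			if (end != s + n)
				return -1;	/* trailing junk after the number */
			out->resuid = v;
		} else if (n != 0) {
			return -1;		/* unknown option */
		}
		s += n;
		if (*s == ',')
			s++;
	}
	return 0;
}

/* Returns 0 on success; on failure live_opts is left untouched. */
int remount(const char *s)
{
	struct mount_opts tmp;

	opts_lock_acquire();
	tmp = live_opts;		/* snapshot the current state */
	opts_lock_release();

	if (parse_options(s, &tmp) != 0)
		return -1;		/* nothing to restore */

	opts_lock_acquire();
	live_opts = tmp;		/* apply after the last failure exit */
	opts_lock_release();
	return 0;
}
```

Because parsing only ever touches the temporary copy, there is no "restore the original state" path, and the lock is held only for two short structure copies.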

Re: [PATCH V2] writeback: merge try_to_writeback_inodes_sb_nr() into caller

2017-10-09 Thread Jan Kara
nodes_sb() which is the > only caller. Also change return type of try_to_writeback_inodes_sb to > void as the only user ext4 doesn't care. > > Signed-off-by: Rakesh Pandit Looks good. You can add: Reviewed-by: Jan Kara

Re: [PATCH] mm/page-writeback.c: fix bug caused by disable periodic writeback

2017-10-09 Thread Jan Kara
On Mon 09-10-17 18:44:23, Yafang Shao wrote: > 2017-10-09 17:56 GMT+08:00 Jan Kara : > > On Sat 07-10-17 06:58:04, Yafang Shao wrote: > >> After disable periodic writeback by writing 0 to > >> dirty_writeback_centisecs, the handler wb_workfn() will not be > >

Re: [PATCH] mm/page-writeback.c: fix bug caused by disable periodic writeback

2017-10-09 Thread Jan Kara
s has some changes queued in linux-block tree in this area so your change won't apply. So please base your changes on his tree. Honza -- Jan Kara SUSE Labs, CR

Re: [PATCH v2] block/laptop_mode: Convert timers to use timer_setup()

2017-10-09 Thread Jan Kara
user is still using the disk. > - */ > -void laptop_io_completion(struct backing_dev_info *info) > -{ > - mod_timer(&info->laptop_mode_wb_timer, jiffies + laptop_mode); > -} > - > -/* > - * We're in laptop mode and we've just synced. The sync's writes will have > - * caused another writeback to be scheduled by laptop_io_completion. > - * Nothing needs to be written back anymore, so we unschedule the writeback. > - */ > -void laptop_sync_completion(void) > -{ > - struct backing_dev_info *bdi; > - > - rcu_read_lock(); > - > - list_for_each_entry_rcu(bdi, &bdi_list, bdi_list) > - del_timer(&bdi->laptop_mode_wb_timer); > - > - rcu_read_unlock(); > -} > -#endif > - > /* > * If ratelimit_pages is too high then we can get into dirty-data overload > * if a large number of processes all perform writes at the same time. > -- > 2.14.1 > -- Jan Kara SUSE Labs, CR

Re: [PATCH v6 4/6] lib/dlock-list: Make sibling CPUs share the same linked list

2017-10-09 Thread Jan Kara
On Thu 05-10-17 10:57:07, Waiman Long wrote: > On 10/05/2017 04:59 AM, Jan Kara wrote: > > On Wed 04-10-17 17:20:05, Waiman Long wrote: > >> int alloc_dlock_list_heads(struct dlock_list_heads *dlist) > >> { > >> - int idx; > >> + int idx, cnt =

Re: [PATCH v6 5/6] lib/dlock-list: Enable faster lookup with hashing

2017-10-05 Thread Jan Kara
in include/linux/list_bl.h. Sure it's a tradeoff between bitlock / spinlock but is there a user where it matters? Honza -- Jan Kara SUSE Labs, CR

Re: [PATCH v6 4/6] lib/dlock-list: Make sibling CPUs share the same linked list

2017-10-05 Thread Jan Kara
ock_lists is initialized? But how can the dlist be used later when it has larger number of lists and you don't know how many? Honza -- Jan Kara SUSE Labs, CR

Re: [PATCH 1/2] writeback: eliminate work item allocation in bd_start_writeback()

2017-10-04 Thread Jan Kara
ust for this purpose. > > After this change, we truly only ever have one of them running at > any point in time. We mark the need to start all flushes, and the > writeback thread will clear it once it has processed the request. > > Signed-off-by: Jens Axboe Just one nit below. You c

Re: [PATCH 2/2] sysctl: remove /proc/sys/vm/nr_pdflush_threads

2017-10-04 Thread Jan Kara
file ABI obsolete notice, and > the sysfs file. > > Signed-off-by: Jens Axboe Agreed. You can add: Reviewed-by: Jan Kara Honza > --- > Documentation/ABI/obsolete/proc-sys-vm-nr_pdflush_thr

Re: [PATCH 02/12] buffer: grow_dev_page() should use __GFP_NOFAIL for all cases

2017-10-03 Thread Jan Kara
On Tue 03-10-17 08:36:16, Jens Axboe wrote: > On 10/03/2017 06:25 AM, Jan Kara wrote: > > On Tue 03-10-17 14:10:49, Jan Kara wrote: > >> On Wed 27-09-17 14:13:49, Jens Axboe wrote: > >>> We currently use it for find_or_create_page(), which means that it > >>

Re: [PATCH 03/12] buffer: eliminate the need to call free_more_memory() in __getblk_slow()

2017-10-03 Thread Jan Kara
o the last user of free_more_memory(), kill > it off completely. > > Signed-off-by: Jens Axboe Looks good. You can add: Reviewed-by: Jan Kara Honza > --- > fs/buffer.c | 23 --- > 1 file change

Re: [PATCH 05/12] writeback: switch wakeup_flusher_threads() to cyclic writeback

2017-10-03 Thread Jan Kara
y cleaning" writeback, I agree that range_cyclic probably makes more sense. You can add: Reviewed-by: Jan Kara Honza > --- > fs/fs-writeback.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > >

Re: [PATCH 02/12] buffer: grow_dev_page() should use __GFP_NOFAIL for all cases

2017-10-03 Thread Jan Kara
On Tue 03-10-17 14:10:49, Jan Kara wrote: > On Wed 27-09-17 14:13:49, Jens Axboe wrote: > > We currently use it for find_or_create_page(), which means that it > > cannot fail. Ensure we also pass in 'retry == true' to > > alloc_page_buffers(), which also ensure that
